<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Restful Interface for RDF Stream Processors</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Balduini</string-name>
          <email>marco.balduini@polimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuele Della Valle</string-name>
          <email>emanuele.dellavalle@polimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEIB - Politecnico di Milano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This poster proposes a minimal, backward compatible and combinable RESTful interface for RDF Stream Processors.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>A number of RDF Stream Processors exist (e.g., CQELS [<xref ref-type="bibr" rid="ref1">1</xref>], SPARQLstream [<xref ref-type="bibr" rid="ref2">2</xref>], ETALIS/EP-SPARQL [<xref ref-type="bibr" rid="ref3">3</xref>], Sparkwave [<xref ref-type="bibr" rid="ref4">4</xref>], INSTANS [<xref ref-type="bibr" rid="ref5">5</xref>] and the C-SPARQL Engine [<xref ref-type="bibr" rid="ref6">6</xref>]), but they do not talk to each other. This hampers comparative evaluations: existing benchmark proposals [<xref ref-type="bibr" rid="ref7">7</xref>, <xref ref-type="bibr" rid="ref8">8</xref>] had to create software adapters to test the various processors. Under these conditions, it is difficult to assess how much the benchmark results depend on the performance of the processors and how much on that of the adapters. Moreover, the lack of a shared protocol to transmit RDF streams hinders the combined usage of those processors. For instance, a user may want: a) to deploy SPARQLstream to natively process data streams; b) to semantically enrich the resulting RDF streams using Sparkwave (or INSTANS); c) to aggregate the enriched streams into events using the C-SPARQL Engine (or CQELS); and d) finally, to detect complex events with ETALIS/EP-SPARQL.</p>
      <p>This poster proposes a RESTful interface for RDF Stream Processors that is:</p>
      <list list-type="order">
        <list-item>
          <p>minimal: more sophisticated interfaces can be envisioned, but in this attempt we would like to create a broad consensus, thus we avoid proposing controversial solutions.</p>
        </list-item>
        <list-item>
          <p>backward compatible: we reuse the RDF and SPARQL standards wherever we can, so as to guarantee that the adaptation of non-streaming clients for RDF and SPARQL is straightforward.</p>
        </list-item>
        <list-item>
          <p>combinable: the proposed interface ensures that the output of a processor can serve as input to any processor (including the one that generated it).</p>
        </list-item>
      </list>
      <p>The remainder of the paper is organised as follows. Section 2 briefly presents the background required to understand the proposed interface. Section 3 proposes the interface. Section 4 briefly discusses the requirements that are not considered in this minimal proposal and how the interface can be extended to cover them. A proof-of-concept implementation of the proposed interface for the C-SPARQL Engine is available for download at http://streamreasoning.org/download.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>From a conceptual point of view, existing RDF Stream Processors are
homogeneous. They define the notion of RDF stream (an unbounded list of tuples &lt;t, τ&gt;
where t is an RDF triple and τ is a non-decreasing timestamp) and of continuous
SPARQL query (a SPARQL query extended so that it can process RDF streams
using continuous operators, e.g., windows that logically convert a portion of the
infinite RDF stream into an RDF graph, and time-aware operators, e.g., a sequence
operator that asks for one graph pattern to be detected before another).</p>
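      <p>To make these two notions concrete, the following Python sketch models an RDF stream as a list of (triple, timestamp) pairs and a time-based window as a function that returns the set of triples whose timestamps fall in a given interval. The names Triple, Element, is_valid_stream and time_window, the ex: identifiers, and the half-open interval convention are assumptions of this illustration; none of them is taken from an existing processor.</p>
      <preformat><![CDATA[
```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Element = Tuple[Triple, int]    # a pair <t, tau>: a triple and its timestamp


def is_valid_stream(elements: Iterable[Element]) -> bool:
    """Check that timestamps never decrease along the stream."""
    previous = None
    for _, tau in elements:
        if previous is not None and tau < previous:
            return False
        previous = tau
    return True


def time_window(elements: Iterable[Element], start: int, end: int) -> Set[Triple]:
    """Return the RDF graph (a set of triples) with start <= tau < end."""
    return {t for t, tau in elements if start <= tau < end}


stream: List[Element] = [
    (("ex:alice", "ex:likes", "ex:pizza"), 1),
    (("ex:bob", "ex:likes", "ex:pasta"), 1),
    (("ex:carol", "ex:likes", "ex:pizza"), 4),
]
graph = time_window(stream, 0, 3)
```
]]></preformat>
      <p>In this example, graph contains the two triples with timestamp 1, while the triple with timestamp 4 falls outside the window.</p>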
      <p>
        To the best of our knowledge, limited effort has been spent on defining a protocol
for: a) transmitting RDF streams across RDF Stream Processors on separate
machines, b) registering a continuous query in a processor, and c) observing
the continuously evolving results. The only existing solutions are proprietary.
For instance, the C-SPARQL Engine is typically used within the Streaming
Linked Data framework [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Similarly, CQELS is paired with the Super Stream
Collider [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Services</title>
      <p>A community effort is needed to propose a continuous SPARQL extension that
spans the existing proposals, but we believe this is already the right time to
propose a RESTful interface that processors can easily implement.</p>
      <p>The following proposal specifies how to manage RDF streams, continuous
SPARQL queries, and observers of continuous results (see Table 1 for details).</p>
      <p>In compliance with RESTful principles, users can register a new RDF stream σ in the
processor using the PUT method on the resource /streams/. As a result, the RDF
stream /streams/σ becomes available in the processor. At this point, they can
stream information on the RDF stream (by POSTing an RDF graph to /streams/σ)
and they can unregister it using the DELETE method. The list of all registered
streams is returned when GETting the resource /streams/.</p>
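      <p>The life cycle of a stream resource can be sketched with a small in-memory model written in Python. This is only an illustration of the semantics described above, not the proof-of-concept implementation: the class StreamRegistry, its method names, and the numeric status codes it returns are all assumptions, since the proposal does not prescribe them.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Graph = Set[Triple]


class StreamRegistry:
    """In-memory stand-in for the /streams/ resources (hypothetical class)."""

    def __init__(self) -> None:
        self._streams: Dict[str, List[Graph]] = {}

    def put(self, name: str) -> int:
        """Register a stream; the status codes are assumptions of this sketch."""
        if name in self._streams:
            return 409          # already registered
        self._streams[name] = []
        return 201

    def post(self, name: str, graph: Graph) -> int:
        """Stream one RDF graph, without any timestamp, on a registered stream."""
        if name not in self._streams:
            return 404
        self._streams[name].append(graph)
        return 200

    def delete(self, name: str) -> int:
        """Unregister a stream."""
        if self._streams.pop(name, None) is None:
            return 404
        return 200

    def get(self) -> List[str]:
        """List the registered streams."""
        return sorted(f"/streams/{name}" for name in self._streams)


registry = StreamRegistry()
created = registry.put("sigma")
posted = registry.post("sigma", {("ex:s", "ex:p", "ex:o")})
listed = registry.get()
removed = registry.delete("sigma")
```
]]></preformat>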
      <p>
        It is worth noting that, learning from flexible time management in data
stream processors [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], we propose to avoid annotating the streamed RDF graphs
with a timestamp. This complies with the expected input of best-effort data stream
processors (e.g., Esper). We leave the annotation of the streamed RDF graphs with
an application timestamp to a future extension of this minimal protocol. Moreover,
this design decision allows the proposal to be backward compatible: any
Semantic Web application can send information to an RDF Stream Processor
simply by POSTing an RDF graph.
      </p>
      <p>Users can register a new continuous SPARQL query γ in the processor using
the PUT method on the resource /queries/. The proposed interface is agnostic
w.r.t. the language used to declare the query and leaves it to the processor to
parse the query in the body of the request. Nonetheless, it requires the query to
refer only to RDF streams already registered in the processor. If the user tries
to register a query on streams that have not yet been registered, the service
must refuse to register the query. If the registration is successful, the processor
starts the continuous execution of the query and the query /queries/γ appears in
the list of queries that can be retrieved by GETting the resource /queries/. As for the
RDF streams, the query /queries/γ can be unregistered using the DELETE method.
The POST method on the resource /queries/γ is used to start observing the query
results, to pause the query, and to restart it.</p>
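      <p>The registration rule for queries can be illustrated with a similar Python sketch. Because the interface is agnostic to the query language, the sketch does not parse the query text: the set of streams the query refers to is passed explicitly, which is a simplification. The class QueryRegistry, its methods, the placeholder query text, and the status codes are assumptions made for this example only.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Set


class QueryRegistry:
    """In-memory stand-in for the /queries/ resources (hypothetical class)."""

    def __init__(self, registered_streams: Set[str]) -> None:
        self.registered_streams = registered_streams
        self._queries: Dict[str, str] = {}      # query name -> query text

    def put(self, name: str, text: str, referenced_streams: Set[str]) -> int:
        """Register a query only if all the streams it refers to are registered.

        A real processor would obtain referenced_streams by parsing the query
        text; here the caller passes them explicitly. Status codes are assumed.
        """
        missing = referenced_streams - self.registered_streams
        if missing:
            return 400          # refuse the registration
        self._queries[name] = text
        return 201

    def delete(self, name: str) -> int:
        """Unregister a query."""
        return 200 if self._queries.pop(name, None) is not None else 404

    def get(self) -> List[str]:
        """List the registered queries."""
        return sorted(f"/queries/{name}" for name in self._queries)


queries = QueryRegistry(registered_streams={"/streams/sigma"})
refused = queries.put("gamma", "...query text...", {"/streams/unknown"})
accepted = queries.put("gamma", "...query text...", {"/streams/sigma"})
```
]]></preformat>
      <p>The first call is refused because /streams/unknown has not been registered; the second one succeeds and /queries/gamma becomes visible in the list of queries.</p>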
      <p>Access to query results follows an observable-observer design pattern. In
order to start observing the results of a query γ, a user has to POST a callback
URL to /queries/γ. The created observer ω is identified by a URL of the form
/queries/γ/observers/ω. The user can stop observing the query by DELETing this
resource. Multiple observers per query are possible. Whenever γ computes new
results, the processor notifies all the observers by invoking the provided callback
URLs.</p>
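      <p>The observable-observer pattern described above can be sketched in Python as follows. In this illustration, plain Python callables stand in for callback URLs, so that the example runs without any network access. The class ObservableQuery, the observer identifiers o0, o1 and so on, and the shape of the result object are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Any, Callable, Dict, List


class ObservableQuery:
    """A registered query together with its observers (hypothetical class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._observers: Dict[str, Callable[[Any], None]] = {}
        self._next_id = 0

    def add_observer(self, callback: Callable[[Any], None]) -> str:
        """Model a POST of a callback to /queries/<name>; return the observer URL."""
        observer_id = f"o{self._next_id}"
        self._next_id += 1
        self._observers[observer_id] = callback
        return f"/queries/{self.name}/observers/{observer_id}"

    def remove_observer(self, observer_url: str) -> bool:
        """Model a DELETE of the observer resource."""
        observer_id = observer_url.rsplit("/", 1)[-1]
        return self._observers.pop(observer_id, None) is not None

    def notify(self, results: Any) -> None:
        """Invoke every registered callback with the newly computed results."""
        for callback in list(self._observers.values()):
            callback(results)


first: List[Any] = []
second: List[Any] = []
query = ObservableQuery("gamma")
first_url = query.add_observer(first.append)
second_url = query.add_observer(second.append)
query.notify({"boolean": True})
query.remove_observer(second_url)
query.notify({"boolean": False})
```
]]></preformat>
      <p>After the second observer is removed, only the first one keeps receiving results.</p>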
      <p>If the query is of the form SELECT or ASK, results must be formatted according
to the SPARQL 1.1 query results formats<sup>2</sup>, thus allowing for backward compatibility with
existing SPARQL result set parsers.</p>
      <p>If the query is of the form CONSTRUCT or DESCRIBE, the processor must POST an
RDF graph containing the result. As a result, our proposal is not only backward
compatible (it conforms to the SPARQL 1.1 result formats), but also
combinable (the results of a query can be POSTed to another registered RDF
stream): the callback URL passed as a parameter when starting an observer simply
has to be the URL of an existing RDF stream<sup>3</sup>.</p>
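      <p>Combinability can be illustrated by chaining two streams in a Python sketch: the result graph of a CONSTRUCT-like query on one stream is delivered to the post operation of another stream, which plays the role of the callback URL. The class Stream, the function only_likes, and the stream names are assumptions of this example; an ordinary Python function stands in for a continuous query.</p>
      <preformat><![CDATA[
```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Graph = Set[Triple]


class Stream:
    """A registered RDF stream that forwards POSTed graphs to its queries."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.received: List[Graph] = []
        self._queries: List[Callable[[Graph], None]] = []

    def post(self, graph: Graph) -> None:
        """Accept one RDF graph and evaluate every attached query on it."""
        self.received.append(graph)
        for query in self._queries:
            query(graph)

    def attach(self, construct: Callable[[Graph], Graph],
               callback: Callable[[Graph], None]) -> None:
        """Attach a CONSTRUCT-like query whose result graph goes to callback."""
        self._queries.append(lambda graph: callback(construct(graph)))


def only_likes(graph: Graph) -> Graph:
    """Stand-in for a CONSTRUCT query: keep the ex:likes triples only."""
    return {t for t in graph if t[1] == "ex:likes"}


raw = Stream("/streams/raw")
filtered = Stream("/streams/filtered")
raw.attach(only_likes, filtered.post)   # the observer is another stream
raw.post({("ex:a", "ex:likes", "ex:b"), ("ex:a", "ex:knows", "ex:c")})
```
]]></preformat>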
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>The proposal, being minimal, ignores important requirements w.r.t. time
modelling, access control, and transmission overhead.</p>
      <p>
        <fn id="fn2">
          <label>2</label>
          <p>Our implementation supports http://www.w3.org/TR/2013/REC-sparql11-results-json/</p>
        </fn>
        <fn id="fn3">
          <label>3</label>
          <p>In order to avoid the overhead of streaming over HTTP an RDF stream that is consumed
by the same processor, when a query γ of the form CONSTRUCT or DESCRIBE is registered,
an RDF stream, whose identifier is /streams/γ, is automatically registered. The result
of the query γ is internally streamed on it.</p>
        </fn>
      </p>
      <p>
        Adding the application time to the protocol is only a matter of POSTing a
timestamp together with the RDF graph. However, as explained in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], in the case
of multiple distributed sources POSTing to the same RDF stream, out-of-order
arrivals can occur due to the lack of clock synchronisation and to different network delays. In
our future work, we will propose this extension and, at the same time, we will
release an open-source package that includes the management of out-of-order arrivals.
      </p>
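      <p>As an illustration of the problem, the Python sketch below restores timestamp order with a simple reordering buffer. It is a generic technique shown under a strong assumption, namely that no item arrives more than slack time units late; it is not the extension announced above and it is not taken from [<xref ref-type="bibr" rid="ref11">11</xref>]. The function name reorder and the (timestamp, payload) representation are hypothetical.</p>
      <preformat><![CDATA[
```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Item = Tuple[int, str]   # (application timestamp, payload)


def reorder(arrivals: Iterable[Item], slack: int) -> Iterator[Item]:
    """Yield items in timestamp order, assuming bounded lateness.

    An item is released once the largest timestamp seen so far exceeds its
    own timestamp by at least `slack`; the remaining items are flushed in
    order when the input ends.
    """
    heap: List[Item] = []
    max_seen = None
    for item in arrivals:
        heapq.heappush(heap, item)
        max_seen = item[0] if max_seen is None else max(max_seen, item[0])
        while heap and heap[0][0] <= max_seen - slack:
            yield heapq.heappop(heap)
    while heap:
        yield heapq.heappop(heap)


arrivals = [(1, "a"), (3, "c"), (2, "b"), (5, "e"), (4, "d")]
ordered = list(reorder(arrivals, slack=2))
```
]]></preformat>
      <p>A larger slack tolerates more disorder at the price of a higher latency before results can be released.</p>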
      <p>The proposed interface lacks access control, but it is ready for HTTP-based
access control. An HTTP server, placed between the user and the RESTful service
container, can handle access to /streams and /queries. Moreover, only the owner of
a query γ is allowed to start observing the results of γ or to list all its
observers (i.e., by GETting /queries/γ/observers/). Investigating OAuth-based
access control is also on our research agenda.</p>
      <p>Last, but not least, we recognise that the transmission overhead of the
proposed solution can reduce the processor throughput if the user frequently POSTs
RDF graphs containing only a few triples. In our future work, we intend to explore
the streaming of RDF triples in N-Quads format over a WebSocket.</p>
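      <p>To give an idea of the line-oriented format mentioned above, the following Python sketch serialises quads as N-Quads statements, one per line, so that they could be written incrementally on an open connection. It is deliberately limited to terms that are IRIs (no literals, blank nodes, or escaping), and it does not open any WebSocket. The function to_nquads and the example IRIs are assumptions of this illustration.</p>
      <preformat><![CDATA[
```python
from typing import Iterable, Tuple

Quad = Tuple[str, str, str, str]   # subject, predicate, object, graph (IRIs)


def to_nquads(quads: Iterable[Quad]) -> str:
    """Serialise IRI-only quads, one N-Quads statement per line."""
    return "".join(f"<{s}> <{p}> <{o}> <{g}> .\n" for s, p, o, g in quads)


payload = to_nquads([
    ("http://example.org/a", "http://example.org/likes",
     "http://example.org/b", "http://example.org/streams/sigma"),
])
```
]]></preformat>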
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Le-Phuoc</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dao-Tran</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xavier Parreira</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hauswirth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A native and adaptive approach for unified processing of linked streams and linked data</article-title>
          .
          <source>In: ISWC</source>
          . (
          <year>2011</year>
          )
          <fpage>370</fpage>
          -
          <lpage>388</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Calbimonte</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corcho</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>A.J.G.</given-names>
          </string-name>
          :
          <article-title>Enabling ontology-based access to streaming data sources</article-title>
          .
          <source>In: ISWC</source>
          . (
          <year>2010</year>
          )
          <fpage>96</fpage>
          -
          <lpage>111</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Anicic</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fodor</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stojanovic</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>EP-SPARQL: a unified language for event processing and stream reasoning</article-title>
          .
          <source>In: WWW</source>
          . (
          <year>2011</year>
          )
          <fpage>635</fpage>
          -
          <lpage>644</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Komazec</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cerri</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fensel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Sparkwave: continuous schema-enhanced pattern matching over RDF data streams</article-title>
          .
          <source>In: DEBS</source>
          . (
          <year>2012</year>
          )
          <fpage>58</fpage>
          -
          <lpage>68</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Rinne</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nuutila</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Törmä</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>INSTANS: High-performance event processing with standard RDF and SPARQL</article-title>
          .
          <source>In: ISWC (Posters &amp; Demos)</source>
          . (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Barbieri</surname>
            ,
            <given-names>D.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Braga</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Della Valle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grossniklaus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Querying RDF streams with C-SPARQL</article-title>
          .
          <source>SIGMOD Record</source>
          <volume>39</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          )
          <fpage>20</fpage>
          -
          <lpage>26</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duc</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corcho</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calbimonte</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>SRBench: A Streaming RDF/SPARQL Benchmark</article-title>
          .
          <source>In: ISWC</source>
          . (
          <year>2012</year>
          )
          <fpage>641</fpage>
          -
          <lpage>657</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Le-Phuoc</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dao-Tran</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boncz</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eiter</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fink</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Linked stream data processing engines: Facts and figures</article-title>
          .
          <source>In: ISWC</source>
          . (
          <year>2012</year>
          )
          <fpage>300</fpage>
          -
          <lpage>312</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Balduini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Celino</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dell'Aglio</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Della Valle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>il Lee</surname>
            ,
            <given-names>T.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tresp</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Bottari: An augmented reality mobile application to deliver personalized and location-based recommendations by continuous analysis of social media streams</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>16</volume>
          (
          <year>2012</year>
          )
          <fpage>33</fpage>
          -
          <lpage>41</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Quoc</surname>
            ,
            <given-names>H.N.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serrano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le-Phuoc</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hauswirth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Super stream collider: linked stream mashups for everyone</article-title>
          .
          <source>In: Proceedings of the Semantic Web Challenge co-located with ISWC2012</source>
          , Boston, MA, US (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Widom</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Flexible time management in data stream systems</article-title>
          .
          <source>In: PODS</source>
          , New York, New York, USA (
          <year>2004</year>
          )
          <fpage>263</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>