=Paper= {{Paper |id=None |storemode=property |title=INSTANS: High-Performance Event Processing with Standard RDF and SPARQL |pdfUrl=https://ceur-ws.org/Vol-914/paper_22.pdf |volume=Vol-914 |dblpUrl=https://dblp.org/rec/conf/semweb/RinneNT12 }} ==INSTANS: High-Performance Event Processing with Standard RDF and SPARQL== https://ceur-ws.org/Vol-914/paper_22.pdf
 INSTANS: High-Performance Event Processing
      with Standard RDF and SPARQL

                 Mikko Rinne, Esko Nuutila, and Seppo Törmä

                Department of Computer Science and Engineering,
                   Aalto University, School of Science, Finland
                         firstname.lastname@aalto.fi



      Abstract. Smart environments require collaboration of multi-platform
      sensors operated by multiple parties. Proprietary event processing
      solutions lack interoperation flexibility, leading to overlapping
      functions that can waste hardware and communication resources. Our goal
      is to show the applicability of standard RDF and SPARQL – including
      SPARQL 1.1 Update – for complex event processing tasks. If found
      feasible, event processing would enjoy the benefits of semantic web
      technologies: cross-domain interoperability, flexible representation and
      query capabilities, interrelating disjoint vocabularies, reasoning over
      event content, and enriching events with linked data. To enable event
      processing with standard RDF/SPARQL we have created Instans, a
      high-performance Rete-based platform for continuous execution of
      interconnected SPARQL queries.

      Keywords: Rete, SPARQL, RDF, Complex event processing


1   Introduction
Complex event processing is currently dominated more by proprietary systems
and vertical products than by open technologies. In the future, however,
internet-connected people and things moving between smart spaces in smart
cities will create a huge volume of events in a multi-actor, multi-platform
environment.
    Semantic web technologies enable flexible representation of events in RDF
and advanced specification of event patterns with SPARQL. They make it possible
to reason about event content and to enrich events with linked open data
available on the web. Semantic web standards have clear potential to improve
interoperability and offer new capabilities in complex event processing.
    A major event processing application can hardly be created out of a single
SPARQL query. The INSERT operation in SPARQL 1.1 Update introduced a
critical new property: by inserting data into a graph, collaborating SPARQL
queries can store intermediate results and communicate with each other. In an
environment supporting simultaneous, continuous evaluation of multiple queries,
SPARQL can be used to create entire event processing applications [6].
    After finding no other platform for incremental processing of multiple SPARQL
1.1 queries, we created Instans. Based on the tried and tested Rete algorithm
[3], Instans shares equivalent parts of queries, caches intermediate matches,
and provides results as soon as all the conditions of a query have been
matched. Besides being competitive in SPARQL query processing [1], Instans has
shown qualitative and quantitative benefits in our studies compared to
SPARQL-based systems using repeated execution of queries over windows on event
streams [6].
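
    To make the caching of intermediate matches concrete, the following minimal Python sketch hand-codes the incremental evaluation of a single two-condition pattern. It is neither the Instans implementation nor a general Rete network; the class `TwoPatternQuery`, the simplified terms and the sample triples are assumptions made for this example.

```python
from collections import defaultdict

class TwoPatternQuery:
    """Incrementally matches  ?e a :Event .  ?e :time ?t  (hypothetical pattern)."""

    def __init__(self):
        self.typed = set()               # cached matches of the first condition
        self.times = defaultdict(set)    # cached matches of the second, keyed by ?e
        self.results = []                # complete matches, in arrival order

    def add(self, s, p, o):
        """Process one triple on arrival; return the results it completes."""
        new = []
        if p == "a" and o == ":Event" and s not in self.typed:
            self.typed.add(s)
            # Join the new match against the cached matches of the other condition.
            new = [(s, t) for t in sorted(self.times[s])]
        elif p == ":time" and o not in self.times[s]:
            self.times[s].add(o)
            if s in self.typed:
                new = [(s, o)]
        self.results.extend(new)
        return new

q = TwoPatternQuery()
print(q.add(":e1", ":time", "10:05"))   # [] (type of :e1 not yet known)
print(q.add(":e2", "a", ":Event"))      # [] (no time for :e2 yet)
print(q.add(":e1", "a", ":Event"))      # [(':e1', '10:05')]
```

    Each triple is examined once, when it arrives. A result is reported by the very call that supplies its last missing condition, without re-running the pattern over earlier input.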
   Here we extend the discussion in [5] by adding further information on the
Instans implementation of continuous incremental SPARQL query processing.


2     INSTANS Event Processing Platform




                              Fig. 1: Instans Structure
     Instans (Incremental eNgine for STANding Sparql, http://cse.aalto.fi/instans/) [6]
is an incremental engine for near-real-time processing of complex, layered,
heterogeneous events. Based on the Rete algorithm [3], Instans performs
continuous evaluation of incoming RDF data against multiple SPARQL queries.
Intermediate results are stored in a network of β-nodes. When all the
conditions of a query are matched, the result is instantly available.
     The structure of Instans is illustrated in Fig. 1. The system consists of the
Rete engine and the input and output connectors, which can interface with the
network, triple stores, files or other processes. The Rete engine has four
components: (1) the Rete network, (2) the α-matcher, (3) the rule instance queue,
and (4) the instance executor. Together, the α-matcher and the Rete network find
all SPARQL rule conditions that are satisfied by the current set of triples. At
runtime the α-matcher receives commands to add and remove triples. The matcher
finds the α-nodes of the Rete network that match the triples and calls the add
or remove methods of those nodes. The changes propagate through the β-network,
and eventually fully satisfied rule conditions enter the rule nodes, which add
new rule instances (with variable bindings) to the rule instance queue. The
instance executor executes the
rule instances, which causes add and remove triple commands to be fed into the
output connectors. The rule instance execution also feeds add and remove triple
commands to the α-matcher, resulting in new rule instances. Instans operation
over an example query is illustrated in Fig. 2. The query selects events occurring
between 10 and 11 am. The asynchronous nature of Instans means that all input
is processed when it arrives.

      [Diagram: Rete network of α-nodes, β-nodes, join nodes (Y1–Y3), filter1
      and select1, with circled numbers marking the steps of the process flow.]

      Query:

      SELECT ?event
      WHERE {
        ?event a event:Event ;
               event:time ?time .
        ?time tl:at ?daytime .
        FILTER ( hours(?daytime) = 10 )
      }

      Process flow:

      ① Each condition corresponds to an α-node. α1 matches with sample input
        “:e1 a event:Event”.
      ② “:e1” propagates to β2 and is stored there.
      ③ α2 matches with “:e1 event:time _:b1”, where “_:b1” is a blank node.
        Input from β2 matches with “?event” in Y2.
      ④ “:e1” and “_:b1” propagate until β3.
      ⑤ α3 matches with input “_:b1 tl:at "2011-10-03T10:05:00"^^xsd:dateTime”.
      ⑥ In Y3 “_:b1” is equal in both incoming branches and can be eliminated.
      ⑦ “:e1” and "2011-10-03T10:05:00"^^xsd:dateTime reach filter1. The
        condition “hour = 10” is true.
      ⑧ “:e1” is selected as a result.

                  Fig. 2: Example of SPARQL query processing in a Rete-net

    To manage periodic actions and missing events, the
concept of timed events is introduced [7]. When a new timer is started, an actor
schedules a wakeup, at which time a predicate of the timer is changed. A
SPARQL query matching such a triple reacts to the change and carries out the
defined actions. No extensions to SPARQL are needed to support timed events.
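
    The timed-event mechanism can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the Instans design: a simulated clock replaces the scheduling actor so that the example stays deterministic, and the class `TimerStore`, the `:state` predicate and its `:running` / `:expired` values are hypothetical names chosen for the example.

```python
import heapq

class TimerStore:
    """Triples plus timers on a simulated clock (all names are hypothetical)."""

    def __init__(self):
        self.triples = set()
        self.wakeups = []      # heap of (due_time, timer_id)
        self.fired = []        # actions taken by the standing rule

    def start_timer(self, timer, now, delay):
        """Record a running timer and schedule its wakeup."""
        self.triples.add((timer, ":state", ":running"))
        heapq.heappush(self.wakeups, (now + delay, timer))

    def advance(self, now):
        """Play the scheduling actor: flip the state of every due timer."""
        while self.wakeups and self.wakeups[0][0] <= now:
            due, timer = heapq.heappop(self.wakeups)
            self.triples.discard((timer, ":state", ":running"))
            self._add((timer, ":state", ":expired"), due)

    def _add(self, triple, at):
        self.triples.add(triple)
        # Standing rule: react to any triple of the form  ?t :state :expired
        if triple[1:] == (":state", ":expired"):
            self.fired.append((triple[0], at))

store = TimerStore()
store.start_timer(":t1", now=0, delay=5)
store.start_timer(":t2", now=0, delay=60)
store.advance(now=10)
print(store.fired)   # [(':t1', 5)]
```

    The rule that reacts to the expiry is an ordinary pattern over triples. Only the component that changes the timer triple knows about time, which mirrors the observation that timed events need no extension to the query language.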
    Performance of Instans in terms of notification delay was compared to
C-SPARQL [2] using an example application described in [6]. Instans yielded
average notification delays of 12 ms on a 2.26 GHz Intel Core 2 Duo Mac. In
C-SPARQL the average query processing delay varied from 12 ms to 253 ms for
window sizes from 5 to 60 events, respectively. Because this processing delay
stays well below a second, the window repetition interval becomes the dominant
component of the notification delay for any repetition interval longer than a
second. With repetition intervals of 5–60 seconds and an event inter-arrival
time of one second, the C-SPARQL notification delay was measured at
1.34–25.90 seconds. Further details are available on the Instans project website.
A comparison with CQELS [4] awaits the availability of a generic version.

3    Conclusions
The feasibility of the central paradigm of Instans – continuous incremental
matching of multiple SPARQL queries supporting inter-query communication –
has so far been supported by empirical tests. When complemented with support
for timed events, we have found no showstopper problems which would render
the approach unusable for any complex event processing task.
    Instans performs better than systems based on repeated execution of queries
at fixed time intervals (or triple counts); such systems cannot practically
compete with Instans, whose notification delays are on the order of
milliseconds. Instans avoids redundant computation: each event is processed
immediately on arrival and only once through the Rete network, network
structures are shared across similar queries, and intermediate results are
cached.

References
1. Abdullah, H., Rinne, M., Törmä, S., Nuutila, E.: Efficient matching of SPARQL
   subscriptions using Rete. In: Proceedings of the 27th Symposium On Applied
   Computing (Mar 2012)
2. Barbieri, D.F., Braga, D., Ceri, S., Grossniklaus, M.: An execution environment
   for C-SPARQL queries. In: Proceedings of the 13th International Conference on
   Extending Database Technology - EDBT ’10. p. 441. Lausanne, Switzerland (2010)
3. Forgy, C.L.: Rete: A fast algorithm for the many pattern/many object pattern match
   problem. Artificial Intelligence 19(1), 17–37 (Sep 1982)
4. Le-Phuoc, D., Dao-Tran, M., Parreira, J.X., Hauswirth, M.: A native and adaptive
   approach for unified processing of linked streams and linked data. In: ISWC’11. pp.
   370–388. Springer-Verlag Berlin (Oct 2011)
5. Rinne, M.: DC Short Paper: SPARQL Update for Complex Event Processing. In:
   ISWC 2012. Springer-Verlag, Boston, MA (2012)
6. Rinne, M., Abdullah, H., Törmä, S., Nuutila, E.: Processing Heterogeneous RDF
   Events with Standing SPARQL Update Rules. In: Meersman, R., Dillon, T. (eds.)
   OTM 2012 Conferences, Part II. pp. 793–802. Springer-Verlag (2012)
7. Rinne, M., Törmä, S., Nuutila, E.: SPARQL-Based Applications for RDF-Encoded
   Sensor Data. In: 5th International Workshop on Semantic Sensor Networks (2012)