=Paper= {{Paper |id=Vol-520/paper-3 |storemode=property |title=Continuous Queries and Real-time Analysis of Social Semantic Data with C-SPARQL |pdfUrl=https://ceur-ws.org/Vol-520/paper02.pdf |volume=Vol-520 }} ==Continuous Queries and Real-time Analysis of Social Semantic Data with C-SPARQL== https://ceur-ws.org/Vol-520/paper02.pdf

Continuous Queries and Real-time Analysis of
Social Semantic Data with C-SPARQL

Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle,
and Michael Grossniklaus

Dipartimento di Elettronica e Informazione, Politecnico di Milano
Piazza Leonardo da Vinci 32, I-20133 Milano, Italy
{dbarbieri|braga|ceri|dellavalle|grossniklaus}@elet.polimi.it

Abstract. Social semantic data are becoming a reality, but apparently
their streaming nature has been ignored so far. Streams, being unbounded
sequences of time-varying data elements, should not be treated as
persistent data to be stored “forever” and queried on demand, but rather
as transient data to be consumed on the fly by queries which are
registered once and for all and keep analyzing such streams, producing
answers triggered by the streaming data and not by explicit invocation.
In this paper, we propose an approach to continuous queries and
real-time analysis of social semantic data with C-SPARQL, an extension of
SPARQL for querying RDF streams.

Keywords: Continuous SPARQL, Social Semantic Data, Continuous
Query, Real-time Analysis

1 Introduction and Motivation
“Which are the hottest topics under discussion on Twitter?” “Who is discussing
Italian food right now?” “What have my close friends been discussing in
the last hour?” “Who is now discussing Tuscany red wines in my social
network?” “How many people have been twittering about northern Italy white
wines in the last three hours?”
The information required to answer those queries is increasingly becoming
available on the Web. On the one side, we observe a trend in Web 2.0 as blogs,
feeds and microblogs are adopted to disseminate and publish information in
real-time streams through social networking Web sites. This trend is often
referred to as the Twitter phenomenon or, in broader terms, as the so-called
blogosphere. On the other side, a trend can also be observed towards the
interlinking of the Social Web with semantics [1] using vocabularies such as
Semantically-Interlinked Online Communities1 (SIOC), Friend-of-a-Friend2 (FOAF)
and Simple Knowledge Organization System3 (SKOS).
1 http://rdfs.org/sioc/spec/
2 http://xmlns.com/foaf/spec/
3 www.w3.org/2004/02/skos/

As concrete examples, we can refer to three pioneers in this field:

– SMOB [2] is a distributed and decentralized microblogging system built on
SIOC and FOAF, for which the authors implemented both a publishing and
an aggregating service prototype.
– Smesher [3] is a Semantic Microblogging client that integrates Twitter and
identi.ca, detects structure in microposts, extracts content in RDF and allows
SPARQL queries over the extracted information.
– SemanticTweet4 is a simple Web service that generates a FOAF RDF docu-
ment from the list of Twitter friends and followers of any Twitter user using
the Twitter REST API.

Several attempts to answer such questions exist, employing both
Information Retrieval (IR) methods (e.g., technorati.com, icerocket.com,
blogsearchengine.com, blogsearch.google.com) and Semantic Web methods (e.g.,
the SIOC API of sindice.com, SMOB aggregator and Smesher). Our claim is that
answering such questions within the one-time semantics of current IR and
Semantic Web tools is difficult because the underlying data are streams.
Data streams are unbounded sequences of time-varying data elements. They
have been recognized in a variety of modern applications, such as network mon-
itoring, traffic engineering, sensor networks, RFID tag applications, telecom call
records and financial applications. Processing of data streams has been largely
investigated in the last decade [4], specialized Data Stream Management Sys-
tems (DSMS) have been developed, and features of DSMS are becoming sup-
ported by major database products, such as Oracle and DB2.
DSMS represent a paradigm change in the database world as they move from
persistent relations and user-invoked queries to transient streams and continuous
queries. The innovative assumption is that streams can be consumed on the fly
rather than being stored forever and that queries which are persistently monitor-
ing streams are able to produce their answers even in the absence of invocation.
DSMS can support parallel query answering over data originating in real time
and can cope with bursts of data by adapting their behavior to gracefully
degrade answer accuracy by introducing higher levels of approximation. However,
even if DSMS have proved to be an optimal solution for on-the-fly
analysis of data streams, they cannot perform complex reasoning tasks, such as the
ones required for computing the answers to the above questions.
At the same time, while Semantic Web reasoners are year after year scaling up
in the classical, time-invariant domain of RDF triples and ontological knowledge,
reasoning upon rapidly changing information has been neglected or forgotten so
far. Reasoning systems assume static knowledge and do not manage “changing
worlds”; at most, one can update the ontological knowledge and then repeat
the reasoning tasks.
In [5], we propose Stream Reasoning as the new multi-disciplinary approach
which will provide the abstractions, foundations, methods, and tools required to
integrate data streams and reasoning systems, thus giving answers to the above
and other questions from different domains.

4 http://semantictweet.com/

The rest of the paper is organized as follows. In Section 2, we provide the
background needed to understand the proposed extensions to RDF and SPARQL
introduced in Sections 3.1 and 3.3, respectively. In Section 4, we describe an
architecture and implementation of a C-SPARQL engine. Section 5 is dedicated
to the comparison of C-SPARQL to SPARQL using the real-world social data
streams described in Section 3.2. We close the paper by discussing related work
in Section 6 and draw conclusions in Section 7.

2 Background

This section illustrates previous work on data streams and the SPARQL lan-
guage.

2.1 Data Streams

DSMS are based on the observation that not only is it impossible to control the
order in which data items arrive in a stream, but, even more importantly, it is
not feasible to locally store a stream in its entirety [6].
The Chronicle data model [7] is one of the first models proposed for data
streams. It introduced the concept of chronicles, append-only ordered sequences
of tuples, as well as a restricted view definition language and an algebra that op-
erates both over chronicles and traditional relations. OpenCQ [8], NiagaraCQ [9]
and Aurora [10] are representative implementations of DSMS addressing contin-
uous queries and distribution issues.
The first query language tailored to data streams, CQL [11, 12], was the result
of research done by Babu et al. [13] on the problem of continuous queries over
data streams, addressing semantic issues as well as efficiency concerns. They
specify a general and flexible architecture for query processing in the presence
of data streams.
More recently, Law et al. [14] put particular emphasis on the problem of
mining data streams [15]. They conceived and developed Stream Mill [16], which
considers and addresses data mining issues extensively, specifically with respect
to the problem of online data aggregation and to the distinguishing notion of
blocking and non-blocking operators. Its query language (ESL) efficiently sup-
ports physical and logical windows (with optional slides and tumbles) on both
built-in aggregates and user-defined aggregates. The constructs introduced in
ESL extend the power and generality of DSMS.

2.2 SPARQL

SPARQL has been developed under the patronage of the W3C as the standard
query language for RDF. Therefore, the most authoritative source on its syntax
and semantics is the W3C recommendation [17]. Several papers, however, discuss
extensions to the SPARQL language as defined by the W3C.
With the goal of proposing a syntactic and semantic extension to SPARQL,
we found the following works particularly useful.

– Gutierrez et al. [18], who define a conjunctive query language for RDF with
basic patterns, which is a formal and unambiguous basis for defining the
semantics of SPARQL query evaluation;
– Perez et al. [19], who analyze the semantics and complexity of SPARQL; and
– Cyganiak [20] and Haase et al. [21], who independently present a relational
model of SPARQL which allows SPARQL queries to be implemented over a
relational database engine.

3 RDF Streams and Continuous SPARQL

Data models and query languages for DSMS are not sufficient for continuously
querying and analyzing streams of RDF in real time. Indeed, we deem that there
is a potential interest for giving up one-time semantics in RDF repositories as
well, so as to explore the benefits provided by continuous semantics. Therefore,
we introduce RDF streams as the natural extension of the RDF data model to
the new continuous scenario and Continuous SPARQL (or simply C-SPARQL)
as the extension of SPARQL for querying RDF streams.

3.1 RDF Streams

An RDF stream is defined as an ordered sequence of pairs, where each pair is
constituted by an RDF triple and its timestamp $\tau$.

    ...
    $(\langle subj_i, pred_i, obj_i \rangle, \tau_i)$
    $(\langle subj_{i+1}, pred_{i+1}, obj_{i+1} \rangle, \tau_{i+1})$
    ...

Timestamps can be considered the context of RDF triples. They are monotonically
non-decreasing in the stream ($\tau_i \le \tau_{i+1}$). They are not strictly increasing
because timestamps are not required to be unique. Any (unbounded, though
finite) number of consecutive triples can have the same timestamp, meaning that
they occur at the same time, although sequenced in the stream according to
some positional order. Our definition of RDF streams extends RDF in the same
way as the stream type in CQL extends the relation type.
Named graphs [22] and N-Quads [23], a format that extends N-Triples with
context, can both be adopted as a concrete serialization for RDF streams. For our
experiments we adopt N-Quads and we use as context the timestamp encoded
as an RDF literal of type xsd:dateTime.
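
The definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names StreamElement, is_valid_stream and to_nquad are hypothetical, timestamps are assumed to be in UTC, and triple terms are assumed to be already written in N-Triples syntax. The sketch checks that timestamps never decrease and writes each element as one N-Quads-style line whose context is an xsd:dateTime literal.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple

# (subject, predicate, object), each term already in N-Triples syntax (assumption)
Triple = Tuple[str, str, str]

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


@dataclass(frozen=True)
class StreamElement:
    """One element of an RDF stream: a triple paired with its timestamp."""
    triple: Triple
    timestamp: datetime  # assumed to be expressed in UTC


def is_valid_stream(elements: Iterable[StreamElement]) -> bool:
    """Return True if timestamps are non-decreasing (equal timestamps are allowed)."""
    previous: Optional[datetime] = None
    for element in elements:
        if previous is not None and element.timestamp < previous:
            return False
        previous = element.timestamp
    return True


def to_nquad(element: StreamElement) -> str:
    """Serialize an element as one line, using the timestamp literal as context."""
    s, p, o = element.triple
    stamp = element.timestamp.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'{s} {p} {o} "{stamp}"^^<{XSD_DATETIME}> .'


# Hypothetical subject IRI; the rdf:type and sioc:Post IRIs are the standard ones.
example = StreamElement(
    ("<http://example.org/inter/i1>",
     "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
     "<http://rdfs.org/sioc/ns#Post>"),
    datetime(2009, 7, 20, 22, 48, 52, tzinfo=timezone.utc),
)
print(to_nquad(example))
```

Several elements may share a timestamp, so the check uses a strict "less than" comparison only to reject streams that go backwards in time.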

3.2 A Real Source of Social Semantic Data Streams

Given that RDF streams of social semantic data are not readily available yet, we
decided to use the data provided by the social network Glue5. Glue enables users to
connect with their friends on the Web based on the pages the users visit online.
Using semantic recognition technologies to automatically identify books, music,
movies, wines, stocks, movie stars and many other similar topics, it generates a
continuous stream of the identified objects which is accessible in real time
using a REST API6. The REST request “http://api.getglue.com/v1/glue/recent”
returns the 250 most recent public interactions. We adopted the GRDDL
approach [24] and implemented a simple way to translate the resulting XML into
RDF. A live version is running at http://c-sparql.cefriel.it/sdow-demo. Below,
we provide a snapshot of the resulting RDF stream.

Subject      | Predicate          | Object                           | Timestamp
glueinter:i1 | rdf:type           | sioc:Post                        | "2009-07-20T22:48:52Z"
glueinter:i1 | sioc:content       | "The Proposal on imdb.com"       | "2009-07-20T22:48:52Z"
glueinter:i1 | sioc:has_container | http://www.getglue.com           | "2009-07-20T22:48:52Z"
glueinter:i1 | dc:title           | "The Proposal"                   | "2009-07-20T22:48:52Z"
glueuser:id1 | rdf:type           | sioc:User                        | "2009-07-20T22:48:52Z"
glueinter:i1 | sioc:has_creator   | glueuser:id1                     | "2009-07-20T22:48:52Z"
glueinter:i1 | sioc:topic         | gluecat:movies                   | "2009-07-20T22:48:52Z"
glueinter:i2 | rdf:type           | sioc:Post                        | "2009-07-20T22:48:54Z"
glueinter:i2 | sioc:content       | "Mario Kart Wii on gamefaqs.com" | "2009-07-20T22:48:55Z"
glueinter:i2 | sioc:has_container | http://www.getglue.com           | "2009-07-20T22:48:55Z"
glueinter:i2 | dc:title           | "Mario Kart Wii"                 | "2009-07-20T22:48:55Z"
glueuser:id2 | rdf:type           | sioc:User                        | "2009-07-20T22:48:55Z"
glueinter:i2 | sioc:has_creator   | glueuser:id2                     | "2009-07-20T22:48:55Z"
glueinter:i2 | sioc:topic         | glueint:video_games              | "2009-07-20T22:48:55Z"

Similarly to SemanticTweet, we are also able to translate the social relation-
ships obtained using the REST service “/user/friends” into FOAF.

3.3 Continuous SPARQL

C-SPARQL is an extension of SPARQL for querying both RDF graphs and
RDF streams. The complete definition of the language in terms of syntax and
semantics is given in [25]. We briefly repeat the definitions of the distinguishing
features of the language here and show how to write the queries that evaluate
the answers to the questions which opened this paper.

Continuous Queries The distinguishing feature of C-SPARQL is the support
for continuous queries, i.e., queries that are registered and then executed
continuously over windows opened on RDF streams and standard RDF graphs.
Continuous queries that make use of aggregates are particularly relevant.
A C-SPARQL query is registered using the grammar extension provided by
the first of the following two production rules.

Registration → ‘REGISTER QUERY’ QueryName ‘AS’ Query
Registration → ‘REGISTER STREAM’ QueryName ‘AS’ Query

5 http://getglue.com
6 http://getglue.com/api

As output, C-SPARQL queries produce the same types as SPARQL queries:
boolean answers, selections of variable bindings, RDF descriptions of the involved
resources or constructions of new RDF triples. These outputs are continuously
renewed in each query execution. In addition, C-SPARQL queries can be regis-
tered to produce new RDF streams using the grammar extension provided by
the second production rule given above. In this second case, only CONSTRUCT
and DESCRIBE queries can be registered, as they produce RDF triples that, once
associated with a timestamp, yield RDF streams that can be managed in C-
SPARQL.
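
As a rough illustration of the two registration rules, the following Python sketch splits a registration into its kind, name and query form, and rejects a REGISTER STREAM whose query is not a CONSTRUCT or DESCRIBE. The function name parse_registration and the regular expressions are assumptions made for this sketch; they are a simplification and not the grammar of an actual C-SPARQL parser.

```python
import re

# Simplified patterns (assumption): keywords in upper case, query name made of word characters.
_REGISTRATION = re.compile(r"^\s*REGISTER\s+(QUERY|STREAM)\s+(\w+)\s+AS\s+(.*)$", re.DOTALL)
_QUERY_FORM = re.compile(r"\b(SELECT|ASK|CONSTRUCT|DESCRIBE)\b")


def parse_registration(text: str):
    """Return (kind, name, query_form) or raise ValueError if the registration is invalid."""
    match = _REGISTRATION.match(text)
    if match is None:
        raise ValueError("not a REGISTER QUERY / REGISTER STREAM statement")
    kind, name, body = match.groups()
    form = _QUERY_FORM.search(body)
    if form is None:
        raise ValueError("no SELECT, ASK, CONSTRUCT or DESCRIBE found in the query")
    # Only queries that produce RDF triples can be registered as new streams.
    if kind == "STREAM" and form.group(1) not in ("CONSTRUCT", "DESCRIBE"):
        raise ValueError("REGISTER STREAM requires a CONSTRUCT or DESCRIBE query")
    return kind, name, form.group(1)


print(parse_registration(
    "REGISTER QUERY TopTopics AS SELECT ?topic WHERE { ?post ?p ?topic . }"
))
```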

Windows Given that RDF streams are intrinsically infinite, we introduce the
notion of windows upon RDF streams, whose types and characteristics are in-
spired by those of the windows in continuous query languages such as CQL [12].
Identification and windowing are expressed in C-SPARQL by means of the FROM
STREAM clause.

FromStrClause → ‘FROM’ [‘NAMED’] ‘STREAM’ StreamIRI ‘[ RANGE’ Window ‘]’

From the RDF stream identified by StreamIRI, a window extracts the last
triples, which are considered by the query. The extraction can be physical (a
given number of triples) or logical (a variable number of triples which occur
during a given time interval).
The part of C-SPARQL that we introduced so far is sufficient to address the
question “What have my closest friends been visiting in the last hour?” Below,
we show how to formulate this query in C-SPARQL over the Glue interaction
stream and the graph of FOAF relationships described in Section 3.2.
REGISTER QUERY WhatHaveMyCloseFriendsBeenVisitingInTheLastHour AS
PREFIX sioc:
PREFIX foaf:
PREFIX glue:
SELECT DISTINCT ?friend ?topic
FROM
FROM STREAM
[ RANGE 60m STEP 5m ]
WHERE { glue:id1 foaf:knows ?friend .
?post sioc:has_creator ?friend .
?post rdf:type sioc:Post .
?post sioc:topic ?topic . }

The first triple pattern matches triples in the FOAF graph, whereas the other
three triple patterns match triples in a sliding window of 60 minutes opened on
the RDF stream, which advances progressively in steps of 5 minutes.
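
The difference between logical and physical windows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the stream is an in-memory list of (triple, timestamp) pairs, and the choice of a half-open interval (start excluded, end included) is ours, since the text does not fix the boundary convention.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Triple = Tuple[str, str, str]
Element = Tuple[Triple, datetime]  # (triple, timestamp)


def logical_window(stream: List[Element], now: datetime, range_: timedelta) -> List[Element]:
    """Elements whose timestamp lies in (now - range_, now]; their number varies."""
    start = now - range_
    return [element for element in stream if start < element[1] <= now]


def physical_window(stream: List[Element], size: int) -> List[Element]:
    """The last `size` elements of the stream, regardless of their timestamps."""
    return list(stream[-size:]) if size > 0 else []


def evaluation_times(first: datetime, step: timedelta, count: int) -> List[datetime]:
    """Instants at which a sliding window is re-evaluated, e.g. every 5 minutes."""
    return [first + i * step for i in range(count)]


# A window like [ RANGE 60m STEP 5m ] re-evaluated three times (illustrative data).
base = datetime(2009, 7, 20, 22, 0, 0)
stream = [(("post%d" % m, "sioc:topic", "gluecat:movies"), base + timedelta(minutes=m))
          for m in (0, 30, 59, 61)]
for instant in evaluation_times(base + timedelta(minutes=65), timedelta(minutes=5), 3):
    print(instant, len(logical_window(stream, instant, timedelta(minutes=60))))
```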

Aggregation Another question that we could answer using C-SPARQL is
“Which are the top topics in Glue?” To do so, we also need to introduce the
aggregation capabilities that we added to C-SPARQL. We allow multiple
independent aggregations within the same C-SPARQL query, thus pushing the
aggregation capabilities beyond those of SQL and other proposals for
aggregation in SPARQL7. Aggregation clauses have the following syntax.

AggregateClause → ( ‘AGGREGATE { (’ var ‘,’ Fun ‘,’ Group ‘)’ [Filter ] ‘}’ )*
Fun → ‘COUNT’ | ‘SUM’ | ‘AVG’ | ‘MIN’ | ‘MAX’
Group → var | ‘{’ var ( ‘,’ var )* ‘}’

An aggregation clause starts with a new variable not occurring in the WHERE
clause, followed by an aggregation function and closed by a set of one or more
variables, occurring in the WHERE clause, which express the grouping criteria. For
instance, the question above can be expressed in C-SPARQL as follows.
REGISTER QUERY TopTopicsGlueUsersAreInterestedIn AS
PREFIX sioc:
SELECT DISTINCT ?topic ?number
FROM STREAM
[ RANGE 30m STEP 10m ]
WHERE { ?post sioc:topic ?topic . }
AGGREGATE {(?number , COUNT , {?topic })}
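
One possible reading of the aggregation clause in this query can be sketched in Python. The sketch assumes that the aggregate adds the new variable to every binding of the WHERE clause (rather than collapsing each group into one row) and that SELECT DISTINCT then removes the duplicates; this interpretation, the function names and the sample bindings are assumptions of the sketch, and the authoritative semantics is the one given in [25].

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Binding = Dict[str, object]  # variable name -> bound value


def aggregate_count(bindings: List[Binding], new_var: str,
                    group_vars: Sequence[str]) -> List[Binding]:
    """Extend each binding with the size of the group it belongs to."""
    def key(binding: Binding) -> Tuple:
        return tuple(binding[v] for v in group_vars)

    counts = Counter(key(b) for b in bindings)
    return [{**b, new_var: counts[key(b)]} for b in bindings]


def select_distinct(bindings: List[Binding], variables: Sequence[str]) -> List[Tuple]:
    """Project the bindings on `variables` and drop duplicate rows, keeping order."""
    seen = set()
    rows = []
    for binding in bindings:
        row = tuple(binding[v] for v in variables)
        if row not in seen:
            seen.add(row)
            rows.append(row)
    return rows


# Bindings of "?post sioc:topic ?topic" inside one window (illustrative data).
window_bindings = [
    {"?post": "glueinter:i1", "?topic": "gluecat:movies"},
    {"?post": "glueinter:i2", "?topic": "gluecat:video_games"},
    {"?post": "glueinter:i3", "?topic": "gluecat:movies"},
]
extended = aggregate_count(window_bindings, "?number", ["?topic"])
print(select_distinct(extended, ["?topic", "?number"]))
```

Because each aggregate only adds a variable, several independent aggregations can be applied to the same set of bindings one after the other, which is consistent with the multiple-aggregation capability described above.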

4 A C-SPARQL Engine

The C-SPARQL engine was designed based on a separation of concerns between
stream management and query evaluation. This separation is the foundation for
a simple architecture for C-SPARQL, built upon known database and
reasoning technologies. Figure 1 shows the three main components of our C-SPARQL
execution framework.
The module named C-SPARQL Query Parser gets a C-SPARQL query as
input and produces the information needed by the Data Stream Manager Layer
and the SPARQL EndPoint Layer to execute the query. The data stream man-
ager registers the data streams specified in the query and applies logical or
physical windows. When the resulting graph has been produced, the SPARQL
part of the C-SPARQL query is executed by the SPARQL endpoint. This process
is executed as frequently as specified in the REGISTER clause of the C-SPARQL
query. Finally, the result computed is timestamped and passed on. Both the data
stream manager and the SPARQL endpoint are considered plugins, in order to
be independent from the actual DSMS/SPARQL engine implementations that
will be used. We have implemented a prototype based on this architecture using
ESPER as a DSMS and Jena as a SPARQL endpoint.
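
The execution cycle described above can be summarized with a small Python sketch. It does not use ESPER or Jena: the window selection and the query evaluation are plain Python callables standing in for the two plugins, and all names (time_window, execute_once) are hypothetical. The only point is to show the separation between stream management and query evaluation.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]
Element = Tuple[Triple, datetime]
WindowFn = Callable[[List[Element], datetime], List[Triple]]   # stands in for the DSMS plugin
QueryFn = Callable[[List[Triple]], object]                     # stands in for the SPARQL endpoint


def time_window(range_: timedelta) -> WindowFn:
    """Build a logical window that keeps triples stamped in (now - range_, now]."""
    def select(stream: List[Element], now: datetime) -> List[Triple]:
        return [triple for triple, stamp in stream if now - range_ < stamp <= now]
    return select


def execute_once(stream: List[Element], static_graph: List[Triple],
                 window: WindowFn, evaluate: QueryFn, now: datetime):
    """One execution: window the stream, evaluate the query, timestamp the result."""
    windowed = window(stream, now)            # stream management
    graph = list(static_graph) + windowed     # graph handed to the query evaluator
    return evaluate(graph), now               # result paired with its timestamp


now = datetime(2009, 7, 20, 23, 0, 0)
stream = [
    (("glueinter:i1", "sioc:topic", "gluecat:movies"), now - timedelta(minutes=90)),
    (("glueinter:i2", "sioc:topic", "gluecat:video_games"), now - timedelta(minutes=10)),
]
static_graph = [("glueuser:id1", "foaf:knows", "glueuser:id2")]


def count_topics(graph: List[Triple]) -> int:
    return sum(1 for _, predicate, _ in graph if predicate == "sioc:topic")


print(execute_once(stream, static_graph, time_window(timedelta(minutes=60)), count_topics, now))
```

Since both plugins are passed in as functions, either one can be swapped without touching the other, which mirrors the plugin design of the architecture.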
ESPER8 is a component for stream processing applications, which require
high throughput to process large volumes of data elements (between 1,000 and
100,000 messages per second) and low latency to react in real time (from a few
milliseconds to a few seconds). In particular, it supports the various forms of
windows we defined in C-SPARQL (i.e., sliding and tumbling windows both in
terms of time and length) and several forms of aggregation we plan to exploit
for optimizations based on query rewriting.
Jena9 is a Java framework for building Semantic Web applications. We chose
it because it includes a custom RDF storage engine for high-performance
applications and a SPARQL query engine. The SPARQL engine supports standard
SPARQL and aggregation, GROUP BY and assignment as SPARQL extensions.
The adoption of off-the-shelf stream management systems and reasoning tools
provides both a solid framework and a fast way of prototyping.

[Figure: the C-SPARQL Query Parser passes the continuous part of a query
(REGISTER, FROM STREAM) to the Data Stream Manager Layer and the standard
SPARQL query to the SPARQL EndPoint Layer.]
Fig. 1. C-SPARQL Engine Architecture

7 http://esw.w3.org/topic/SPARQL/Extensions/Aggregates
8 http://esper.codehaus.org/

5 Evaluation
In order to evaluate our approach, we compared the time required to compute a
C-SPARQL query with our engine to the time needed to execute an equivalent
SPARQL query in Jena using its SPARQL engine over its custom RDF storage
engine. The tests have been run on a Pentium Core 2 Quad 2.0GHz with 2GB
of main memory.
As a representative C-SPARQL query we chose the following one, in which we
count how many Glue users are interested in the various topics recognized by
Glue in the last 3 minutes.
REGISTER QUERY CountHowManyGlueUsersAreInterestedInEachTopic AS
PREFIX sioc:
SELECT DISTINCT ?topic ?number
FROM STREAM
[ RANGE 3m STEP 10s ]
WHERE { ?post sioc:topic ?topic . }
AGGREGATE {(?number , COUNT , {?topic })}

9 http://jena.sourceforge.net/

This simple query showcases all characteristics of C-SPARQL, namely,
registration, selection of triples based on a window over an RDF stream and an
aggregate function.
We registered this C-SPARQL query in our engine and continuously executed
it every second. We ran two experiments feeding new triples from the RDF
stream into the C-SPARQL engine at two different rates: 5 triples per second
(5 t/s) and 200 t/s. In both experiments, we measured the time required to
compute the answer with the triples in the window.
It is possible to write a SPARQL query that computes the same results
as the C-SPARQL query above by adding (a) a triple pattern that matches
the creation date of the post, (b) a FILTER clause that selects the same time
interval as the C-SPARQL query, and (c) an aggregate function that counts the
number of topics using Jena SPARQL extensions for aggregates.
PREFIX sioc:
PREFIX dcterms:
PREFIX xsd:
SELECT ?topic count(?topic)
WHERE { ?post sioc:topic ?topic .
?post dcterms:created ?date .
FILTER (?date > "2009-07-20T22:47:00Z"^^xsd:dateTime
&& ?date < "2009-07-20T22:50:00Z"^^xsd:dateTime ) }
GROUP BY ?topic
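
The time bounds in the FILTER clause above have to be recomputed by the client for every execution, whereas the C-SPARQL window does this implicitly. A minimal Python sketch of that computation follows; the helper name filter_clause is hypothetical and the timestamps are assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone


def filter_clause(window_end: datetime, range_: timedelta, var: str = "?date") -> str:
    """Build the FILTER that selects the interval (window_end - range_, window_end)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # xsd:dateTime lexical form, assuming UTC timestamps
    start = (window_end - range_).strftime(fmt)
    end = window_end.strftime(fmt)
    return (f'FILTER ({var} > "{start}"^^xsd:dateTime\n'
            f'        && {var} < "{end}"^^xsd:dateTime )')


# A 3-minute range ending at 22:50:00 gives the bounds used in the query above.
print(filter_clause(datetime(2009, 7, 20, 22, 50, 0, tzinfo=timezone.utc),
                    timedelta(minutes=3)))
```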

Using Jena, we executed the above SPARQL query six times against reposito-
ries containing the first 100, 500, 1000, 1500, 2000 and 2500 triples, respectively.
In these experiments, we again measured the time required to compute each
answer.
The results are shown in Figure 2 and are named SPARQL. Comparing the
linear regressions of the three experiments, named Linear(SPARQL),
Linear(C-SPARQL 5 t/s) and Linear(C-SPARQL 200 t/s), we see that the C-SPARQL
window-based selection always performs significantly better than the
FILTER-based selection of SPARQL in Jena. Notably, this result holds both for a low
rate of 5 t/s and a reasonably high rate of 200 t/s.

6 Related Work
A previous effort to combine SPARQL and data streams is presented in Bolles
et al. [26]. They introduce a syntax for the specification of logical and physical
windows in SPARQL queries by means of local grammar extensions.
Our approach differs from theirs in several key aspects. First, Bolles et
al. omit essential ingredients such as aggregate functions, so the resulting
expressive power is not sufficient to express interesting practical queries such as
“Which are the top topics under discussion?”. Second, the authors do not follow
the approach, established by DSMS, of using windows only to transform streaming
data into non-streaming data in order to apply standard algebraic operations.
Bolles et al. choose to also change the standard SPARQL operators by making
them timestamp-aware and, thereby, effectively introduce a new language
semantics. Finally, their approach allows window clauses to appear within SPARQL
group graph pattern expressions. This makes the query syntax more intricate
and complicates query evaluation. Moreover, it violates the separation of
concerns between stream management and query evaluation that is the basis of our
simple architecture for C-SPARQL engines.

[Plot: response time in ms (0 to 30) versus the number of triples in the
window (C-SPARQL) or in the repository (SPARQL), from 0 to 2500. Series: SPARQL,
C-SPARQL 5 t/s and C-SPARQL 200 t/s, each with its linear regression.]
Fig. 2. The window based selection of C-SPARQL outperforms the FILTER based
selection of SPARQL.

7 Conclusions

We began this paper with a list of questions related to social data on the Web
that stress the streaming nature of blogs and microblogs. Our initial claim was
that social data should not be treated as persistent data to be stored forever
and queried on demand, but rather as transient data to be consumed on the fly
by registered queries. In order to prove this claim, we have made the following
arguments in this paper.

– RDF streams can be defined by extending the RDF data type with a notion of
timestamp;
– RDF streams can be serialized as N-Quads;
– sources of RDF streams are available and can be obtained from blogs and
microblogs with the same approach used to obtain the RDF representations
used for social semantic data;
– SPARQL can be extended with the notion of continuous queries, registered
once and for all, that keep monitoring such RDF streams, producing answers
triggered by the streaming data and not by explicit invocation;
– C-SPARQL queries can be evaluated using a simple architecture based on
the decision to keep stream management and query evaluation separated;
and
– in terms of response time, even in a naive implementation of this architecture
the window-based selection of C-SPARQL outperforms the FILTER-based
selection needed to formulate the equivalent query in SPARQL.

Moreover, we do not yet exploit several optimization opportunities. On the one
hand, we can adopt smarter query rewriting techniques that push part of a
C-SPARQL query evaluation from the SPARQL engine to the DSMS. On the
other hand, we are not considering the parallel nature of streams and thus the
opportunity for parallel continuous query processing.
Finally, we have been limiting ourselves to treating RDF as relational data
without considering it part of the Semantic Web stack. We are currently investigating
techniques [27] to incrementally maintain the materialization of knowledge derived
from the triples currently selected by the window.

Acknowledgements

The work described in this paper has been partially supported by the European
project LarKC (FP7-215535). Michael Grossniklaus’s work is carried out under
SNF grant number PBEZ2-121230.

References

1. Bojars, U., Breslin, J.G., Peristeras, V., Tummarello, G., Decker, S.: Interlinking
the social web with semantics. IEEE Intelligent Systems 23(3) (2008) 29–40
2. Passant, A., Hastrup, T., Bojars, U., Breslin, J.: Microblogging: A semantic web
and distributed approach. In: 4th Workshop Scripting For the Semantic Web
(SFSW2008) co-located with ESWC2008. (2008)
3. Nowack, B.: Semantic microblogging. In: Microblogging Conference. (2009)
4. Garofalakis, M., Gehrke, J., Rastogi, R.: Data Stream Management: Processing
High-Speed Data Streams (Data-Centric Systems and Applications). Springer-
Verlag New York, Inc., Secaucus, NJ, USA (2007)
5. Della Valle, E., Ceri, S., Braga, D., Celino, I., Fensel, D., van Harmelen, F.,
Unel, G.: Research chapters in the area of stream reasoning. In: SR2009. Volume
466 of CEUR Workshop Proceedings., CEUR-WS.org (2009) online
http://ceur-ws.org/Vol-466/sr2009-intro.pdf.
6. Golab, L., DeHaan, D., Demaine, E.D., López-Ortiz, A., Munro, J.I.: Identifying
Frequent Items in Sliding Windows over On-line Packet Streams. In: Proc. Intl.
Conf. on Internet Measurement (IMC 2003). (2003) 173–178
7. Jagadish, H.V., Mumick, I.S., Silberschatz, A.: View Maintenance Issues for the
Chronicle Data Model. In: Proc. ACM Symp. on Principles of Database Systems
(PODS 1995). (1995) 113–124
8. Liu, L., Pu, C., Tang, W.: Continual Queries for Internet Scale Event-Driven
Information Delivery. IEEE Trans. Knowl. Data Eng. 11(4) (1999) 610–628
9. Chen, J., DeWitt, D.J., Tian, F., Wang, Y.: NiagaraCQ: A Scalable Continuous
Query System for Internet Databases. In Chen, W., Naughton, J.F., Bernstein,
P.A., eds.: Proc. ACM Intl. Conf. on Management of Data (SIGMOD 2000). (2000)
379–390
10. Balakrishnan, H., Balazinska, M., Carney, D., Çetintemel, U., Cherniack, M.,
Convey, C., Galvez, E., Salz, J., Stonebraker, M., Tatbul, N., Tibbetts, R., Zdonik, S.:
Retrospective on Aurora. The VLDB Journal 13(4) (2004) 370–383
11. Arasu, A., Babcock, B., Babu, S., Datar, M., Ito, K., Nishizawa, I., Rosenstein,
J., Widom, J.: STREAM: The Stanford Stream Data Manager (Demonstration
Description). In: Proc. ACM Intl. Conf. on Management of data (SIGMOD 2003).
(2003) 665
12. Arasu, A., Babu, S., Widom, J.: The CQL Continuous Query Language: Semantic
Foundations and Query Execution. The VLDB Journal 15(2) (2006) 121–142
13. Babu, S., Widom, J.: Continuous Queries over Data Streams. SIGMOD Rec. 30(3)
(2001) 109–120
14. Law, Y.N., Wang, H., Zaniolo, C.: Query Languages and Data Models for Database
Sequences and Data Streams. In: Proc. Intl. Conf. on Very Large Data Bases
(VLDB 2004). (2004) 492–503
15. Law, Y.N., Zaniolo, C.: An Adaptive Nearest Neighbor Classification Algorithm
for Data Streams. In: Proc. Europ. Conf. on Principles and Practice of Knowledge
Discovery in Databases (PKDD 2005). (2005) 108–120
16. Bai, Y., Thakkar, H., Wang, H., Luo, C., Zaniolo, C.: A Data Stream Language and
System Designed for Power and Extensibility. In: Proc. Intl. Conf. on Information
and Knowledge Management (CIKM 2006). (2006) 337–346
17. Prud’hommeaux, E., Seaborne, A.: SPARQL Query Language for RDF.
http://www.w3.org/TR/rdf-sparql-query/
18. Gutierrez, C., Hurtado, C., Mendelzon, A.O.: Foundations of Semantic Web
Databases. In: Proc. ACM Symp. on Principles of Database Systems (PODS 2004).
(2004) 95–106
19. Pérez, J., Arenas, M., Gutierrez, C.: Semantics and Complexity of SPARQL. In:
Proc. Intl. Semantic Web Conf. (ISWC 2006). (2006) 30–43
20. Cyganiak, R.: A Relational Algebra for SPARQL. Technical report, HP-Labs
21. Haase, P., Broekstra, J., Eberhart, A., Volz, R.: A Comparison of RDF Query
Languages. In: Proc. Intl. Semantic Web Conf. (ISWC 2004). (2004) 502–517
22. Carroll, J.J., Bizer, C., Hayes, P.J., Stickler, P.: Named graphs, provenance and
trust. In: WWW. (2005) 613–622
23. Cyganiak, R., Harth, A., Hogan, A.: N-quads: Extending n-triples with context.
http://sw.deri.org/2008/07/n-quads/ (2008)
24. Connolly, D., et al.: Gleaning Resource Descriptions from Dialects of
Languages (GRDDL) - W3C Recommendation. Available on the Web at
http://www.w3.org/TR/grddl/ (11 September 2007)
25. Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: C-SPARQL:
SPARQL for continuous querying. In: WWW. (2009) 1061–1062
26. Bolles, A., Grawunder, M., Jacobi, J.: Streaming SPARQL – Extending SPARQL
to Process Data Streams. In: Proc. Europ. Semantic Web Conf. (ESWC 2008).
(2008) 448–462
27. Volz, R., Staab, S., Motik, B.: Incrementally maintaining materializations of
ontologies stored in logic databases. J. Data Semantics 2 (2005) 1–34