<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards A Cache-Enabled, Order-Aware, Ontology-Based Stream Reasoning Framework</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rui Yan</string-name>
          <email>yanr2@rpi.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brenda Praggastis</string-name>
          <email>Brenda.Praggastis@pnnl.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Deborah L. McGuinness</string-name>
          <email>dlm@cs.rpi.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>William P. Smith</string-name>
          <email>William.Smith@pnnl.gov</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Pacific Northwest National Laboratory</institution>
          ,
          <addr-line>Richland, WA</addr-line>
          ,
          <country>USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Pacific Northwest National Laboratory</institution>
          ,
          <addr-line>Richland, WA</addr-line>
          ,
          <country>USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Tetherless World Constellation, Department of Computer Science</institution>
          ,
          <institution>Rensselaer Polytechnic Institute</institution>
          ,
          <addr-line>Troy, NY</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p />
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>ABSTRACT</title>
      <p>While streaming data have become increasingly
popular in business and research communities, semantic models
and processing software for streaming data have not kept
pace. Traditional semantic solutions have not addressed
transient data streams. Semantic web languages (e.g., RDF,
OWL) have typically addressed static data settings, and linked
data approaches have predominantly addressed static or
growing data repositories. Streaming data settings have some
fundamental differences; in particular, data are consumed
on the fly and data may expire.</p>
      <p>Stream reasoning, a combination of stream processing and
semantic reasoning, has emerged with the vision of
providing "smart" processing of streaming data. C-SPARQL is
a prominent stream reasoning system that handles
semantic (RDF) data streams. Many stream reasoning systems,
including C-SPARQL, use a sliding window and use data
arrival time to evict data. For data streams that include
expiration times, a simple arrival time scheme is inadequate
if the window size does not match the expiration period.</p>
      <p>In this paper, we propose a cache-enabled, order-aware,
ontology-based stream reasoning framework. This
framework consumes RDF streams with expiration timestamps
assigned by the streaming source. Our framework utilizes
both arrival and expiration timestamps in its cache eviction
policies. In addition, we introduce the notion of
"semantic importance", which aims to address the relevance of data
to the expected reasoning, thus enabling the eviction
algorithms to be more context- and reasoning-aware when
choosing what data to maintain for question answering. We
evaluate this framework by implementing three different
prototypes and utilizing five metrics. The trade-offs of deploying
the proposed framework are also discussed.</p>
      <p>Copyright is held by the author/owner(s).
WWW2016 Workshop: Linked Data on the Web (LDOW2016).</p>
    </sec>
    <sec id="sec-2">
      <title>Categories and Subject Descriptors</title>
      <p>C.1.3 [Other Architecture Styles]: Data-flow
architectures - stream reasoning; D.2.11 [Software Architectures]:
Patterns</p>
    </sec>
    <sec id="sec-3">
      <title>General Terms</title>
    </sec>
    <sec id="sec-4">
      <title>INTRODUCTION</title>
      <p>
        Streaming data are increasingly pervasive on the web;
however, many semantic applications may not be well aligned
with the requirements of streaming data applications[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
The Semantic Web[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] architecture has traditionally
concentrated on storing and linking the web of data, rather than
on managing rapidly changing data streams that become
obsolete over time[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Although capable of processing data
streams at large scale with a high velocity data rate, the
primitive operations in the data-stream management
systems[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] have not typically addressed the extraction of
hidden knowledge via complex reasoning. In 2009, E. Della
Valle et al.[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] introduced the research area of stream
reasoning with the aim to bridge the gap between semantic
reasoning and stream processing. Stream reasoning can
provide many benefits in application areas that demand
generation and analysis of data streams. Examples include smart
cities[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ][
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], social networks[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and financial market data
feeds[
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. These data streams often have diverse conceptual
models and physical formats and thus pose challenges for
effective semantic processing and reasoning.
      </p>
      <p>The best practices for linking static information on the
web have inspired several solutions to face these challenges.
Examples include extending RDF and Linked Data
principles to model data streams, extending SPARQL to
continuously process RDF streams, and implementing efficient
stream reasoning systems.</p>
    </sec>
    <sec id="sec-5">
      <title>RDF streams</title>
      <p>
        In 2009, E. Della Valle et al.[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] envisioned two alternative
RDF stream formats, namely the RDF molecules stream and
RDF statements stream. The former is an infinite number
of pairs &lt;ρ, τ&gt;, where ρ is an RDF molecule[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], and
τ is a timestamp denoting the arrival time of ρ; the latter
is a special case of the former, where ρ contains only one
statement.
      </p>
      <p>
        D. F. Barbieri et al.[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposed an approach to publish
data streams as Linked Data. In this work, an RDF stream
is defined as an ordered sequence of pairs, each of which
consists of an RDF triple and a monotonically non-decreasing
timestamp τ.
      </p>
      <p>
        An RDF stream[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is identified by a unique IRI that is
a streaming source locator, and is published in a named
graph[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Each named graph is given an IRI that is
designed following the guidelines of Cool URIs[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and best
practices on how to publish Linked Data on the Web[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>RDF streams are generated and consumed in a certain
order. This order can be a natural time-based order, or
it may use another ranking criterion such as importance or
precision. Thus, semantic RDF Stream Processing (RSP)
systems should be able to manage data with rank-awareness
and time sensitivity.</p>
      <p>Existing RDF Stream Processing Systems</p>
      <p>
        C-SPARQL[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], among the initial RSP<sup>1</sup> languages, is a
continuous extension of standard SPARQL. It is tailored
to semantically process data streams and facilitate
reasoning. A C-SPARQL query is registered in the form of either a
stream or a query, prior to the arrival of data streams. Its
execution model is inherited from CQL[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], including
the stream-to-relation, relation-to-relation and
relation-to-stream operators. Its built-in translator will translate a C-SPARQL
query into static and dynamic parts, and execute them
separately. The C-SPARQL engine can be used as a linked data
stream publisher[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Other works such as EP-SPARQL[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], TrOWL[
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] and Stream
SPARQL[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] are either extensions of SPARQL from different
angles or are built from scratch to fulfill the purpose
of continuously processing and reasoning on data streams.
The IMaRS[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] algorithm was proposed by the
authors of C-SPARQL and focuses on inferred statement
management (mainly statement deletions) in a time-based
sliding window. This algorithm assigns an expiration
timestamp to each RDF statement entering the window,
and labels and updates all related inferences with this
expiration timestamp.
      </p>
      <sec id="sec-5-1">
        <title>1 https://www.w3.org/community/rsp/</title>
        <p>The expiration timestamp is the time
when the explicit data exit the window, and is calculated by
adding the data arrival timestamp and the window size. A
deletion is triggered when the original explicitly stated data
exit the window, and both explicit and inferred statements
are deleted. However, IMaRS is not adequate to process
RDF streams with source-assigned expiration timestamps,
because it cannot control when the data expire and lacks
the ability to delete data that expire before exiting the
window. Two problems follow. First, the First
In First Out (FIFO) eviction strategy can evict unexpired
data (data with a valid period longer than the window size)
that still have the potential to contribute to reasoning
and query results. Second, the FIFO eviction strategy can let expired
data (data with a valid period shorter than the window size)
generate invalid reasoning and query results in the window.
Thus, not only the arrival order but also the expiration
order of RDF streams has a large effect on the RSP outputs.</p>
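        <p>The two problems can be illustrated with a minimal sketch. It is not the IMaRS algorithm; it only models a time-based window that drops an item at arrival time plus window size. The names Item, window_contents and classify are hypothetical, and treating an item as expired once its expiration time is less than or equal to the current time is an assumption.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A streamed item with an arrival time and a source-assigned expiration."""
    name: str
    arrival: int
    expiration: int


def window_contents(items, now, window_size):
    """Items a time-based sliding window still holds at time `now`.

    An item leaves the window at arrival + window_size,
    regardless of its own expiration time.
    """
    return [i for i in items if i.arrival <= now < i.arrival + window_size]


def classify(items, now, window_size):
    """Return (evicted but unexpired, kept but expired) item names."""
    kept = window_contents(items, now, window_size)
    arrived = [i for i in items if i.arrival <= now]
    # Problem 1: still valid, but already pushed out of the window.
    evicted_unexpired = [i.name for i in arrived
                         if i not in kept and i.expiration > now]
    # Problem 2: already expired, but still inside the window.
    kept_expired = [i.name for i in kept if i.expiration <= now]
    return evicted_unexpired, kept_expired


items = [
    Item("long_lived", arrival=0, expiration=100),  # valid period > window size
    Item("short_lived", arrival=8, expiration=9),   # valid period < window size
]
result = classify(items, now=10, window_size=5)
print(result)
```
]]></preformat>
        <p>With a window size of 5 at time 10, the long-lived item has left the window although it is still valid, and the short-lived item is still in the window although it has expired.</p>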
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Assumptions and Contributions</title>
      <p>This paper proposes a cache-enabled, order-aware,
ontology-based stream reasoning framework. This framework is able
to process and reason across dynamic streaming data and
provide correct results. It uses a background ontology that
describes the domain knowledge in which the streaming data
are interpreted. It also leverages a data cache and a set of
order-aware data management strategies. Our framework is
built under the following assumptions.</p>
      <p>The background ontology is provided by domain
experts and does not change during processing.</p>
      <p>The streaming sources encapsulate the streaming data
in unique named graphs and assign expiration
timestamps.</p>
      <p>An arrival timestamp is assigned to each streaming
graph by the framework.</p>
      <p>Under these assumptions, we list the following contributions.</p>
      <p>We leverage a data cache and order-aware algorithms<sup>2</sup>
in a stream reasoning context to manage and process
RDF streams.</p>
      <p>We define semantic importance as a ranking strategy
and show its value in distinguishing a cache from a
window and in facilitating order-awareness and data management
in the cache.</p>
      <p>We implement three prototypes<sup>3</sup> based on off-the-shelf
triplestores, and evaluate them under different cache
configurations<sup>4</sup> from the following aspects:
(i) the runtime of query, reasoning, reasoning
explanation and data eviction;
(ii) the statistics of precision, recall and F-measure.</p>
      <p>We discuss the trade-offs of deploying our framework in
different scenarios where small, medium and large RDF
streams are processed.</p>
      <sec id="sec-6-1">
        <title>2 We introduce them in detail in Section 3. 3 https://github.com/raymondino/CacheStreamReasoning 4 We cover cache configurations in Section 3.</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>APPROACH</title>
      <p>
        We use the cache as our framework's central component to
manage the RDF streams. Similar to what a window does in
other RSP settings, the cache works as a stream-to-relation
(S2R) operator[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], isolating a portion of the unbounded RDF
streams. However, crucial differences between them lie in
both their data consumption rules and data eviction
strategies. The window consumes RDF streams by sliding along
them. This restricts data eviction to FIFO only; that is,
the window can only move forward by deleting some old
data. The cache stays in place and is fed by RDF streams. It
has the flexibility to utilize semantics (the background ontology
and any processing results) to inform its eviction and,
potentially, its consumption. All cached RDF streams can be ranked
using different types of criteria. Temporal orders (arrival
and/or expiration) are most common, although other
considerations including precision, popularity, trust, certainty,
and provenance have been proposed [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. We can leverage
all of these rankings in our data eviction policies and in
addition, we add reasoning perspectives to the criteria for
ranking.
      </p>
      <p>FIFO can be realized by ranking arrival timestamps. First
Expired First Out (FEFO) can be realized by ranking
expiration timestamps to guarantee valid results. Semantic
ranks can also be provided from the reasoning and/or query
participation status of the data.</p>
      <p>To express these semantic ranks, we
introduce the notion of semantic importance. For each
named graph in the cache, its semantic importance (SI) is
defined as an indicator that measures its contribution to the
reasoning and/or query results. The cache ranks all named
graphs according to their SI. This notion is intentionally
abstract so as not to limit its application in different data
eviction strategies, each of which specifies what counts as a named graph's
contribution. For every named graph in FIFO, its SI is based on
its arrival timestamp. For every named graph in FEFO, its
SI is based on its expiration timestamp. The cache also uses the
Least-Frequently-Used (LFU) and Least-Recently-Used (LRU) algorithms
to semantically rank the data. In LFU,
SI is embodied as the total reasoning-participation frequency
counter of a named graph. In LRU, SI is
embodied as the most recent reasoning-participation
timestamp of a named graph. SI can also be composite in
scenarios where multiple data eviction strategies are applied.</p>
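      <p>The mapping from eviction strategy to SI described above can be sketched as follows. This is an illustrative sketch, not the prototypes' code: the NamedGraph class, its field names and the rank function are hypothetical, and combining several strategies into a tuple that is compared lexicographically is only one possible reading of a composite SI.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class NamedGraph:
    graph_id: str
    arrival: float          # assigned by the framework
    expiration: float       # assigned by the streaming source
    last_used: float = 0.0  # most recent reasoning-participation timestamp
    use_count: int = 0      # total reasoning-participation frequency counter


# One SI function per eviction strategy; a smaller SI means "evict first".
SI = {
    "FIFO": lambda g: g.arrival,
    "FEFO": lambda g: g.expiration,
    "LRU": lambda g: g.last_used,
    "LFU": lambda g: g.use_count,
}


def composite_si(graph, strategies):
    """Composite SI as a tuple, compared lexicographically (an assumption)."""
    return tuple(SI[name](graph) for name in strategies)


def rank(graphs, strategies):
    """Return graph ids ordered from least to most semantically important."""
    ordered = sorted(graphs, key=lambda g: composite_si(g, strategies))
    return [g.graph_id for g in ordered]


graphs = [
    NamedGraph("g1", arrival=1, expiration=50, last_used=5, use_count=3),
    NamedGraph("g2", arrival=2, expiration=20, last_used=9, use_count=1),
    NamedGraph("g3", arrival=3, expiration=20, last_used=0, use_count=0),
]
print(rank(graphs, ["FEFO", "LFU"]))
```
]]></preformat>
      <p>In the composite example, g2 and g3 expire at the same time, so the reasoning-participation counter breaks the tie and g3 is ranked as least important.</p>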
      <p>
        We have discussed above that there are distinctions
between a window and a cache, and SI is among the keys to
distinguish them. SI is mentioned in several published
papers but never as a core topic. In[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], the authors use SI
as an attribute of interface objects to improve the design of
traditional GUIs in the Computer Human Interaction (CHI)
domain. In[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], the authors use SI to filter Snooker balls in
order to reconstruct the game in 3D for analysis. In[
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], the
authors use SI to differentiate the semantic components in a
semantic-linked network. Nonetheless, none of these works
formally defines SI. This, together with the value that SI
provides in the stream reasoning context, motivates us
to define it for stream reasoning.
      </p>
      <p>
        Our framework utilizes background knowledge encoded in
OWL[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] when processing RDF streams. The ontology is
provided by domain experts and loaded into the cache prior to data arrival.
It describes the specific domain
knowledge necessary to interpret the RDF streams. However, its
expressiveness should be considered together with the query
and the ability of the deployed reasoner. If the ontology is in a
description logic (DL) profile and the query requires DL reasoning,
but the reasoner only provides RDFS reasoning, then the
goal will not be achieved. Generally speaking, the purpose
of the ontology is to provide enough domain background
information to enable reasoning, which helps the query
answer questions that cannot be answered directly from the
explicit data.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Framework Architecture</title>
      <p>Our framework consists of four sequential components:
data consumption, querying and reasoning, reasoning
explanation and data eviction. These four components are
executed in order. The execution flow forms a loop to process
data streams and to produce query results in a continuous
fashion, as shown in Figure 1.</p>
      <sec id="sec-8-1">
        <title>Data Consumption Component</title>
        <p>In this work, we extend the RDF stream format by adding
another timestamp and a unique graph ID. An RDF stream
is represented by &lt;ρ, a, e, G&gt;, where ρ denotes the RDF
molecule/statement, a denotes its arrival timestamp, e
denotes its expiration timestamp and G denotes its unique
graph ID. The RDF streams are first consumed by the Data
Consumption Component (DCC) at Stage 1. DCC works
like a Storm spout. When the cache is not full, DCC sends
RDF streams to it. When the cache is full, DCC stops
sending, and the incoming RDF streams wait until the cache next
opens. The data structure in the cache is a minimum heap,
which enables the cache to be order-aware. The key feature
of the minimum heap is that it always keeps the smallest
element at the top. Reading the top element takes O(1)
time; after it is removed, the minimum heap takes O(log n) time to restore
its top with the smallest element among the remaining ones.
If we combine this feature with SI, the least semantically
important data will be at the top and can be evicted easily.
The larger the SI, the more important the data and the
less likely they are to be evicted.</p>
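        <p>A minimal sketch of such an order-aware cache, using the binary heap from Python's standard heapq module, is given below. The class HeapCache and its method names are hypothetical and do not come from the prototypes; the SI is assumed to be a single comparable number.</p>
        <preformat><![CDATA[
```python
import heapq


class HeapCache:
    """Fixed-capacity cache ordered by SI; the least important graph is on top."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # entries are (si, graph_id) tuples

    def is_full(self):
        return len(self._heap) >= self.capacity

    def offer(self, graph_id, si):
        """Insert a graph if there is room; return False when the cache is full."""
        if self.is_full():
            return False  # the consumer stops sending until space is freed
        heapq.heappush(self._heap, (si, graph_id))
        return True

    def peek_least_important(self):
        """Read the top of the heap without removing it (constant time)."""
        return self._heap[0][1]

    def evict_least_important(self):
        """Remove and return the graph with the smallest SI (logarithmic time)."""
        return heapq.heappop(self._heap)[1]


cache = HeapCache(capacity=3)
accepted = [cache.offer("g1", 50), cache.offer("g2", 20), cache.offer("g3", 35)]
rejected = cache.offer("g4", 10)          # cache is full, so this is refused
evicted = cache.evict_least_important()   # g2 has the smallest SI
readmitted = cache.offer("g4", 10)        # space is available again
print(accepted, rejected, evicted, readmitted)
```
]]></preformat>
        <p>The refusal of g4 while the cache is full mirrors the DCC behavior of pausing the stream until the next cache opening.</p>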
        <p>The arriving data are immediately ranked by the cache
and are ready for the next step.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Querying and Reasoning Component</title>
        <p>When the cache is full, the processing moves to Stage 2,
the Querying and Reasoning Component. The query in the
stream reasoning scenario should be continuous in order to
provide proactive answers. This requires the query to be
pre-registered in the system before the data arrive. We use a
standard SPARQL query that is executed continuously<sup>5</sup>
in the framework, as shown in Figure 1.</p>
        <p>In our framework, reasoning happens at query
time. This has the advantage that only the entailments
necessary for the answer are computed. However,
reasoning does not have to
happen at the same time as the query. One example is
materialization that is performed iteratively: the materialized
snapshot of the database is updated whenever
new data arrive.</p>
        <p><sup>5</sup> This use of continuous standard SPARQL is another key
difference from C-SPARQL, EP-SPARQL and other extended
SPARQL languages, as the latter require different execution models
and syntaxes, so the learning curve might be steep for
users.</p>
        <p>[Figure 1 (diagram; labels recovered from the extracted text). RDF streams &lt;ρ1, a1, e1, G1&gt; ... &lt;ρi, ai, ei, Gi&gt; enter at Stage 1, the Data Consumption Component. The cache (cache size = n, the number of graphs) holds the named graphs &lt;http://streamreasoning/graphID1&gt; ... &lt;http://streamreasoning/graphIDn&gt;, each with its tuple &lt;ρ, a, e&gt;. When the cache is full, Stage 2, the Querying and Reasoning Component, reasons over the cache and the background ontology and outputs query results. When reasoning is done, Stage 3, the Reasoning Explanation Component, performs reasoning explanation and re-ranking according to the order-aware data management strategy. When re-ranking is done, Stage 4, the Data Eviction Component, evicts RDF streaming data &lt;ρj, aj, ej, Gj&gt; ... &lt;ρk, ak, ek, Gk&gt;.]</p>
        <p>After the query results are delivered, the cache needs
to know the reasoning-participation status of every named
graph. In Stage 3, the Reasoning Explanation Component
traces back the provenance of each
inference; that is, it finds which cached named graphs
participated in the reasoning process. The reasoning is explained
by proof trees. A resulting inferred statement can be
entailed by both explicit triples and intermediate inferences.
Only the graph IDs and reasoning-participation of explicit triples
are recorded. This provides the foundation for collecting
reasoning-participation statistics for each of the
named graphs. The statistics differ under different
ranking algorithms, and we cover the details in the Data
Eviction Component.</p>
      </sec>
      <sec id="sec-8-3">
        <title>Data Eviction Component</title>
        <p>This component evicts the named graphs with small SI
from the cache. We have applied three ranking algorithms
shown in Table 1. One and only one ranking algorithm is
applied in the cache at one time to determine which named
graph needs to be evicted. In each framework execution
cycle, every named graph's SI is updated by the statistics
collected, as Figure 2 shows.</p>
        <p>FEFO collects one statistic: the expired data count, Ed.
Ed is incremented by one for each named graph that has expired. The SI
under FEFO is the expiration timestamp, which does not
change. LRU collects two statistics: Ed and the most
recent reasoning-participation timestamp, mrrp, of
every named graph. The SI is updated by incrementing Ed
by one for each expired named graph, and replacing every
valid named graph's old mrrp with the corresponding new
mrrp. LFU collects two statistics: Ed and every named
graph's reasoning-participation frequency count, rpfc, in
the latest cycle. The SI is updated as follows:
Ed is incremented by one for each expired named graph, and the rpfc of every
valid named graph is added to its total reasoning-participation frequency
counter, Ctrpf.</p>
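        <p>The per-cycle bookkeeping for LRU and LFU can be sketched as below. This is a hedged illustration only: the proof-tree shape (a dictionary with either a "graph" key for an explicit triple or a "premises" list for an inferred statement) is hypothetical and is not Stardog's proof format, and counting a graph once per leaf occurrence is an assumption.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def participating_graphs(proof):
    """Collect graph ids of explicit (leaf) triples in a proof tree."""
    if "graph" in proof:
        return [proof["graph"]]
    found = []
    for premise in proof.get("premises", []):
        found.extend(participating_graphs(premise))
    return found


def update_statistics(stats, proofs, now, expired):
    """Apply one cycle of LRU/LFU bookkeeping and return the expired count Ed."""
    rpfc = Counter()  # reasoning-participation frequency counts for this cycle
    for proof in proofs:
        rpfc.update(participating_graphs(proof))
    for graph_id, count in rpfc.items():
        if graph_id in expired or graph_id not in stats:
            continue  # only valid cached graphs are updated
        stats[graph_id]["ctrpf"] += count  # LFU: total participation counter
        stats[graph_id]["mrrp"] = now      # LRU: most recent participation
    return len(expired)


stats = {
    "g1": {"ctrpf": 2, "mrrp": 3},
    "g2": {"ctrpf": 0, "mrrp": 0},
    "g3": {"ctrpf": 5, "mrrp": 4},
}
proofs = [
    {"premises": [{"graph": "g1"},
                  {"premises": [{"graph": "g2"}, {"graph": "g1"}]}]},
]
ed = update_statistics(stats, proofs, now=7, expired={"g3"})
print(ed, stats)
```
]]></preformat>
        <p>Intermediate inferences contribute nothing themselves; only the graphs of the explicit triples at the leaves are counted, in line with the Reasoning Explanation Component.</p>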
        <p>The cache re-ranks the data immediately after the SI is
updated. We introduce a parameter called eviction amount,
Ea, to guarantee that a minimum percentage of the cache is
available for the insertion of new data.<sup>6</sup> When a composite SI,
such as expiration timestamp plus reasoning-participation
frequency, is applied, it is possible that data with late
expiration timestamps do not participate in the reasoning at all.
While valid, these data are less semantically important, and
the framework should make them available for eviction
until the actual amount of data evicted is equal to Ea.</p>
        <p>Figure 3 shows the data eviction procedure in this
component. The cache first deletes all data in the expired data
pool (that is, Ed data). Then Ea and
Ed are compared. If Ea is greater, the cache continues
by deleting the top (Ea - Ed) data. Otherwise, no further data are
evicted.</p>
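        <p>The procedure can be sketched in a few lines. This is a simplified illustration rather than the prototypes' implementation: the function evict and its parameters are hypothetical, Ea is passed as a fraction of the cache size Cs and truncated to a whole number of graphs (an assumption), and every expired id is assumed to be present in the ranked list.</p>
        <preformat><![CDATA[
```python
def evict(ranked_ids, expired_ids, cache_size, eviction_fraction):
    """One eviction pass over the cache.

    ranked_ids: graph ids ordered from least to most semantically important.
    expired_ids: ids of graphs whose expiration timestamp has passed (Ed of them).
    eviction_fraction: Ea expressed as a fraction of the cache size Cs.
    Returns (evicted_ids, remaining_ids).
    """
    expired = set(expired_ids)
    ea = int(cache_size * eviction_fraction)
    ed = len(expired)
    # Step 1: always drop everything in the expired data pool.
    evicted = [g for g in ranked_ids if g in expired]
    remaining = [g for g in ranked_ids if g not in expired]
    # Step 2: if Ea exceeds Ed, drop the (Ea - Ed) least important valid graphs.
    extra = max(ea - ed, 0)
    evicted += remaining[:extra]
    remaining = remaining[extra:]
    return evicted, remaining


ranked = ["g3", "g2", "g1", "g4"]  # g3 is the least important
few_expired = evict(ranked, ["g1"], cache_size=4, eviction_fraction=0.5)
many_expired = evict(ranked, ["g3", "g2", "g1"], cache_size=4, eviction_fraction=0.5)
print(few_expired, many_expired)
```
]]></preformat>
        <p>In the first call Ea is 2 and Ed is 1, so one additional valid graph is evicted; in the second call Ed already exceeds Ea, so only the expired graphs are removed.</p>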
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Framework Implementation</title>
      <p>We have implemented our framework using two off-the-shelf
triplestores, AllegroGraph V5.0.2<sup>7</sup> and Stardog 4.0 RC3<sup>8</sup>.
Both provide Java APIs and step-by-step tutorials.
AllegroGraph is a modern and efficient graph database and a
commercial product of Franz Inc<sup>9</sup>. It supports up to OWL2 RL
reasoning and full SPARQL 1.1. Stardog is a graph database
by Complexible Inc<sup>10</sup>. It supports up to OWL2 DL and
rule-based reasoning, and SPARQL 1.1. We use the free versions
of both products.</p>
      <p>Stardog has an important feature - the ability to support
reasoning explanation. An entailment is explained by
tracking back to the original asserted statements, which form a
proof tree that includes all the statements involved during
the reasoning. Stardog also supports merging proof trees
because the same inferred statements can be generated
utilizing different reasoning paths. This allows us to analyze and
rank the reasoning-participation of each data item in the cache.
Unfortunately, AllegroGraph does not provide similar
functionality, so we can only manage data under the FEFO
strategy with it.</p>
      <sec id="sec-9-1">
        <title>6 We cover the details of Ea in Section 3.</title>
        <p>7 http://franz.com/agraph/allegrograph/
8 http://stardog.com/
9 http://franz.com/
10 http://complexible.com/</p>
        <p>The cache is implemented in the triplestore. It has a fixed
size, Cs, limiting the maximum amount of data it contains.
The eviction amount, Ea, as mentioned in Section 2, denotes
the least amount of data to be deleted in each framework
execution cycle. The SPARQL DROP operation is used
to fully remove named graphs from the cache in the data
eviction component. We avoid using SPARQL DELETE
because it only removes the statements in the graphs, not
the graph ids, which would pollute the cache.</p>
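        <p>For illustration, the two kinds of update can be written as SPARQL 1.1 Update strings. The helper names below are hypothetical, the SILENT keyword is an added assumption so that dropping an already absent graph does not raise an error, and sending the strings to a triplestore is omitted because client APIs differ. Whether an emptied graph id remains visible after a DELETE depends on the store; the behavior described above is the one reported for the triplestores used here.</p>
        <preformat><![CDATA[
```python
def drop_graph_update(graph_iri):
    """SPARQL 1.1 Update that removes a named graph as a whole."""
    return f"DROP SILENT GRAPH <{graph_iri}>"


def delete_statements_update(graph_iri):
    """SPARQL 1.1 Update that deletes only the triples inside the graph."""
    return f"DELETE WHERE {{ GRAPH <{graph_iri}> {{ ?s ?p ?o }} }}"


def eviction_request(graph_iris):
    """Combine one DROP per evicted graph into a single update request."""
    return " ;\n".join(drop_graph_update(iri) for iri in graph_iris)


request = eviction_request([
    "http://streamreasoning/graphID1",
    "http://streamreasoning/graphID2",
])
print(request)
```
]]></preformat>
        <p>Batching the DROP operations into one request, separated by semicolons, is a design choice of this sketch and keeps each eviction cycle to a single round trip.</p>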
        <p>Stardog supports both memory-based and disk-based databases,
while AllegroGraph only supports disk-based databases. We
have implemented three prototypes: Stardog
memory-based, Stardog disk-based and AllegroGraph disk-based cache
stream reasoning systems. As already mentioned,
the reasoning abilities of these two triplestores are different.
In order to perform a fair comparison, we use a query that
only requires RDFS reasoning<sup>11</sup>.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>EVALUATION</title>
      <p>
        The stream reasoning community has not yet come to
consensus on the best method to evaluate stream reasoning
applications. In 2012 SRbench[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] was proposed as a general
benchmark system designed to test streaming RDF/SPARQL
engines. In the same year, LSBench[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] was proposed to
focus on assessing different Linked Stream Data (LSD)
applications' capabilities. The following year CSRBench[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] was
proposed with a special emphasis on the effects of operation
semantics for stream reasoning applications. More recently
CityBench[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] was proposed to target smart city applications.
We chose the LUBM benchmark[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] dataset because it is
easy to use and satisfies our needs. LUBM provides a
well-constructed ontology describing the relations among
universities, professors, students, etc. It also features a data
generator that accepts customized parameters to generate
arbitrary ABox data.
      </p>
      <p>We generated 6,031,109 ABox triples. A data source
generates streaming data by reading this generated data from disk,
line by line, from a static N-Triples file. We configured the
streaming throughput as follows:</p>
      <p>The streaming source packs either 1, 10 or 100 triples
per named graph, Tpg.</p>
      <p>
        Each graph is assigned a unique graph id and
expiration timestamp by the streaming source<sup>12</sup>.
      </p>
      <p>
11 The team used Stardog to test OWL-DL reasoning on
streaming data[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], but these results are not included within
the scope of this paper.
12 http://streamreasoning.org/slides/2015/10/sr4ld2015-02
      </p>
      <p>The framework assigns an arrival timestamp to each
arriving named graph in a monotonically non-decreasing
order.</p>
      <p>The streaming data are then streamed to our three
prototypes, where the cache is configured as follows:</p>
      <p>Cs is either 10, 100 or 1000 graphs.</p>
      <p>Ea is 25%, 50%, 75% or 100% of Cs.</p>
      <p>The pre-registered query, as shown in Figure 4,
requires RDFS reasoning.</p>
      <p>The background ontology is pre-loaded into the cache.</p>
      <p>Our evaluation platform runs the 64-bit
Ubuntu 14.04 LTS operating system on an Intel(R) Xeon(R) CPU
E5-2620 v2 @ 2.10GHz, with 2040MB of memory and a 16GB HDD.</p>
      <p>Using the generated ABox data and the provided
LUBM ontology, we conducted 252 experiments and obtained
a ground truth of 27,192 results as the basis for testing the
framework's correctness in terms of precision, recall and F-measure.
We also recorded each prototype's memory consumption and the
runtime of querying, reasoning explanation and data
eviction under different configuration combinations of streaming
data and cache. We did not record reasoning explanation
time for the FEFO strategy, or for other strategies with
Ea = 100% of Cs, because FEFO does not need
reasoning explanation, and it does not make sense to explain
reasoning when the whole cache will be dumped.</p>
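      <p>For reference, the three correctness metrics can be computed from a set of returned results and the ground truth as sketched below. The function name is hypothetical, and the balanced F-measure (F1) is assumed here because the weighting is not stated.</p>
      <preformat><![CDATA[
```python
def precision_recall_f(returned, ground_truth):
    """Return (precision, recall, F1) for a set of query results."""
    returned, ground_truth = set(returned), set(ground_truth)
    true_positives = len(returned & ground_truth)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


returned = {"a", "b", "c", "x"}                 # one result is invalid
ground_truth = {"a", "b", "c", "d", "e", "f"}   # three results were missed
p, r, f = precision_recall_f(returned, ground_truth)
print(p, r, f)
```
]]></preformat>
      <p>In this toy case an invalid result lowers precision, as expired data kept in the cache would, while missed results lower recall, as prematurely evicted valid data would.</p>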
      <p>Using these results we ask the following questions:</p>
      <p>1. What are the effects of different caches from different
producers (AllegroGraph vs. Stardog), types
(disk-based vs. memory-based) and configurations (different
combinations of Cs and Ea)?</p>
      <p>2. How do various strategies perform under different
combined configurations of the streaming data and cache
(currently we use the F-measure to evaluate
performance)?</p>
      <p>3. What are the trade-offs to consider when deploying our
framework in different scenarios?</p>
      <p>We show two of the forty-six visualizations<sup>13</sup>
generated from the results to answer the first two questions.
The third is thoroughly discussed in the next
section.</p>
      <p>12 (continued) rsp-extensions.pdf, Page 7, from "Streaming Reasoning for
Linked Data 2015" by J-P Calbimonte, D. Dell'Aglio, E.
Della Valle, M. I. Ali and A. Mileo.</p>
      <p>For convenience, we use the following
abbreviations in the rest of this paper: SM denotes the Stardog memory-based cache,
SD denotes the Stardog disk-based cache and AD denotes the
AllegroGraph disk-based cache. Each test case is labeled as &lt;
prototype abbreviation &gt; &lt; data management strategy &gt; &lt;
streaming data configuration &gt;. For example, SD FEFO 1
denotes a Stardog disk-based cache that applies the FEFO
strategy to process RDF streams with 1 triple per graph.</p>
      <p>Figure 5 shows the F-measure obtained with
caches of different producers, types and configurations.
Several facts are readily apparent:</p>
      <p>The F-measure increases as the streaming throughput
increases.</p>
      <p>The F-measure at an eviction amount of 50% is always best, and at 100% it is
always worst, for all visualized cases.</p>
      <p>AD FEFO performs similarly to the others when
streaming throughput is 1 triple per graph, but outperforms them
as the streaming throughput increases.</p>
      <p>13 Please refer to our GitHub repository for all visualizations.</p>
      <p>SD FEFO and SM FEFO compete with each other in
each test run.</p>
      <p>This partially answers the first question:
different producers do affect the F-measure performance,
but different types do not have a significant influence. The
greater the streaming throughput, the greater the influence
on the F-measure. However, to give a
thorough answer we need to look at other metrics before
assessing the overall performance of these caches. For example,
AD FEFO gives the best F-measure, but does it take more
time to execute the query and data eviction?</p>
      <p>Figure 6 shows the F-measure performance of different
strategies for the SM cache with different streaming data
configurations. The observed facts are:</p>
      <p>The F-measure score increases as the streaming
throughput increases.</p>
      <p>A 50% eviction amount gives the highest F-measure score, and
100% gives the lowest, in all cases.</p>
      <p>For the same streaming data configuration, LFU
always performs best, followed by LRU and then FEFO.</p>
      <p>These observations answer the second question from the
perspective of a large cache size. Nevertheless, does cache size
affect these strategies' performance? Does it take longer
to explain all of the inferences? If so, is it worthwhile
to sacrifice system responsiveness for a better F-measure?
Additional observations are as follows:</p>
      <p>F-measure score: Our raw experimental results
show that the F-measure increases as the cache size
increases. The F-measure score is also affected by
the choice of triplestore. We believe this is because
AllegroGraph's and Stardog's inner processing engines and
mechanisms are different<sup>14</sup>.</p>
      <p>Memory consumption: when Cs = 10, cases with Tpg =
1 require on average 3 times the memory of cases configured with larger
streaming throughput. When Cs = 100, the
overall memory consumption decreases as the eviction amount
increases. When Cs = 1000, the overall memory
consumption increases as the eviction amount increases.
However, within this evaluation, no significant
differences are observed between the memory-based cache and
the disk-based cache. Cases with a small cache and low streaming
throughput usually require more memory. The average memory consumed by the FEFO
strategy is 41.93MB, by LRU 40.4MB
and by LFU 39.59MB.</p>
      <p>Query time: AD requires the most query time in
all cases. SD requires less time than AD but more
than SM. The query time increases as cache size and
streaming throughput increase. One potential reason
is that a disk-based cache needs some I/O
time when executing a query, which takes longer
than for a memory-based cache. The average query time for
all FEFO cases is 18ms, for LRU 15ms and for LFU 13ms.</p>
      <p>Explanation time: reasoning explanation time increases
approximately linearly as cache size and streaming
throughput increase. In most cases, LFU takes longer to
explain. The average explanation time for all
LFU cases is 90371ms, and for LRU 86100ms.</p>
      <p>14 Exploration and explanation of triplestores' inner
implementations are out of the scope of this paper.</p>
      <p>Eviction time: eviction time increases as eviction amount,
cache size and streaming throughput increase. AD
eviction time is very fast, 30ms on average. SD
averages 2879ms and SM averages 4042ms.</p>
    </sec>
    <sec id="sec-11">
      <title>DISCUSSION</title>
      <p>Depending on the cache and streaming data configurations,
there can be 10, 100, 1,000, 10,000 or 100,000 triples
in the cache during one processing loop. We identify a small
case where 10 or 100 triples are processed, a medium case
where 1,000 or 10,000 triples are processed, and a large case
where 100,000 triples are processed. Together with Table 2,
we present a thorough comparison among triplestores, cache
types and data management strategies under these
scenarios.</p>
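      <p>The grouping into small, medium and large cases can be written down as a simple lookup. The sketch below only encodes the five triple counts named above; the function name scenario_for is hypothetical, and other counts are rejected rather than guessed.</p>

```python
# Triple counts per processing loop, grouped as described in the text.
SCENARIOS = {
    10: "small",
    100: "small",
    1000: "medium",
    10000: "medium",
    100000: "large",
}


def scenario_for(triples_in_cache):
    """Map the number of cached triples in one processing loop to a scenario name."""
    if triples_in_cache not in SCENARIOS:
        raise ValueError("no scenario defined for {} triples".format(triples_in_cache))
    return SCENARIOS[triples_in_cache]


print([scenario_for(n) for n in (10, 1000, 100000)])
```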
      <p>We summarize our experimental results in Table 2 to help
answer the third question in the previous section, i.e., what
are the trade-offs to consider when deploying our framework
in different scenarios?</p>
      <p>In the small case, AllegroGraph's query and eviction
times are several times those of Stardog. Disk and memory caches
perform equally on F-measure, though disk requires more
time to query, evict and explain. LFU performs best on
F-measure, followed by LRU and then FEFO. Though FEFO needs
more time to query and evict, the explanation time required
by LRU and LFU is significantly greater. The trade-offs
in deploying our framework in the small scenario
depend on the use case. If system responsiveness is the
first-class citizen, a FEFO strategy should be chosen since it
does not require explanation and provides a fine F-measure.
A Stardog memory cache can be chosen since it provides faster
execution time and better F-measure. If F-measure is most
important, LFU is the right strategy, and a Stardog memory cache
is the best choice because a memory cache requires less explanation time.
We also note that Ea = 25% or 50% of Cs provides
a better F-measure.</p>
      <p>For medium cases, Stardog's eviction time increases
significantly. Though Stardog's query time is better than
AllegroGraph's, the difference is very small when compared
with the eviction time. The disk cache requires less eviction
time; its other metrics are similar to those of the memory cache. LFU
achieves the best F-measure scores, but at the cost of longer
explanation time. FEFO provides a decent F-measure
without any explanation time. Overall, an AllegroGraph disk cache
with FEFO is most suitable for this scenario. Ea = 50% or
75% of Cs provides the best F-measure.</p>
      <p>For large cases, AllegroGraph's eviction time, F-measure
and memory consumption are better, though its queries are slower. Disk
cache query time is 9 times greater than that of the memory-dependent
graph database, but the disk cache provides better F-measure,
explanation time, eviction time and memory consumption. FEFO performs best in
F-measure and has the smallest
eviction time and memory consumption, but it spends more time on queries. Hence, an
AllegroGraph disk cache with FEFO is most suitable for this
experiment's use case. We also recommend using Ea = 50%
of Cs to obtain the best F-measure.</p>
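      <p>The recommendations in this section can be condensed into a small decision helper. This is a sketch rather than part of our prototypes: the function name recommend, the priority keywords and the dictionary layout are hypothetical, and the returned values simply restate the choices discussed above for each scenario.</p>

```python
def recommend(scenario, priority="responsiveness"):
    """Return the cache, strategy and eviction amounts (percent of Cs) suggested in the discussion."""
    if scenario == "small":
        # Small case: the choice of strategy depends on what matters most.
        strategy = "LFU" if priority == "f-measure" else "FEFO"
        return {"cache": "Stardog memory", "strategy": strategy, "eviction_pct": (25, 50)}
    if scenario == "medium":
        return {"cache": "AllegroGraph disk", "strategy": "FEFO", "eviction_pct": (50, 75)}
    if scenario == "large":
        return {"cache": "AllegroGraph disk", "strategy": "FEFO", "eviction_pct": (50,)}
    raise ValueError("unknown scenario: " + scenario)


for name in ("small", "medium", "large"):
    print(name, recommend(name))
```

      <p>For the medium and large cases the priority argument is ignored, because the discussion settles on a single configuration for each of them.</p>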
    </sec>
    <sec>
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>
We have discussed the distinctions between a window and
a cache, highlighted some challenges with a simple sliding
window, and described the enhanced data eviction
flexibility that a cache is able to provide. We have defined
semantic importance as a ranking strategy and shown how it
can be exploited in a stream-reasoning context. We
implemented semantic importance with a range of settings and
evaluated them, laying the foundation for expanding the
range of data management strategies. We have also
presented our cache-enabled stream reasoning framework. By
leveraging a cache and a set of data management strategies,
this framework is able to consume RDF streams on the fly
while performing reasoning and answering standing queries.
The cached RDF streams are ranked according to their
semantic importance, and data are evicted when less important
or expired. We have implemented three prototypes of the
proposed framework, and evaluated them by emulating an
RDF stream generated by LUBM and time-stamped with
arrival and expiration times. We also discussed the trade-offs
of deploying our framework in different scenarios.</p>
      <p>In this work, our framework and prototypes are
evaluated in a synthetic setting. Future work includes
applying our framework to realistic data set benchmarks such as
SRbench, as well as to compelling use
cases such as smart cities<sup>15</sup>.</p>
      <p>Our current data eviction strategies are
domain-agnostic. We will also develop some domain-literate strategies
for our actual scenarios. (We have partially applied the
framework in one streaming NMR setting with one simple
domain-aware strategy, yielding promising results.) We
noticed that the reasoning environment in this work is limited
due to AllegroGraph's limited reasoning ability, and we would
like to explore more complex reasoning in OWL DL and/or
rule-based reasoning as future work as well.</p>
    </sec>
    <sec id="sec-12">
      <title>ACKNOWLEDGMENT</title>
      <p>The research described in this paper is part of the
Analysis in Motion Initiative at Pacific Northwest National
Laboratory. It was conducted under the Laboratory Directed
Research and Development Program at PNNL, a multi-program
national laboratory operated by Battelle for the U.S.
Department of Energy.</p>
      <p>We would like to thank Dr. Mark T. Greaves from PNNL
for his valuable thoughts and discussions that helped shape
the work and this paper. We also want to thank Dr. Emanuele
Della Valle from the Politecnico di Milano for his inspiring
discussions and encouragement of this work.</p>
      <p>15 We plan to use Dublin City datasets from http://dublinked.ie/</p>
      <p>Each value in the table is calculated from raw experimental results. Please refer to our GitHub repository for these raw results. In the
10 triples/cache scenario, for example, the disk-based cache's F-measure score is averaged from the AD FEFO, SD FEFO, SD LRU &amp; SD LFU
test cases, while the memory-based cache's score is averaged from the SM FEFO, SM LRU &amp; SM LFU test cases.</p>
      <p>We would like to highlight an empirical comparison of the two cache types, the disk-based cache (DC) and the memory-based cache
(MC).</p>
      <p>SPARQL query time: for every scenario, DC is slower than MC. As the scenario size increases, DC's query time grows much faster than MC's. The
ratio between 100,000 triples/cache and 10 triples/cache is 15.45 for DC, whilst for MC it is 3.34. Though DC is less restricted by storage
space, its query time slows down the system response time, and this effect will become worse as the scenario size grows. However,
this is expected, as accessing disk when executing a query is always slower than accessing memory.</p>
      <p>F-measure: for every scenario, DC's and MC's F-measure scores are very similar. This means both types are able to provide the same
level of correctness. The F-measure increases as the scenario size increases because the more data there is in the cache, the more correct results can
be calculated, which raises the F-measure.</p>
      <p>Reasoning Explanation Time: in most scenarios, MC's explanation time is longer than DC's. The difference increases significantly as
the scenario size increases. Reasoning explanation is very time-consuming. Strategies (such as LFU) requiring reasoning explanation
provide better F-measure when the scenario is small and explanation is quick, but provide similar F-measure when the scenario
is large and explanation is slow.</p>
      <p>Eviction Time: in small scenarios, DC is slower; in medium and large scenarios, MC is slower. The difference between DC and MC
increases significantly as the scenario size increases. This indicates a bias towards DC for large RDF streams, and MC for small
RDF streams.</p>
      <p>Memory Consumption: Both DC and MC consume similar memory for each scenario. This could be because Stardog triplestores
are implemented very efficiently, but again, explaining this requires knowledge of the inner mechanisms of Stardog, which is out of
the scope of this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mileo</surname>
          </string-name>
          .
          <article-title>CityBench: A Configurable Benchmark to Evaluate RSP Engines Using Smart City Datasets</article-title>
          .
          <source>In The Semantic Web–ISWC</source>
          <year>2015</year>
          , pages
          <fpage>374</fpage>
          –
          <lpage>389</lpage>
          . Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Anicic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fodor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rudolph</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Stojanovic</surname>
          </string-name>
          .
          <article-title>EP-SPARQL: a unified language for event processing and stream reasoning</article-title>
          .
          <source>In Proceedings of the 20th international conference on World wide web</source>
          , pages
          <fpage>635</fpage>
          –
          <lpage>644</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Arasu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Babu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>The CQL continuous query language: semantic foundations and query execution</article-title>
          .
          <source>The VLDB Journal - The International Journal on Very Large Data Bases</source>
          ,
          <volume>15</volume>
          (
          <issue>2</issue>
          ):
          <fpage>121</fpage>
          –
          <lpage>142</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Della</given-names>
            <surname>Valle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grossniklaus</surname>
          </string-name>
          .
          <article-title>C-SPARQL: SPARQL for continuous querying</article-title>
          .
          <source>In Proceedings of the 18th international conference on World Wide Web</source>
          , pages
          <fpage>1061</fpage>
          –
          <lpage>1062</lpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Della</given-names>
            <surname>Valle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grossniklaus</surname>
          </string-name>
          .
          <article-title>Continuous queries and real-time analysis of social semantic data with c-sparql</article-title>
          .
          <source>In Proceedings of Social Data on the Web Workshop at the 8th International Semantic Web Conference</source>
          , volume
          <volume>10</volume>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Della</given-names>
            <surname>Valle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grossniklaus</surname>
          </string-name>
          .
          <article-title>Incremental reasoning on streams and rich background knowledge</article-title>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grossniklaus</surname>
          </string-name>
          .
          <article-title>An execution environment for C-SPARQL queries</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Extending Database Technology</source>
          , pages
          <fpage>441</fpage>
          –
          <lpage>452</lpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Valle</surname>
          </string-name>
          .
          <article-title>A proposal for publishing data streams as linked data</article-title>
          .
          <source>In Linked Data on the Web Workshop</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          , et al.
          <article-title>The semantic web</article-title>
          .
          <source>Scientific American</source>
          ,
          <volume>284</volume>
          (
          <issue>5</issue>
          ):
          <fpage>28</fpage>
          –
          <lpage>37</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          , et al.
          <article-title>How to publish linked data on the web</article-title>
          .
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Blanch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guiard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Beaudouin-Lafon</surname>
          </string-name>
          .
          <article-title>Semantic pointing: improving target acquisition with control-display ratio adaptation</article-title>
          .
          <source>In Proceedings of the SIGCHI conference on Human factors in computing systems</source>
          , pages
          <fpage>519</fpage>
          –
          <lpage>526</lpage>
          . ACM,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grawunder</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Jacobi</surname>
          </string-name>
          .
          <article-title>Streaming SPARQL-extending SPARQL to process data streams</article-title>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hayes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Stickler</surname>
          </string-name>
          .
          <article-title>Named graphs, provenance and trust</article-title>
          .
          <source>In Proceedings of the 14th international conference on World Wide Web</source>
          , pages
          <fpage>613</fpage>
          –
          <lpage>622</lpage>
          . ACM,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Cugola</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Margara</surname>
          </string-name>
          .
          <article-title>Processing flows of information: From data stream to complex event processing</article-title>
          .
          <source>ACM Computing Surveys (CSUR)</source>
          ,
          <volume>44</volume>
          (
          <issue>3</issue>
          ):
          <fpage>15</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>E.</given-names>
            <surname>Della Valle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braga</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Campi</surname>
          </string-name>
          .
          <article-title>A first step towards stream reasoning</article-title>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Della Valle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Van Harmelen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          .
          <article-title>It's a streaming world! Reasoning upon rapidly changing information</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          , (
          <issue>6</issue>
          ):
          <fpage>83</fpage>
          –
          <lpage>89</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>E.</given-names>
            <surname>Della Valle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schlobach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bozzon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Order matters! Harnessing a world of orderings for reasoning over massive data</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>4</volume>
          (
          <issue>2</issue>
          ):
          <fpage>219</fpage>
          –
          <lpage>231</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dell'Aglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Calbimonte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Balduini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. Della</given-names>
            <surname>Valle</surname>
          </string-name>
          .
          <article-title>On correctness in rdf stream processor benchmarking</article-title>
          .
          <source>In The Semantic Web–ISWC</source>
          <year>2013</year>
          , pages
          <fpage>326</fpage>
          –
          <lpage>342</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Finin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Da Silva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          .
          <article-title>Tracking rdf graph provenance using rdf molecules</article-title>
          .
          <source>In Proc. of the 4th International Semantic Web Conference (Poster)</source>
          ,
          page
          <fpage>42</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Heflin</surname>
          </string-name>
          .
          <article-title>LUBM: A benchmark for OWL knowledge base systems</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          ,
          <volume>3</volume>
          (
          <issue>2</issue>
          ):
          <fpage>158</fpage>
          –
          <lpage>182</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Le-Phuoc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dao-Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-D.</given-names>
            <surname>Pham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Fink</surname>
          </string-name>
          .
          <article-title>Linked stream data processing engines: Facts and figures</article-title>
          .
          <source>In The Semantic Web–ISWC</source>
          <year>2012</year>
          , pages
          <fpage>300</fpage>
          –
          <lpage>312</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>F.</given-names>
            <surname>Lecue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kotoulas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Mac Aonghusa</surname>
          </string-name>
          .
          <article-title>Capturing the pulse of cities: Opportunity and research challenges for robust stream data reasoning</article-title>
          .
          <source>In Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>P.</given-names>
            <surname>Legg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Parry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
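          <string-name>
            <given-names>I. W.</given-names>
            <surname>Griffiths</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Marshall</surname>
          </string-name>
          ,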
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          , et al.
          <article-title>Intelligent filtering by semantic importance for single-view 3D reconstruction from snooker video</article-title>
          .
          <source>In Image Processing (ICIP), 2011 18th IEEE International Conference on</source>
          , pages
          <fpage>2385</fpage>
          –
          <lpage>2388</lpage>
          . IEEE,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Van Harmelen</surname>
          </string-name>
          , et al.
          <article-title>OWL web ontology language overview</article-title>
          .
          <source>W3C recommendation</source>
          ,
          <volume>10</volume>
          (
          <issue>10</issue>
          ):
          <year>2004</year>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>L.</given-names>
            <surname>Sauermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Volkel</surname>
          </string-name>
          .
          <article-title>Cool URIs for the semantic web</article-title>
          .
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>M.</given-names>
            <surname>Stonebraker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Cetintemel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Zdonik</surname>
          </string-name>
          .
          <article-title>The 8 requirements of real-time stream processing</article-title>
          .
          <source>ACM SIGMOD Record</source>
          ,
          <volume>34</volume>
          (
          <issue>4</issue>
          ):
          <fpage>42</fpage>
          –
          <lpage>47</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tallevi-Diotallevi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kotoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Foschini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lecue</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Corradi</surname>
          </string-name>
          .
          <article-title>Real-time urban monitoring in Dublin using semantic and stream technologies</article-title>
          .
          <source>In The Semantic Web – ISWC 2013</source>
          , pages
          <fpage>178</fpage>
          –
          <lpage>194</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>E.</given-names>
            <surname>Thomas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ren</surname>
          </string-name>
          .
          <article-title>TrOWL: Tractable OWL 2 reasoning infrastructure</article-title>
          .
          <source>In The Semantic Web: Research and Applications</source>
          , pages
          <fpage>431</fpage>
          –
          <lpage>435</lpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>W. P.</given-names>
            <surname>Smith</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Borkum</surname>
          </string-name>
          .
          <article-title>Semantically enabling stream-reasoning architectural frameworks</article-title>
          .
          <source>Abstract submitted to The 24th ACM International Conference on Information and Knowledge Management</source>
          , Melbourne, Australia.,
          <year>2015</year>
          . PNNL-SA-110228
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Duc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Calbimonte</surname>
          </string-name>
          .
          <article-title>SRBench: A streaming RDF/SPARQL benchmark</article-title>
          .
          <source>In The Semantic Web – ISWC 2012</source>
          , pages
          <fpage>641</fpage>
          –
          <lpage>657</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhuge</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Zheng</surname>
          </string-name>
          .
          <article-title>Ranking Semantic-linked Network</article-title>
          .
          <source>In WWW (Posters)</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>