<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the di erence between Complex Event Processing and Dynamic Query Evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mart n Ugarte</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stijn Vansummeren</string-name>
          <email>stijn.vansummereng@ulb.ac.be</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The increasing number of applications that require processing and analyzing high-throughput information in real time has fostered a new eld of study known as Complex Event Processing (CEP). The claimed objective of CEP is to develop techniques that are able to cope with the high-throughput requirements of modern Big Data applications. Also, it is commonly argued that CEP systems are di erent from relational reactive systems such as Active Database Management Systems (ADBMSs) or Data Stream Management Systems (DSMSs) because the latter see new data elements as database transactions (generally insertions or deletions) whose order is not relevant. On the contrary, CEP systems see new data elements as events (e.g. sensor measurements) whose arrival time and order have a semantic meaning. Unfortunately, these di erences come from a high-level description of the underlying applications, but do not reveal fundamental computational di erences between the requirements of CEP systems and relational systems. Moreover, recent developments in Dynamic Query Evaluation (DQE) show that general techniques in the relational setting can be more e cient than current CEP algorithms. In this paper, we study whether there is a fundamental di erence between the computational requirements of CEP and DQE. To answer this question, we identify two concrete assumptions of CEP and investigate their e ects in terms of evaluation complexity. Concretely, we show a realistic CEP query that, if the Online Matrix-vector multiplication (OMv) conjecture holds, cannot be evaluated with sub-linear time per tuple followed by sub-linear-delay enumeration, but under the CEP assumptions can be evaluated with constant time per tuple followed by constant-delay enumeration. Sub-linear here means O(n1 "), where n is the size of the active domain.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        In recent years, analyzing high-throughput streams in real time has become a
crucial task for many communities. Examples range from Stream Processing [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]
and Financial Systems [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to Industrial Control Systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and On-line Machine
Learning [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. The diversity of these communities and the nature of their
applications has resulted in a variety of frameworks and systems to analyze data
streams. These systems fall in di erent categories (see [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for an extensive
survey), out of which Complex Event Processing and Dynamic Query Evaluation
are two prominent examples.
      </p>
      <p>
        The term Dynamic Query Evaluation (DQE) refers to techniques whose aim
is to e ciently maintain an up-to-date version of a query result under updates
to an underlying database. DQE is usually based on a form of Incremental View
Maintenance (IVM) [
        <xref ref-type="bibr" rid="ref18 ref3 ref6 ref7">3, 6, 7, 18</xref>
        ], whose objective is to maintain a materialized
version of query results under updates by applying relational operations over
cached results and sub-results. In particular, in DQE the input is a stream of
database transactions that can insert as well as delete tuples. Systems based
on DQE nd their origins in Active Database Management Systems (ADBMS)
and Data Stream Management Systems (DSMS), that were designed to execute
relational operations every time a transaction was executed in a database [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Complex Event Processing (CEP) refers to techniques where the objective
is to detect certain temporal patterns in a high-throughput data stream.
Therefore, elements in the input stream are not seen as database transactions but as
events, that represent real-world observations and that thus cannot be retracted.
Moreover, the order at which these events arrive to the system has a semantic
meaning; CEP patterns commonly express sequences based on this temporal
order. CEP and DQE are usually considered very di erent as is perhaps best
illustrated by Cugola and Margara who, in their well-known survey [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], argue
that \The concepts of timeliness and ow processing are crucial for justifying
the need for a new class of systems". In line with this di erence, specialized
CEP systems were developed to deal with the throughput requirements of
applications scenarios involving CEP patterns. Prominent examples of CEP systems
are SASE [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], Cayuga [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], EsperTech [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], TESLA/T-Rex [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], RTEC [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
among others. While CEP and DQE have hence historically been considered
di erent in the literature, this perceived di erence is not based on fundamental
computational properties, but rather on a high-level description of the targeted
applications of CEP and DQE. Several recent trends make the perceived
difference between CEP and DQE rather shallow. First of all, CEP and DQE
application scenarios seem to intermingle. For this reason, modern distributed
stream processing systems (such as Apache Flink1, Spark Streaming2 or Apache
Storm3) mingle both incremental query maintenance capabilities and CEP
features like time windows or time-based semantics. Second, DQE has received new
attention in the last couple of years [
        <xref ref-type="bibr" rid="ref15 ref21">15,21</xref>
        ] based on a simple yet important idea:
instead of maintaining a materialized version of the query results, maintain a
data structure that can be updated e ciently under changes to the underlying
database, and from which the query results can be e ciently generated. Recent
work shows that relational techniques based on this idea can be more e cient
than the CEP systems cited above [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
1 https:// ink.apache.org
2 https://spark.apache.org/streaming/
3 https://storm.apache.org
      </p>
      <p>This hence raises the question: is there a fundamental computational
difference between CEP and DQE ? In order to answer this question, we take a
fundamental viewpoint, and de ne CEP systems as systems in which the input
is a stream of tuples. In line with the fact that events cannot be retracted, the
stream in insertion-only. Moreover, in line with the fact that arrival order is
important, we require that every tuple contains a special attribute that represents
the arrival time to the system. The value of this attribute is globally increasing
across the stream. In contrast, DQE systems take as input a stream of both
insertions and deletions with no further assumption on the attributes.</p>
      <p>Concretely, we answer the question by showing a query that can be answered
e ciently under the CEP assumptions, but not under the DQE assumptions.
The notion of e ciency here corresponds to process each new event in constant
time (under data complexity) and be able to output the results at any moment
with constant-delay enumeration.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>Tuples, databases, predicates and streams. We assume the existence of
three disjoint in nite sets: A domain D, a set of variables fx; y; z; : : :g (which
play the role of attribute names), and a set of relation names fr; s; t; : : :g.</p>
      <p>
        An atom is an expression of the form r(x), where r is a relation name and
x is a nite set of variables. A tuple over r(x) (or simply over x) is a function
t : x ! D. A relation R over r(x) is a nite set of tuples over r(x). A schema S is
a nite set of atoms with di erent relation names, and a database D over schema
S is a set of relations, one over each atom of S. A stream over schema S is a
possibly in nite sequence S = (S[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]; S[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]; : : :), where each S[i] is a pair (ti; i)
consisting of a tuple ti over any of the atoms in S and a symbol i 2 f+; g
that represents whether S[i] is an insertion or a deletion. A stream is called
insertion-only if all of its elements are insertions. We might abuse notation and
assume the elements of an insertion-only stream are tuples. Given a stream S,
the sub-stream (S[i]; : : : ; S[j]) is denoted by Si;j , and the database generated by
S up to position i, denoted by S[1::i], is the database consisting of all tuples t
that occur in S up to position i (inclusive), and whose last occurrence in S1;i is
an insertion (i.e. (t; +)).
      </p>
      <p>
        Finally, a predicate over x is a (not necessarily nite) decidable set P of tuples
over x. For example, if x = fx; yg then P (x) = f(x; y) j x &lt; yg is the predicate of
all tuples (x0; y0) 2 D2 satisfying x0 &lt; y0. As usual, we use shorthand notation
for predicates (e.g. P (x) = x &lt; y) and write P (t) to indicate that t 2 P .
Query Language. We follow the query language of Generalized Conjunctive
Queries (GCQs) introduced in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>De nition 1. A GCQ is an expression of the form
y r1(x1) on</p>
      <p>m
on rn(xn) j ^ Pi(zi)
where r1(x1); : : : ; rn(xn) are atoms, P1; : : : ; Pm are predicates over z1; : : : ; zm,
respectively, and both y and Sim=1 zi are subsets of Sin=1 xi.</p>
      <p>
        Given a database D and a GCQ Q of the form (1), the evaluation of Q over
D is denoted Q(D) and is performed in the expected way: rst compute the
natural join r1(x1) on on rn(xn), from the result keep only those tuples that
satisfy all predicates P1; : : : ; Pm, and nally project them over y. We illustrate
the semantics of GCQs with an example, for a full de nition see [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
Example 1. Consider the following GCQ.
      </p>
      <p>
        x;y;ts3 r(x; ts1) on s(x; y; ts2) on t(y; ts3) j ts1 &lt; ts2 &lt; ts3 ^(ts3 ts1) &lt; 5) (2)
Intuitively, the query speci es that we should take the natural join between
r(x; ts1), s(x; y; ts2) and t(y; ts3), from this result keep only those tuples that
satisfy ts1 &lt; ts2 &lt; ts3 and (ts3 ts1) &lt; 5, and nally project over fx; y; ts3g.
From the de nitions of databases, streams and the semantics of GCQs it can be
seen that we use set semantics. We stress, however, that this is only for simplicity,
and the results in this paper can be extended easily to bag semantics.
GCQs for Complex Event Processing. It has been claimed in the past that
CEP languages are generally informal and non-standard [
        <xref ref-type="bibr" rid="ref10 ref11 ref26">10,11,26</xref>
        ], presenting a
variety of primitives and operators with sometimes problematic semantics.
Nevertheless, GCQs capture some fundamental requirements of CEP. For example,
in CEP frameworks users can specify that events must occur in a particular
order and inside a bounded time window. Although this is not directly provided
by built-in features of GCQs, it can be simulated by adding a globally-increasing
attribute to tuples and using inequality predicates. For example, in query (2)
we could interpret ts1, ts2 and ts3 as the time of arrival of each event. Then,
this query would select three events of type r, s and t that arrived in that same
order and inside a time window of 5 units of time (we assume seconds).
      </p>
      <p>
        In contrast to GCQs, CEP languages usually have sequencing and time
windows as primitives. For example, under the assumption that ts1, ts2 and ts3
represent arrival order, query (2) can be written equivalently in SASE [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] as
      </p>
      <p>SEQ(r,s,t) WHERE r.1 = s.1 AND s.2 = t.1 WITHIN 5 seconds:
The SEQ operator indicates that the events must arrive in the expected order and
the WITHIN clause that they have to occur within a time window of 5 seconds,
avoiding the explicit inequalities over the timestamps. The WHERE clause indicates
that the rst attribute of r must be equal to the rst attribute of s and the second
attribute of s must be equal to the rst attribute of t, which is equivalent to
using variable names in GCQs.</p>
      <p>For the purposes of this paper, we only require the CEP features of
sequencing, time windows and ltering; it is not relevant that other features of CEP
languages like iteration (Kleene closure) are not expressible by means of GCQs.</p>
    </sec>
    <sec id="sec-3">
      <title>Evaluating Queries over Dynamic Databases</title>
      <p>
        Since we are interested in evaluating GCQs over streams, we need to formally
de ne what evaluation means. Traditionally, the objective of DQE algorithms
like IVM has been to maintain the current version of the result materialized
at all times. Therefore, the processing of an update includes the corresponding
modi cation of the materialized result. In contrast, recent DQE algorithms do
not maintain the full materialized output, but a data structure from which the
result can be generated [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The idea behind this is to separate the enumeration
of results from the process of updating the data structure when the underlying
database changes. Note that this implies that the time required to react to an
update can be much shorter than the size of the change to the output, because the
data structure can be a compressed representation of the output. This is actually
the main reason behind the e ciency gain in recent DQE algorithms [
        <xref ref-type="bibr" rid="ref15 ref16 ref19">15, 16, 19</xref>
        ].
      </p>
      <p>When evaluating a query Q over a stream S, the elements of S are consumed
one by one in order. Once the ith element of S is processed, we expect to be able
to generate Q(S[1::i]) (recall that S[1::i] is the database produced by S up to
position i). Following this, a DQE algorithm A for Q consists of a data structure
DA and two routines PROCESSA and ENUMA such that for every stream S:
{ when PROCESSA receives an element of the stream, it can modify DA and;
{ after PROCESSA has been called with the rst i elements of the stream, ENUMA
enumerates Q(S[1::i]) from DA.</p>
      <p>
        Complexity. We consider query evaluation in main memory and measure time
under data complexity [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. That is, the query is considered to be xed and
not part of the input. This makes sense under DQE, where the query is known
in advance and the data is constantly changing. In particular, the number of
atoms and predicates in the query, their arity, and the length of the query are
all constant. Also, note that we need to measure result enumeration and update
processing separately. Update processing is simply measured as the worst-case
time required to process an element from the stream, i.e. the worst-case time
required by PROCESSA. Formally, given a function f from streams to natural
numbers, we say that a DQE algorithm A provides f update-time if, for every
stream S and i 2 N, PROCESSA takes time O(f (S1;i)) to process S[i].
      </p>
      <p>
        The e ciency of enumeration is slightly more involved, since it is measured
as the delay that the enumeration procedure takes between generating two
consecutive results [
        <xref ref-type="bibr" rid="ref22 ref5">5, 22</xref>
        ]. This is de ned as follows.
      </p>
      <p>De nition 2. A routine ENUM with access to a data structure D supports
enumeration of a set E if ENUM(D) outputs each element of E exactly once.
Moreover, ENUM(D) enumerates E with delay d 2 N, if when ENUM(D) is invoked, the
time until the output of the rst element; the time between any two consecutive
elements; and the time between the output of the last element and the termination
of ENUM(D), are all bounded by d.</p>
      <p>Given a function f from streams to natural numbers, we say that a DQE
algorithm A for query Q provides f -delay enumeration if for every stream S
and every i 2 N, after PROCESSA has processed the rst i elements of S, ENUMA
enumerates Q(S[1::i]) with delay O(f (S1;i)). If f is a constant function, A is
said to provide constant-delay enumeration (CDE) of Q(S[1::i]).
Computational Model. Another important consideration when studying the
e ciency of DQE algorithms is the computational model. Indeed, when dealing
with ne-grained notions like constant-time per update or CDE, the
computational model might a ect the complexity.</p>
      <p>
        We follow the extended RAM model from [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Concretely, we assume a model
of computation where domain values take O(1) space and memory lookups are
O(1). We further assume that every relation R can be represented by a data
structure that allows (1) enumeration of all tuples in R with constant delay (i.e.
the existence of a routine as de ned above); (2) lookups in O(1), meaning that
given t we can decide in constant time whether t 2 R; (3) single-tuple insertions
and deletions in O(1) time; while (4) having size that is proportional to the
number of tuples in R.
      </p>
      <p>
        We further assume the existence of linear-size hash-maps. Concretely a hash
map H over a set of variables x can store an arbitrary data structure H(t) for
each tuple t over x, using size proportional to the sum of the sizes of the data
structures stored. Moreover, given a tuple t we can access H(t) in constant time.
These assumptions amount to perfect hashing of linear size [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Although this
is not realistic for practical computers [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], it is well known that complexity
results for this model can be translated, through amortized analysis, to average
complexity in real-life implementations [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
4
      </p>
    </sec>
    <sec id="sec-4">
      <title>A Computational Di erence</title>
      <p>As mentioned in the introduction, the di erences between DQE and CEP
generally come from high-level descriptions of the targeted applications, and not from
fundamental di erences in their processing techniques. It makes sense then to
discuss whether CEP really requires new computational insights for achieving
better performance, or DQE algorithms are already optimal for the requirements
of CEP. To understand this, we rst need to identify the concrete requirements
and restrictions that distinguish CEP from DQE. As discussed in the
introduction, CEP adds to the dynamic evaluation problem (1) the requirement of
selecting tuples based on their arrival times and (2) the restriction that witnessed
tuples correspond to real-world events and thus cannot be retracted. This can
be concretely re ected in GCQ evaluation by
1. interpreting some attributes as timestamps and assuming that they occur in
a globally increasing manner, and
2. assuming that streams are insertion-only.</p>
      <p>We show that taking these assumptions into account can make a di erence
in terms of complexity of DQE algorithms. Concretely, we prove that the GCQ
Q =
y;ts3 (r(x; ts1) on s(x; y; ts2) on t(y; ts3) j ts1 &lt; ts2 &lt; ts3 ^ (ts3
ts1) &lt; 5)
can be evaluated with constant update time followed by CDE when restricted
to streams that satisfy the two assumptions above, but not in general.</p>
      <p>
        We start by presenting the hardness result. We show that there is no DQE
algorithm that evaluates Q over streams with sub-linear update time followed
by sub-linear-delay enumeration. Here, sub-linear means O(n1 "), where n is
the number of elements in the active domain of the database produced by
the processed portion of the stream. We use the dichotomy proved recently
by Berkholz et. al. in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which is based on the Online Matrix-vector
Multiplication (OMv) conjecture (see [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]). The full description of the dichotomy is
rather involved, but it implies that a if a CQ is not q-hierarchical then it cannot
be evaluated over dynamic datasets with sub-linear time per update followed
by sub-linear-delay enumeration. For the sake of space, we do not give here
the formal de nition of q-hierarchical queries, but only state that the query
Q0 = y(u(x) on v(x; y) on w(y)) is not q-hierarchical.
      </p>
      <p>Lemma 1. If the OMv conjecture holds, there is no DQE algorithm for
Q =
y;ts3 (r(x; ts1) on s(x; y; ts2) on t(y; ts3) j ts1 &lt; ts2 &lt; ts3 ^ (ts3
ts1) &lt; 5)
with sub-linear time per update followed by sub-linear-delay enumeration.
Proof (Sketch). We prove this by contradiction. Assume that there is a DQE
algorithm A that evaluates Q with sub-linear update time and sub-linear-delay
enumeration. We use this algorithm to de ne a second algorithm A0 that
evaluates query Q0 = y(u(x) on v(x; y) on w(y)) with the same bounds. First, the
data structure DA0 is exactly the same as the data structure DA. Upon the
arrival of an element (t; ), the routine PROCESSA0 will simply call PROCESSA(u; ),
where u is de ned as follows: if t = u(x0), then u = r(x0; 1); if t = v(x0; y0),
then u = s(x0; y0; 2); if t = w(y0), then u = t(y0; 3). Since PROCESSA(u; )
takes sub-linear time, it is clear that PROCESSA0 (t; ) also takes sub-linear time.
Finally, the routine ENUMA0 is equivalent to ENUMA except for the fact that it
needs to project out ts3. It is clear that this process will generate the expected
output with sub-linear-delay enumeration: the lters will be satis ed because ts1
is always 1, ts2 is always 2 and ts3 is always 3. Moreover, enumeration is correct
and without duplicates (even though we project away ts3) exactly because ts3
is always 3. We hence have our desired contradiction as we have constructed a
DQE algorithm with sub-linear update time followed by sub-linear-delay
enumeration that evaluates Q0, which is not q-hierarchical.
tu</p>
      <p>Next, we show a DQE algorithm A that evaluates Q with constant time per
update followed by CDE under the assumptions that streams are insertion-only
and ts1, ts2 and ts3 arrive in a globally increasing fashion. We start by presenting
the data structure DA, to then describe how PROCESSA maintains it. After the
rst i elements of S are processed, DA contains three elements.
{ A hash-map HR that maps each x to the maximum ts1 for which r(x; ts1) 2</p>
      <p>S1;i. Intuitively, HR(x) is the last time that x occurred in an r tuple.
{ A hash-map HS mapping each y to a timestamp. The timestamp HS (y) is
the largest ts1 for which there exist tuples r(x0; ts1) and s(x0; y; ts2) in S1;i
satisfying ts1 &lt; ts2.
{ A relation T over fy; ts3g that contains the actual output Q(S[1::i]).</p>
      <p>Having access to DA the enumeration procedure is trivial, since the output
is already materialized in relation T of DA. We proceed then to describe the
update process.</p>
      <p>Update processing. The objective of PROCESSA, shown in Algorithm 1, is to
maintain all elements of DA consistent with the de nitions given above.
Whenever a new tuple t of type r(x; ts1) arrives, it su ces to insert ts1 as the last
timestamp of x (Line 3). Note that we can only do this because we assume that
timestamps arrive in a globally increasing fashion. As such, the value of ts1 is the
maximum across all timestamps seen so far. When a tuple t = s(x; y; ts2) arrives,
we need to check if the x value has already occurred in an r tuple (Line 5), and
if so, update the value of HS (y) to HR(x). Again, this maintains the consistency
because we know that ts2 is larger than HR(x). Finally, when a new tuple t of
type t(y; ts3) arrives, we check if we need to insert it into T . To this end, we
only need to check the value HS (y), as this value tells us if there are matching
tuples of type r and s in the required time window (Line 8).</p>
      <p>It is clear that PROCESSA runs in constant time under our computational
model. Also, because we assume that ts1, ts2 and ts3 are globally increasing, it
is easy to see that PROCESSA correctly maintains the elements of DA.
Theorem 1. There exists a DQE algorithm that evaluates Q with constant time
per tuple and CDE under the assumption that streams are insertion-only and that
attributes ts1, ts2 and ts3 are globally increasing.</p>
      <p>In this section we have shown a computational di erence between CEP and
DQE, giving a concrete motivation for studying techniques that di er from those
used in the relational setting. We conclude by further discussing this di erence
and providing insight into the particular aspects that might make a query easy
to compute in a CEP setting.</p>
    </sec>
    <sec id="sec-5">
      <title>Discussion</title>
      <p>
        We have presented the di erence between CEP and DQE by means of a very
particular query. We decided to present this query because (1) it features both
sequencing and time windows, making it realistic for CEP (2) the algorithm
to evaluate it with constant time per update followed by CDE allows for a
clear presentation. Nevertheless, one possible criticism is that this query is too
simple, because every new event can generate at most one output tuple, and
therefore constant-delay enumeration is easy to achieve by materializing the
output (which is actually what we do). This is a very valid argument; queries for
which a single tuple can generate multiple outputs are indeed harder to evaluate.
One possibility to present such a query would have been to add variable x to
the output of Q (as presented in Query (2) of Example 1). In this case, we can
still improve the evaluation performance by using dynamic fractional cascading
techniques [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], which provide logarithmic update-time followed by
logarithmicdelay enumeration. This is still sub-linear in the active domain and therefore an
improvement over the general case, but the presentation of this result would have
been far more involved. We are not aware of constant-time per update followed
by CDE in this case. It is also important to notice that we could have presented
a simpler query by removing one of the CEP features. For example, if we remove
from Q the time window (ts3 ts1 &lt; 5), we obtain a simpler DQE algorithm
with the same performance guarantees.
      </p>
      <p>Apart from structural properties of the query, we can also look at the CEP
assumptions. At the moment, we do not know whether both assumptions
(insertiononly streams and globally-sorted timestamps) are required, nor how is complexity
a ected if we remove each of them separately. We envision this and a general
understanding of the query properties that a ect complexity as future work.</p>
      <p>
        Finally, in CEP it is common to distinguish between push-based and
pullbased systems (see for example [
        <xref ref-type="bibr" rid="ref11 ref16">11, 16</xref>
        ]). Intuitively, in a pull-based setting the
system must be ready to generate the complete answer Q(S[1::i]) whenever a
user asks for it. This corresponds to the semantics presented in this paper. On
the contrary, when element S[i] arrives to a push-based system, the system must
inform the user of the output's delta, i.e. the di erence between Q(S[1::i]) and
Q(s[1::i 1]). Interestingly, for the case of push-based systems we can evaluate
Query (2) of Example 1 with logarithmic update time followed by constant-delay
enumeration (of the output's delta). Moreover, if we take out the time window
lter but keep x in the output, we can provide constant-time per update followed
by CDE in the push-based case, but we are not aware of an algorithm achieving
these times in a pull-based setting. All of these insights suggest that push-based
evaluation is simpler than pull-based evaluation; we also envision an in-depth
study of this di erence as future work.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Automation</surname>
          </string-name>
          ,
          <string-name>
            <surname>Production Systems</surname>
          </string-name>
          , and Computer-Integrated Manufacturing.
          <source>Prentice Hall PTR, 2nd edition</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Corporate</given-names>
            <surname>Act-Net Consortium</surname>
          </string-name>
          .
          <article-title>The active database management system manifesto: A rulebase of adbms features</article-title>
          .
          <source>SIGMOD Rec</source>
          .,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Yanif</given-names>
            <surname>Ahmad</surname>
          </string-name>
          , Oliver Kennedy, Christoph Koch, and
          <string-name>
            <given-names>Milos</given-names>
            <surname>Nikolic</surname>
          </string-name>
          . Dbtoaster:
          <article-title>Higher-order delta processing for dynamic, frequently fresh views</article-title>
          .
          <source>VLDB</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Artikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marek J.</given-names>
            <surname>Sergot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Paliouras</surname>
          </string-name>
          .
          <article-title>An event calculus for event recognition</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Berkholz</surname>
          </string-name>
          , Jens Keppeler, and
          <string-name>
            <given-names>Nicole</given-names>
            <surname>Schweikardt</surname>
          </string-name>
          .
          <article-title>Answering conjunctive queries under updates</article-title>
          .
          <source>In PODS</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Ceri</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>Deriving production rules for incremental view maintenance</article-title>
          .
          <source>In VLDB</source>
          ,
          <year>1991</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Rada</given-names>
            <surname>Chirkova</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Yang</surname>
          </string-name>
          . Materialized Views. Now Publishers Inc.,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>T.H.</given-names>
            <surname>Cormen</surname>
          </string-name>
          .
          <article-title>Introduction to Algorithms, 3rd Edition:</article-title>
          . MIT Press,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Graham</given-names>
            <surname>Cormode</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S</given-names>
            <surname>Muthukrishnan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Wei</given-names>
            <surname>Zhuang</surname>
          </string-name>
          .
          <article-title>Conquering the divide: Continuous clustering of distributed data streams</article-title>
          .
          <source>In ICDE</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Gianpaolo</given-names>
            <surname>Cugola</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Margara</surname>
          </string-name>
          .
          <article-title>Tesla: a formally de ned event specication language</article-title>
          .
          <source>In DEBS</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Gianpaolo</given-names>
            <surname>Cugola</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Margara</surname>
          </string-name>
          .
          <article-title>Processing ows of information: From data stream to complex event processing</article-title>
          .
          <source>ACM CSUR</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Alan</surname>
            <given-names>Demers</given-names>
          </string-name>
          , Johannes Gehrke, Mingsheng Hong, Mirek Riedewald,
          <string-name>
            <given-names>and Walker</given-names>
            <surname>White</surname>
          </string-name>
          .
          <article-title>Towards expressive publish/subscribe systems</article-title>
          .
          <source>In EDBT</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. EsperTech.
          <article-title>Esper complex event processing engine</article-title>
          . http://www.espertech.com/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Monika</surname>
            <given-names>Henzinger</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Krinninger</surname>
          </string-name>
          , Danupon Nanongkai, and
          <string-name>
            <given-names>Thatchaphol</given-names>
            <surname>Saranurak</surname>
          </string-name>
          .
          <article-title>Unifying and strengthening hardness for dynamic problems via the online matrix-vector multiplication conjecture</article-title>
          .
          <source>In STOC</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Muhammad</surname>
            <given-names>Idris</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Ugarte</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Stijn</given-names>
            <surname>Vansummeren</surname>
          </string-name>
          .
          <article-title>The dynamic yannakakis algorithm: Compact and e cient query processing under updates</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Muhammad</surname>
            <given-names>Idris</given-names>
          </string-name>
          , Martin Ugarte, Stijn Vansummeren, Hannes Voigt, and
          <string-name>
            <given-names>Wolfgang</given-names>
            <surname>Lehner</surname>
          </string-name>
          .
          <article-title>Conjunctive queries with inequalities under updates</article-title>
          .
          <source>In VLDB</source>
          ,
          <year>2018</year>
          . To appear.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <article-title>Kurt Mehlhorn and Stefan Naher. Dynamic fractional cascading</article-title>
          .
          <source>Algorithmica</source>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Milos</surname>
            <given-names>Nikolic</given-names>
          </string-name>
          , Mohammad Dashti, and
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Koch</surname>
          </string-name>
          .
          <article-title>How to win a hot dog eating contest: Distributed incremental view maintenance with batch updates</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Milos</given-names>
            <surname>Nikolic</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Olteanu</surname>
          </string-name>
          .
          <article-title>Incremental view maintenance with triple lock factorisation bene ts</article-title>
          .
          <source>In SIGMOD</source>
          <year>2018</year>
          ,
          <year>2018</year>
          . To appear.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Christos</surname>
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Papadimitriou</surname>
          </string-name>
          .
          <article-title>Computational complexity</article-title>
          .
          <source>In Encyclopedia of Computer Science</source>
          , pages
          <volume>260</volume>
          {
          <fpage>265</fpage>
          .
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Maximilian</surname>
            <given-names>Schleich</given-names>
          </string-name>
          , Dan Olteanu, and
          <string-name>
            <given-names>Radu</given-names>
            <surname>Ciucanu</surname>
          </string-name>
          .
          <article-title>Learning linear regression models over factorized joins</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Luc</surname>
          </string-name>
          <article-title>Segou n. Constant delay enumeration for conjunctive queries</article-title>
          .
          <source>SIGMOD Rec</source>
          .,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Shai</surname>
          </string-name>
          Shalev-Shwartz.
          <article-title>Online learning and online convex optimization</article-title>
          .
          <source>Found. Trends Mach. Learn., (2):</source>
          <volume>107</volume>
          {
          <fpage>194</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Michael</surname>
            <given-names>Stonebraker</given-names>
          </string-name>
          , Ugur Cetintemel, and
          <string-name>
            <given-names>Stan</given-names>
            <surname>Zdonik</surname>
          </string-name>
          .
          <article-title>The 8 requirements of real-time stream processing</article-title>
          .
          <source>SIGMOD Rec</source>
          .,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Moshe</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>The complexity of relational query languages (extended abstract)</article-title>
          .
          <source>In Proc. of STOC</source>
          ,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Haopeng</surname>
            <given-names>Zhang</given-names>
          </string-name>
          , Yanlei Diao, and
          <string-name>
            <given-names>Neil</given-names>
            <surname>Immerman</surname>
          </string-name>
          .
          <article-title>On complexity and optimization of expensive queries in complex event processing</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>