<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards a Semiring for Continuous Query Provenance</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sebastian Labbe</string-name>
          <email>sebastien.labbe@ens.psl.eu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Samuele Langhi</string-name>
          <email>samuele.langhi@univ-lyon1.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Angela Bonifati</string-name>
          <email>angela.bonifati@univ-lyon1.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riccardo Tommasini</string-name>
          <email>riccardo.tommasini@insa-lyon.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Workshop</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AMW 2024: 16th Alberto Mendelzon International Workshop on Foundations of Data Management</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ENS Ulm PSL</institution>
          ,
          <addr-line>Paris</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>INSA Lyon</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Lyon 1 University</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>Data provenance encompasses tracking data's origins, transformations, and derivations within a system. Provenance semirings, a mathematical framework used mainly in databases, play a key role in representing and managing provenance information and ofering a structured approach to handle complex provenance queries, such as why-provenance and how-provenance, by encapsulating the essential properties of these models within their algebraic structure. In streaming scenarios, however, such models are not enough to describe the full spectrum of data provenance. Indeed, we need to account for the time dimension in streaming since data is continuously generated and processed in real time. For this reason, we study data provenance under a novel perspective, introducing When-provenance, which aims to describe the origin of the data over the time dimension and provide insights into the temporal dynamics of data transformations. This includes identifying timestamps of data generation, processing intervals, and the timing of query execution stages within defined time windows while minimizing the related performance overhead. Our contribution aims to define a when-provenance formalism to allow modification of a stream query window operator in sublinear time complexity in the input size to allow real-time, enhanced debugging, performance optimization, and audibility of streaming systems. It can also ensure compliance with temporal constraints and policies. By integrating When-provenance mechanisms, streaming query systems can achieve robust and transparent temporal tracking, leading to greater reliability and deeper insights into the temporal behaviour of data streams.</p>
      </abstract>
      <kwd-group>
        <kwd>provenance semirings</kwd>
        <kwd>stream provenance</kwd>
        <kwd>stream processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Data provenance is essential in several areas of data management, such as scientific data processing or
consistent query answering. Given the growing need for processing data in real-time, recent studies
have investigated data provenance models for streaming data and continuous queries [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Such works
on streaming data study the problems of eficient annotation propagation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], explaining missing
answers [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] of continuous queries, or maintaining the query results upon rapid changes [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However,
the characterization of the temporal aspect of continuous queries was neglected. Notably, window
operators are critical in defining continuous query semantics. Window operators address the data
unboundedness problems by dividing the infinite input streams into finite portions based on temporal
conditions.
      </p>
      <p>
        Nevertheless, the stream processing community has long struggled to determine the appropriate
window for continuous queries. This task requires prior knowledge about the data to reduce the risk of
unwanted approximation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Ideally, a given continuous query over the whole stream to obtain the
most complete results. Naturally, this is not possible, as the query is limited by the working memory size.
Moreover, windowing introduces non-monotonicity in the query results [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and non-determinism [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
Given such challenges, it is critical to empower users of stream processing systems with a theoretical
framework that can help explain the evaluation of a window-based continuous query and potentially
help correct it.
      </p>
      <p>This paper presents a provenance framework for tracking the results of continuous queries within a
given window for providing a quality assessment over the adopted query. We called this framework</p>
      <p>CEUR</p>
      <p>ceur-ws.org</p>
      <p>When provenance, and it provides an alternative, interval-based interpretation of why and how
provenance through provenance semirings. To eficiently track When provenance, we designed an
alternative data structure that, given a window, allows for an eficient re-computation of the result
over the window sub-portions. Then, we approach the quality assessment of the window as a counting
problem by calculating the cardinality of the recomputed results and assess whether a window
subinterval is capable of returning (almost) the same number of results of the original window, but with
fewer resources. Finally, we show that such recomputation and the related quality assessment are
sublinear in the number of results.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem Formulation</title>
      <p>This section lays the groundwork for this work and formally defines the targeted problem. More
specifically, it presents a definition of continuous queries with a focus on window operators. Additionally,
it provides an overview of provenance semirings and the various types of provenance, i.e.,  ℎ ,  
lineage ( ), and  .</p>
      <sec id="sec-2-1">
        <title>2.1. Preliminaries</title>
        <p>
          Continuous Queries (CQ) are a kind of query evaluated over an infinite stream. The result of a CQ
evaluation is also a stream, as described in the Continuous Semantics [
          <xref ref-type="bibr" rid="ref5 ref6">6, 5</xref>
          ], which allows extending
query execution to unbounded data. In this paper, we consider the data model of [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], i.e., a stream is a
totally ordered, infinite sequence of records (,  ) ∈ Ω × Τ , where  is a tuple in Ω, and  is a timestamp
in the time domain Τ, that for simplicity we consider as the set of natural numbers. We also consider
time-based window operators since they are popular in existing systems, e.g., Flink [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] or Spark [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]1.
Operationally, time-based sliding windows are defined as a tuple ( ,  ) where  represents the size and 
the slide [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Window operators are defined as a function  ∶ Τ →  from the time domain Τ to the set
of intervals  = Τ × Τ . Hence, applying a window operator  to a stream  assigns each record of  to
an interval  = (  ,   ) ∈  . We write  for a continuous query and  for a stream. We write [] as the
result of the query  on the stream  and []  for the same result restricted to the interval  , i.e.,
[]  = [ ′] where  ′ = { ∣  ∈  ∧  () =  }
        </p>
        <p>
          Provenance semirings [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] represent various provenance models by annotating database tuples
with elements from a set  , e.g., provenance in event tables [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] assumes  =  ( ) , with  ( ) being
the powerset of all tuple IDs  , assigned through function   ∶ Ω →  . A semiring based on  is an
algebraic structure of the form ( , +, ×, 0, 1) , where + and × are addition and multiplication over K,
e.g., + = ∪, × = ∩ when  =  ( ) ; Element 0 and 1 are respectively the neutral element of + and ×,
e.g., 0 = ∅, 1 =  when  =  ( ) . Notably, ( , +, 0) and ( , ×, 1) are commutative monoids and + is
distributive over ×, i.e., , 0 is the annihilating element for ×, i.e., ∀ ∈  ,  + 0 =  ,  × 1 =  , and  × 0 = 0 .
On top of this, a K-relation is a function  that associates elements of the database to elements of the
semiring, e.g.,  ∶ Ω →  ( ) . K-relations propagate provenance annotations according to relational
operators in relational algebra [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The requirement for finite support of K-relations makes them
inapplicable on infinite streams. Some notable provenance semirings in our scope are: How provenance,
based on polynomials over ℕ, defined as   ( ) = (ℕ[ ], +, ⋅, 0, 1) , captures the most informative form
of provenance; Boolean polynomial provenance, based on polynomials over  , defined as ( ) =
([ ], +  , ⋅, 0, 1), with idempotent addition; Lineage, ( ) = ( ( ), ⋃, ⋃, ⊥, ∅), represents data
lineage in data warehousing; Why provenance, which extends data lineage with subsets, defined as
 ℎ ( ) = ( ( ( )), ⋃, ⋃∪, ∅, {∅}), or as  ℎ ( ) = ([ ]/{ 2 − , ∀ ∈  }, +  , ⋅ , 0, 1).
1For a comprehensive survey on window operators, we suggest [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Problem Statement</title>
        <p>
          Our goal in this paper is to assess the quality of a given window operator  adopted by a continuous
query  . Indeed, defining the correct window for a given continuous query is crucial as it provides the
basis for complex stateful computations: when the window is too small, the CQ misses some relevant
results; when it is too big, the CQ requires too much memory, potentially hindering the performance of
the query. While the first case has been studied by [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] in the context of missing answers identification,
less attention has been paid to assessing if a window requires too many resources wrt query results.
        </p>
        <p>This paper approaches the window quality assessment as a result counting problem. The intuition
is that if query  produces almost or all the results over a portion of a given interval  produced
by  , then  is overestimated. For this reason, we focus on an eficient method for the problem of
window-based result counting.</p>
        <p>Problem 1. Let  be a stream,  a window function,  a continuous query. The result counting problem
over a given interval  returned by  is determining the cardinality of []  ′, for each  ′ ⊆  .</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. The When Provenance Framework</title>
      <p>To study When provenance, we build on the hierarchy defined by Green et al.’s semiring framework. In
particular, we study Why provenance ( ℎ ( )
), and How provenance (  ( )
) as they represent the
minimal, most expressive form of provenance that can support interval representation. Indeed, we do
not use lineage (or which provenance) (( )
) and boolean polynomial provenance (( )
): while
the former is not expressive enough to identify subsets of timestamps (sub-intervals), the latter is as
expressive as  ℎ ( )</p>
      <p>when used for intervals.</p>
      <p>We compose our framework by replacing the tuple IDs  with the set of tuple timestamps  , obtaining
 ℎ ( )
 ℎ ( )
and   ( )
. Given a result tuple  of a query []</p>
      <p>and its Why-provenance element  1 +  2 3 ∈
each monomial  is a minimal list of input tuple IDs necessary for generating the given result
tuple  . For when provenance, the polynomial becomes  1 +  2 3 ∈  ℎ ( )
 is a list of timestamps used to generate the tuple  .</p>
      <p>How provenance, and the related when provenance variant (  ( )
but it includes natural coeficients and exponents in the polynomial, e.g., 
. In this case, each monomial
) is built over the same intuition,
12 + 2 2 3 ∈   ( )
. Such
coeficients and exponents enable the counting of the multiplicity of a given interval in record generation,
which may be useful in certain use cases.</p>
      <p>Since our goal is to track generation intervals, we care about the min and the max of a given monomial
 from an element in  ℎ ( )
or   ( )</p>
      <p>. To obtain a formalism that only keeps such information, our
objective is to replace each monomial 
with an interval  = ( min(),
max()) . To do this, given an
input tuple  and its timestamp   we annotate it with the interval   = ( ,   ) and redefine the How and

Why provenance semirings on these intervals.</p>
      <p>More specifically, we define two new semirings</p>
      <p>Let  be the (infinite) set of intervals, which also include unbounded intervals. We define
 , 0, 1) as the when provenance semiring, where ⋅ is the classic polynomial
multiplication enhanced with × , which is applied when two intervals are multiplied.</p>
      <p>Conversely,  ℎ ( )</p>
      <p>can be defined following the same approach, but without natural coeficients
in the monomials and exponents on labels. For this reason, through  ℎ ( )
we would not be able to
account for multiplicity in the analysis generation intervals.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Eficient When Provenance Querying</title>
      <p>We now discuss the design of an eficient data structure for solving our counting problem. As shown in
Figure 1, our approach is twofold: 1) The initial query   , evaluated on the input stream  , with the
When-provenance annotation described above and on an interval  , i.e., the interval including all data
of interest. This part’s complexity is that of a query with provenance labels. 2) A function  , which
takes as input the result []  and an interval  ′ ⊆  and returns the number of results in the query
[]  ′ with their interval multiplicity. We do so by taking the When-provenance data from the first
part and finding all the result tuples  and their generation intervals   included in  ′.</p>
      <p>
        The time and space complexity of  depends heavily on the data structure that stores the
Whenprovenance annotations. We approach it as a 2D orthogonal range counting problem [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] by storing
all the intervals   contained in []  in a data structure based on balanced binary search trees and
fractional cascading, shown in Figure 1. The intervals are stored and ordered by start-time at the leaves.
Each node in the tree points to a separate list of all intervals in its sub-tree ordered by end-time. In a
node’s list, each interval keeps a pointer to the interval with the closest end-time in child node’s list.
      </p>
      <p>
        To query this data structure with an interval  ′ ⊆  we begin by two binary searches in the root
node’s list (containing all intervals), to find the first and last end-times included in  ′. We then use
the tree to perform the same operation on start times. At each step, we use the pointers between the
lists to update the first and last end times included in  ′ for the new sub-tree. The goal is to find all
the blue nodes of the tree as shown in Figure 1, which creates a partition of the set of intervals with
start times in  ′. At a blue node, we use its stored list and a prefix-sum query to count the number of
intervals with both start and end times in  ′. We then aggregate the results from all the blue nodes.
The first two binary searches are ( log()) each step down is (1) , and there are ( log()) blue
nodes which can be found in ( log()) steps. At each blue node, we count the number of intervals in
(1) . With total query time ( log()) and pre-processing complexity ( log()) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Example 1. To concretely demonstrate how our approach can handle such streaming data scenarios, we
examine a specific example involving a data stream composed of tuples, each containing three attributes:
A, B, and C, accompanied by their corresponding timestamp, denoted as  . The objective is to execute a
query, as shown in Listing 1, which is formulated using the Continuous Query Language (CQL). This
query is designed to perform a self-join operation based on the attribute B, within a defined time window.
The time window is characterized by both a size ( ) and a slide ( ) of 4 time units. After executing the
self-join, the query focuses on projecting the result onto attributes A and C from the joined tuples. In line
with the approach adopted by existing systems, e.g., Apache Flink and Kafka Streams [
        <xref ref-type="bibr" rid="ref14 ref8">8, 14</xref>
        ], we follow
the convention that the result of a join operation in a continuous query will carry the timestamp of the
maximum value between the timestamps of the joined tuples.
      </p>
      <p>SELECT R1.A AS A, R2.C AS C
FROM Stream R1, Stream R2 [RANGE 4 SLIDE 4]
WHERE R1.B = R2.B,</p>
      <p>over values A and C.</p>
      <p>Listing 1: A CQL query that performs a self-join over a tumbling window of 4 time units, projecting
by intervals [1, 5) and [5, 9). The three tables also include the  
-provenance and  ℎ
-provenance
annotations, with the latter described through the semiring defined in Section 4.</p>
      <p>Once the various annotations are obtained, we can follow the approach described in Figure 1. Through
When Provenance annotations, that trace the generation intervals of a given result, we can calculate how
many results are produced within sub-intervals of both [1, 5) and [5, 9). By doing so, we can intuitively see
that sub-intervals [2, 5) and [6, 9) respectively produce the same number of results of intervals [1, 5) and
[5, 9). This equivalence suggests that a reduction of the window size, wrt the one used by the query from
A sample stream (1a) and the related results of the query from Listing (1), executed over records within interval
[1, 5) (1b) and [5, 9) (1c), with the respective ID-provenance   ( )
annotations. In gray the columns related to provenance metadata.
and when-provenance   ( )
initial
Listing 1, may be needed.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper discusses the application of Green et al.’s provenance semirings to streaming data and
continuous queries. We demonstrate how the result counting problem for a given query can be
efectively addressed through the introduction of a novel framework. This framework reinterprets
the traditional concept of provenance semirings by adapting them to operate over time intervals,
which is particularly suitable for streaming data where temporal aspects are crucial. In addition to the
theoretical contributions, we propose a specialized data structure designed for the eficient computation
of result cardinality. This data structure leverages interval analysis techniques within the context of
2D orthogonal range counting. The use of this approach ofers significant performance guarantees in
terms of computational complexity and resource utilization.</p>
      <p>
        Furthermore, the paper discusses the potential for future enhancements to this framework. Indeed,
our data structure can be extended to its dynamic version [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] to obtain ( log()
2) amortized insertion
and deletion.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Palyvos-Giannas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gulisano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Papatriantafilou</surname>
          </string-name>
          , Genealog:
          <article-title>Fine-grained data streaming provenance in cyber-physical systems, Parallel Comput</article-title>
          .
          <article-title>(</article-title>
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Glavic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Esmaili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tatbul</surname>
          </string-name>
          ,
          <article-title>Ariadne: managing fine-grained provenance on data streams</article-title>
          , in: S. Chakravarthy,
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Urban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Pietzuch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Rundensteiner</surname>
          </string-name>
          (Eds.),
          <source>The 7th ACM International Conference on Distributed Event-Based Systems, DEBS '13</source>
          ,
          <string-name>
            <surname>Arlington</surname>
            ,
            <given-names>TX</given-names>
          </string-name>
          , USA - June 29 - July 03,
          <year>2013</year>
          , ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Palyvos-Giannas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tzompanaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Papatriantafilou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gulisano</surname>
          </string-name>
          ,
          <article-title>Erebus: Explaining the outputs of data streaming queries</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <article-title>(</article-title>
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Palyvos-Giannas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Havers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Papatriantafilou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gulisano</surname>
          </string-name>
          ,
          <article-title>Ananke: A streaming framework for live forward provenance</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <article-title>(</article-title>
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Arasu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Babu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Widom</surname>
          </string-name>
          ,
          <article-title>The CQL continuous query language: semantic foundations and query execution</article-title>
          ,
          <source>VLDB J</source>
          .
          <volume>15</volume>
          (
          <year>2006</year>
          )
          <fpage>121</fpage>
          -
          <lpage>142</lpage>
          . URL: https://doi.org/10.1007/s00778-004-0147-z. doi:
          <volume>10</volume>
          .1007/S00778- 004- 0147- Z.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Terry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Nichols</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Oki</surname>
          </string-name>
          ,
          <article-title>Continuous queries over append-only databases</article-title>
          , in: M.
          <string-name>
            <surname>Stonebraker</surname>
          </string-name>
          (Ed.),
          <source>Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data</source>
          , San Diego, California, USA, June 2-5,
          <year>1992</year>
          , ACM Press,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Carbone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Traub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Katsifodimos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Haridi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Markl</surname>
          </string-name>
          , Cutty:
          <article-title>Aggregate sharing for userdefined windows</article-title>
          , in: S. Mukhopadhyay,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bertino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mostafa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Si</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          Sondhi (Eds.),
          <source>Proceedings of the 25th ACM International Conference on Information and Knowledge Management</source>
          ,
          <string-name>
            <surname>CIKM</surname>
          </string-name>
          <year>2016</year>
          , ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Carbone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Katsifodimos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ewen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Markl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Haridi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tzoumas</surname>
          </string-name>
          , Apache flink™:
          <article-title>Stream and batch processing in a single engine, IEEE Data Eng</article-title>
          . Bull. (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Armbrust</surname>
          </string-name>
          ,
          <string-name>
            <surname>T. Das</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Torres</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Yavuz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Xin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ghodsi</surname>
            , I. Stoica,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Zaharia</surname>
          </string-name>
          ,
          <article-title>Structured streaming: A declarative API for real-time applications in apache spark</article-title>
          , in: G.
          <string-name>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <surname>C. M. Jermaine</surname>
            ,
            <given-names>P. A.</given-names>
          </string-name>
          <string-name>
            <surname>Bernstein</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference</source>
          <year>2018</year>
          , Houston, TX, USA, June 10-15,
          <year>2018</year>
          , ACM,
          <year>2018</year>
          , pp.
          <fpage>601</fpage>
          -
          <lpage>613</lpage>
          . URL: https://doi.org/10.1145/3183713.3190664. doi:
          <volume>10</volume>
          .1145/3183713.3190664.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Verwiebe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Grulich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Traub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Markl</surname>
          </string-name>
          ,
          <article-title>Survey of window types for aggregation in stream processing systems</article-title>
          , VLDB J. (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karvounarakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tannen</surname>
          </string-name>
          ,
          <article-title>Provenance semirings</article-title>
          , in: L.
          <string-name>
            <surname>Libkin</surname>
          </string-name>
          (Ed.),
          <source>Proceedings of the Twenty-Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems</source>
          , June 11-13,
          <year>2007</year>
          , Beijing, China,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Erickson</surname>
          </string-name>
          , et al.,
          <article-title>Geometric range searching and its relatives</article-title>
          ,
          <source>Contemporary Mathematics</source>
          <volume>223</volume>
          (
          <year>1999</year>
          )
          <fpage>1</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Willard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Lueker</surname>
          </string-name>
          ,
          <article-title>Adding range restriction capability to dynamic data structures</article-title>
          ,
          <source>J. ACM</source>
          <volume>32</volume>
          (
          <year>1985</year>
          )
          <fpage>597</fpage>
          -
          <lpage>617</lpage>
          . URL: https://doi.org/10.1145/3828.3839. doi:
          <volume>10</volume>
          .1145/3828.3839.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>M. J. Sax</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Weidlich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Freytag</surname>
          </string-name>
          ,
          <article-title>Streams and tables: Two sides of the same coin</article-title>
          , in: M.
          <string-name>
            <surname>Castellanos</surname>
            ,
            <given-names>P. K.</given-names>
          </string-name>
          <string-name>
            <surname>Chrysanthis</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Chandramouli</surname>
          </string-name>
          , S. Chen (Eds.),
          <source>Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics</source>
          ,
          <string-name>
            <surname>BIRTE</surname>
          </string-name>
          <year>2018</year>
          , Rio de Janeiro, Brazil,
          <year>August 27</year>
          ,
          <year>2018</year>
          , ACM,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Lueker</surname>
          </string-name>
          ,
          <article-title>A data structure for orthogonal range queries</article-title>
          ,
          <source>in: 19th Annual Symposium on Foundations of Computer Science</source>
          (sfcs
          <year>1978</year>
          ), IEEE,
          <year>1978</year>
          , pp.
          <fpage>28</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>