<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Cost Models for Approximate Query Evaluation Algorithms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oxana DOLMATOVA</string-name>
          <email>oxana.dolmatova@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna YARYGINA</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Boris NOVIKOV</string-name>
          <email>borisnov@acm.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Saint Petersburg State University</institution>
        </aff>
      </contrib-group>
      <fpage>20</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>The optimization is essential for any high-performance querying system. Several optimization techniques were developed and successfully implemented for relational databases. However, these techniques should be re-examined and revised for distributed heterogeneous systems of information resources supporting diverse querying paradigms. We introduce cost models for approximate query evaluation in the context of generalized algebraic operations supporting both exact and similarity queries. The proposed cost models are suitable for approximate evaluation and trade-off between computational performance and the quality of results. We present a rationale for our approach and elaborate our cost model for key operations and algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd />
        <kwd>Cost models</kwd>
        <kwd>approximate algorithms</kwd>
        <kwd>query evaluation</kwd>
        <kwd>heterogeneous systems</kwd>
        <kwd>information resources</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The presence of high-level declarative query languages was considered as one of major
strengths of database management systems since early 70-ies and became an inherent
feature of the relational database model and its variations are implemented in the
industrial SQL-based systems.</p>
      <p>Providing highly expressive declarative means for query specification, these
languages both require and enable sophisticated optimization. In contrast with
lowlevel imperative object-oriented programming languages, which allow only moderate
code improvements with local optimization, declarative languages depend dramatically
on the quality of the optimizer.</p>
      <p>Formally, the task of a query optimizer is to choose an algebraic expression of
minimal execution cost among several equivalent expressions. In other words, any
high-quality optimizer is inevitably a cost-based one and, hence, the cost model is one
of the critical core components of the optimizer.</p>
      <p>The abstract concept of cost may include several different measures of query
execution performance. Usually the most important is execution time (either CPU or
elapsed), amount of I/O, or, in a distributed mobile environment, the battery energy.
However, the actual interpretation of cost is not essential for the optimization process.</p>
      <p>There are many optimization techniques based on cost models, but in general
almost all cost models were prepared for exact query evaluation algorithms.</p>
      <p>In broad modern contexts, such as distributed heterogeneous systems, real-time
business analytics and distributed mobile environments an approximate query
evaluation becomes a must. Indeed, it does not make any sense to use exact query
evaluation in uncertain context or for similarity-based queries, where “top k” approach
is the best. Approximate query evaluation is ultimately needed to provide timeliness for
business analytics or save energy in a mobile device.</p>
      <p>As the query evaluation is approximate, the quality of the output may decrease.
Our main objective is to provide a cost model capable to support trade-off between the
cost of evaluation and the quality of results. Based on our cost model, the query
optimizer can either provide best possible quality for a given cost, or minimize the cost
providing at least required quality of the query output.</p>
      <p>We start from naïve exact cost models and proceed with detailed analysis of
approximate operations. We define models for a number of operations including “top k”
operations for single and multiple arguments.</p>
      <p>In this research we consider query processing in a heterogeneous environment of
autonomous information resources. In this type of environments data statistics might be
unavailable, hence simple cost models are needed for both of exact and approximate
querying.</p>
    </sec>
    <sec id="sec-2">
      <title>1. Cost Model Specification</title>
      <p>In this paper we discuss query processing in a heterogeneous environment of
autonomous information resources where all objects have scores which represent their
relevance, similarity, quality and so on. Each operation takes as an input a stream of
objects with scores and pushes into the output another stream of objects with their
scores.</p>
      <p>Further the specification of the cost model that supports the concept of operation
result quality is presented and is followed by brief discussion of main operations and
corresponding algorithms which are analyzed in this paper.</p>
      <p>Operation cost model depends on 4 basic parameters: data size; data quality; data
order; and the amount of resources needed for operation execution (operation cost).</p>
      <p>On the one hand in traditional cost models the operation cost (amount of resources
for its evaluation) can be estimated based on the size of arguments, order of objects in
an argument. On the other hand, when we talk about approximate algorithms, the
operation behavior depends on the quality of the arguments, the size of the arguments,
order of objects in the arguments, and the amount of resources allocated for operation
execution.</p>
      <p>Thus, an operation cost model is defined as follows:
opcost(resources, sizein,orderin)=(qualityout,sizeout,orderout), where
• opcost is the operation cost function, opcost connects the values of the input
and output parameters;
• resources are resources allocated for the operation processing or the operation
cost;
• sizein is the operation input arguments size;
• orderin is the operation input arguments order;
• qualityout is the operation execution relative result quality, the relative quality
of the result of the corresponding subquery;
• sizeout the result size;
• orderout the result order;
• resources, sizein, orderin, qualityout, sizeout, orderout are objects with attributes
which represent different measures of the corresponding parameter of the cost
model and are discussed in details below.</p>
      <p>Let us note that sizein and orderin are vectors of objects that describe all arguments
of the operation.</p>
      <p>The equation describes the cost model and binds all its variables and parameters.
Thus, substituting in the equation the values of the variables, we can express and
evaluate the remaining parameters of the cost model.</p>
      <p>Apart from the basic cost model equations there are some restrictions on the cost
models of the specific operations:
resourcesmin≤resources≤resourcesmax</p>
      <p>qualitymin≤quality≤qualitymax</p>
      <p>If the operation has resourcesmax resources available, the result has the best
possible quality with given arguments, that is relative qualityout=qualitymax.</p>
      <p>If the operation has resourcesmin resources available, the result has the worst
possible relative quality (qualityout=qualitymin) with given arguments. In this case the
algorithm spends the least possible amount of resources for operation execution.</p>
      <p>Our cost models suggest the estimation of the unknown parameters. We assume
that objects are not ordered in an input or output data if no information about order is
available. The value of the quality and size parameters are estimated based on the
history of the previous queries and collected statistics.</p>
      <p>Further we discuss how to measure resources and data characteristics in our cost
models.</p>
      <sec id="sec-2-1">
        <title>1.1. Resources</title>
        <p>Resources needed for a particular operation processing can be measured and evaluated
in different ways. For example the operation processing cost can be estimated by its
execution time. However, this approach restricts the proposed cost model to a specific
query processing system.</p>
        <p>In this research we evaluate the cost of operations in terms of the number of disk
accesses, sequential accesses, executions of external operations, and so on. It is
important to note that in this case, the cost model is more general. In this case the
proposed cost model can be tuned depending on the specific characteristics of the query
processing engine.</p>
        <p>There are several characteristics and measure resources:
•
•
•
•
•
•
sa the number sequential accesses to read objects from pipe;
dsa the number of sequential disk accesses to read objects;
ma the number of random memory accesses;
da the number of random disk accesses;
pred the number of predicate evaluations;
ha the number of hash table accesses.</p>
        <p>The operation cost or the amount of resources needed for its execution will be
evaluated in terms of these measures. The cost of each parameter in terms of time (Tsa,
Tdsa, Tma, Tda, Tpred), which depends on the system configuration, will be estimated
based on the experiments and available statistics. Thus the operation cost can be simply
evaluated:</p>
        <p>time = sa*Tsa+dsa*Tdsa+ma*Tma+da*Tda+pred*Tpred+ha*Tha
1.2. Size</p>
      </sec>
      <sec id="sec-2-2">
        <title>1.3. Quality</title>
        <p>The size of the input data, as well as the result sizes can be evaluated at least in two
different ways: size, which data occupy in memory (size), and the number of objects,
semantic units, which the operation processes (cardinality). These characteristics
influence the behavior of algorithms that implement operations.</p>
        <p>We distinguish absolute and relative quality which operations produce.</p>
        <p>The absolute quality shows how the produced result suites to the user expectations.
To measure the absolute quality we need to obtain the actual relevance scores which
are usually not know in advance and sometimes at all.</p>
        <p>The relative quality shows how the limited, approximate implementation of an
operation changes the absolute quality of the result. In our cost models we operate with
relative quality of the operations and queries. The most important property of the
relative quality of a query is its monotony on the amount of resources allocated for its
processing. The monotony of the relative quality depends on its definition and
algorithms implementing approximate operations.
1.4. Order
2.1. Top k
The order of objects in the input data influences the operation execution cost. At this
phase of work, the order of objects is defined as the order of their scores in a data set.
The cost models depend on the fact whether the objects in the input data are naturally
ordered according to their scores.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Operations and Algorithms</title>
      <p>We consider three operations in this paper. Exact and approximate algorithms
implementing them are discussed in this section and corresponding cost models are
developed and analyzed further.</p>
      <p>Top k operation takes as an input the stream of objects with scores and returns to the
output the stream of best k objects according to their input scores. The naïve exact
algorithm orders objects according to their scores if needed and returns k first of them.
An approximate top k algorithm reads objects sequentially, while free time for
operation execution is available, and pastes them into the sorted list of already
processed objects. When the resources are over first k objects are set to be the result.</p>
      <sec id="sec-3-1">
        <title>2.2. Fusion</title>
      </sec>
      <sec id="sec-3-2">
        <title>2.3. Aggregation</title>
        <p>Binary fusion operation processes two input streams of objects with scores and returns
stream of received objects with newly generated aggregated scores. The naïve exact
algorithm implementing fusion is based on nested loops.</p>
        <p>
          The aggregation operation is the same as defined in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The operation takes as an input
two streams, where objects are naturally ordered according to their scores, and returns
to the output best k objects according to the aggregated scores based on the input ones.
The naïve exact algorithm is a simple combination of fusion and top k operation.
However, effective and optimal algorithms for aggregation operation were developed
in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]: FA (Fagin’s Algorithm) and NRA (No Random Access Algorithm). Because the
input streams are coming from two independent sources and random accesses in the
first algorithm can be expensive and sometimes impossible, they have been removed
from the implementation of algorithms and replaced with sorted accesses.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Cost Models for Exact Algorithms</title>
      <p>Let's describe a basic cost models for different operations in case when the relative
quality of the arguments and the result of the operation as the best possible and is not
regulated from the outside.</p>
      <p>Actually cost models depend on algorithms implementing this operations rather
than operations themselves. Here we describe cost models for natural exact operation
algorithms discussed in section 3.</p>
      <p>All restrictions will be defined based on the cost model described by the equation:
opcost(resources,sizein,orderin)=(qualityout,sizeout,orderout).
3.1. Top k
For exact top k operation the cost model can be defined as follows:
cardinalityout=min{k, cardinalityin}
orderout=true
cardinalityin≥cardinalityout
sizein≥sizeout
if orderin=false then
sa=cardinalityin
ma=C*cardinalityin*ln(k)
if orderin=true then</p>
      <p>sa=cardinalityout</p>
      <sec id="sec-4-1">
        <title>3.2. Fusion</title>
        <p>Here we consider the binary fusion operation. In this case sizein,orderin represent
twodimensional vectors, since the fusion operation takes two arguments. For fusion
operation (based on nested loop algorithm) we have the following cost model:</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.3. Aggregation</title>
        <p>First of all we describe the intermediary parameters which can help us to construct
formulas:
•
•
•
t – estimated size of a table, which shows all objects with already read
scores;
ns – estimated number of scores needed to fill the table of size t;
ks – estimated number of scores needed to fill the table of size k.</p>
        <p>We assume independence of scores and uniform distribution and use expectation
for evaluate these parameters.</p>
        <p>orderin=true
cardinalityout=min{k, cardinalityin}
orderout=true
cardinalityin≥cardinalityout
sizein≥sizeout
3.3.1. FA
3.3.2. NRA
time = cardinalityinTsa+(tlog t+4t)Tma+tTha</p>
        <p>Summands include sorted access, sort, input/output and calculation of aggregate
score as well as search for a place to insert a score just obtained from stream
respectively.</p>
        <p>After evaluation and substitution of intermediary parameters we obtain the
following:</p>
        <p>time=((3/4ln(cardinalityin)+3/2)Tma+3/4Tha)k+Tmacardinalityin+1/4cardinalityin</p>
      </sec>
      <sec id="sec-4-3">
        <title>Tha+1/4cardinalityin(ln(cardinalityin)–2)Tma+cardinalityinTsa time=cardinalityinTsa+((ns-ks)/2tlog t+2t(2+t))Tma+tTha</title>
        <p>Summands include sorted access, sort, input/output and calculation of worst case
score and best case score as well as search for a place to insert a score just obtained
from stream respectively.</p>
        <p>After evaluation and substitution of intermediary parameters we obtain the
following (cin denotes as cardinalityin below):</p>
        <p>time=(45/32-9/64ln(cin))Tmak2+((3/4cin+3-3/64cin(ln(cin)–2)+3/16cin(3/4ln(cin)–
3/2)Tma+3/4Tha)k+(1/2cin(2+1/4cin)+3/64cin 2(ln(cin)–2)Tma+cinTsa+1/4cinTha</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Cost Models for Approximate Algorithms</title>
      <sec id="sec-5-1">
        <title>4.1. Aggregation</title>
        <p>Now we add the specific resource t0 – time operation execution. The problem is to
estimate the result quality.</p>
        <p>We create approximate model based on exact one. Let’s k’ denote the number of
correct object returned by approximate algorithm. We express it in terms of t0. Hence
we get number of scores which system can find for the given time. Then we consider
ratio between k’ and k which means the accuracy of result.</p>
        <p>qualityout=k’/k</p>
        <p>For FA we obtain the following:
k’=(t0-Tmacardinalityin-1/4cardinalityinTha</p>
      </sec>
      <sec id="sec-5-2">
        <title>1/4cardinalityin(ln(cardinalityin)–2)Tma-cardinalityinTsa)/((3/4ln(cardinalityin)+3/2)Tma+ 3/4Tha).</title>
        <p>We will not give the inverse formula for NRA here because of the limits of paper
size.
4.2. Top k</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Related Work</title>
      <p>The query optimization became both required and enabled since the advent of
highlevel declarative query languages, mostly in the context of the relational database
model.</p>
      <p>
        A brief overview of classical query optimization techniques can be found in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
The optimization techniques for distributed systems are summarized in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. An
optimizer for distributed heterogeneous systems is proposed in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>The cost models proposed in this research are designed similar to those needed to
traditional optimizers.</p>
      <p>
        The optimization strategy based on algebraic equivalences between similarity
based operations that serve as rewrite rules is outlined in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Optimization rules based
on similarity based algebraic framework properties and equivalence laws are also
discussed in [
        <xref ref-type="bibr" rid="ref1 ref12 ref14 ref5">1, 14, 5, 12</xref>
        ].
      </p>
      <p>It is important to mention that the lack of algebraic equivalences pushes the
research to the development of optimization techniques based on performance/quality
tradeoffs and approximate algorithms.</p>
      <p>
        The approximate query evaluation techniques were considered in the context of
very large data warehouses and mobile networks [
        <xref ref-type="bibr" rid="ref2 ref3 ref6 ref8">2, 6, 3, 8</xref>
        ]. The approximation is
typically based on sampling, wavelets, or synopsis.
      </p>
      <p>
        Handling of time constraints on complex SQL queries is proposed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The
authors distinguish approximate (based on sampling) and partial (top k) query
evaluation.
      </p>
      <p>
        The quality and performance trade-offs for stream processing are discussed in
[
        <xref ref-type="bibr" rid="ref16 ref9">16, 9</xref>
        ].
      </p>
      <p>
        Cost models based on estimation of operation selectivity and cardinality are
introduced in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for the selected set of operations: union, intersection, and difference;
joins; merge; subtract; select; and map. Optimization rules based on the query tree
reconstruction are derived from the previous analysis.
      </p>
      <p>
        A sampling-based method to estimate the cardinality of rank-aware operators is
developed for costing plans [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        A novel multi-criteria query optimization techniques for performing query
optimization in databases, such as multimedia and web databases, which rely on
imperfect access mechanisms and top-k predicates are proposed in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The size and
quality factors are introduced into the cost model and optimization algorithms.
      </p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>In this paper we presented several cost models for both exact and approximate query
evaluation algorithms in distributed heterogeneous systems. The more resource we use,
the more accurate the result is and vice versa. Our models convert these intuitive
observations into clear and distinct formulas. It provides formalism for trade-off
between quality and performance.</p>
      <p>In the future we are going to conduct experiments in order to estimate the accuracy
of our cost models.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Adali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bonatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Sapino</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Subrahmanian</surname>
          </string-name>
          ,
          <article-title>A multi-similarity algebra</article-title>
          .
          <source>In: Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD '98)</source>
          , New York, NY, USA. ACM,
          <year>1998</year>
          ,
          <fpage>402</fpage>
          -
          <lpage>413</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Babcock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          , and
          <string-name>
            <surname>G. Das</surname>
          </string-name>
          ,
          <article-title>Dynamic sample selection for approximate query processing</article-title>
          .
          <source>In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD'03)</source>
          , New York, NY, USA. ACM,
          <year>2003</year>
          ,
          <fpage>539</fpage>
          -
          <lpage>550</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          , G. Das, and
          <string-name>
            <given-names>V.</given-names>
            <surname>Narasayya</surname>
          </string-name>
          ,
          <article-title>Optimized stratified sampling for approximate query processing</article-title>
          ,
          <source>ACM Transactions on Database Systems</source>
          <volume>32</volume>
          (
          <issue>2</issue>
          ) (
          <year>2007</year>
          ),
          <fpage>9</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          , and
          <string-name>
            <surname>G.</surname>
          </string-name>
          <article-title>Weikum, Integrating DB and IR technologies: what is the sound of one hand clapping?</article-title>
          <source>In: Proceedings of Second Biennial Conference on Innovative Data Systems Research</source>
          , Asilomar, CA, USA, January 4-
          <issue>7</issue>
          ,
          <year>2005</year>
          ,
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ciaccia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Montesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Penzo</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Trombetta</surname>
          </string-name>
          ,
          <article-title>Imprecision and user preferences in multimedia queries: a generic algebraic approach</article-title>
          .
          <source>In: Foundations of Information and Knowledge Systems, Proceedings of First International Symposium (FoIKS'</source>
          <year>2000</year>
          ), Burg, Germany,
          <source>February 14-17</source>
          ,
          <year>2000</year>
          .
          <source>Lecture Notes in Computer Science</source>
          <volume>1762</volume>
          (
          <year>2000</year>
          ), Springer, Berlin,
          <fpage>50</fpage>
          -
          <lpage>71</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Dell'Aquila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Di Tria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lefons</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Tangorra</surname>
          </string-name>
          ,
          <article-title>Accuracy estimation in approximate query processing</article-title>
          .
          <source>In: Proceedings of the 14th WSEAS International Conference on Computers: Part of the 14th WSEAS CSCC Multi-Conference (ICCOMP'10) 1</source>
          ,
          <string-name>
            <surname>Stevens</surname>
            <given-names>Point</given-names>
          </string-name>
          , Wisconsin, USA,
          <year>2010</year>
          .
          <source>World Scientific and Engineering Academy and Society (WSEAS)</source>
          ,
          <fpage>452</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lotem</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Naor</surname>
          </string-name>
          ,
          <article-title>Optimal aggregation algorithms for middleware</article-title>
          ,
          <source>Journal of Computer and System Sciences</source>
          <volume>66</volume>
          (
          <issue>4</issue>
          ) (
          <year>2003</year>
          ),
          <fpage>614</fpage>
          -
          <lpage>656</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundara</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          ,
          <article-title>Supporting time-constrained SQL queries in Oracle</article-title>
          .
          <source>In: Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB'07)</source>
          ,
          <source>VLDB Endowment</source>
          ,
          <year>2007</year>
          ,
          <fpage>1207</fpage>
          -
          <lpage>1218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Ch. Jermaine</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Arumugam</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Pol</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Dobra</surname>
          </string-name>
          ,
          <article-title>Scalable approximate query processing with the DBO engine</article-title>
          ,
          <source>ACM Transactions on Database Systems</source>
          <volume>33</volume>
          (
          <issue>4</issue>
          ) (
          <year>2008</year>
          ),
          <volume>23</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          :
          <fpage>54</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>A Framework for Supporting Quality of Service Requirements in a Data Stream Management System</article-title>
          .
          <source>PhD thesis</source>
          , Arlington,
          <string-name>
            <surname>TX</surname>
          </string-name>
          , USA,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kossmann</surname>
          </string-name>
          ,
          <article-title>The state of the art in distributed query processing</article-title>
          ,
          <source>ACM Computer Survey</source>
          <volume>32</volume>
          (
          <issue>4</issue>
          ) (
          <year>2000</year>
          ),
          <fpage>422</fpage>
          -
          <lpage>469</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kossmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Stocker</surname>
          </string-name>
          ,
          <article-title>Iterative dynamic programming: a new class of query optimization algorithms</article-title>
          ,
          <source>ACM Transactions on Database Systems</source>
          <volume>25</volume>
          (
          <issue>1</issue>
          ) (
          <year>2000</year>
          ),
          <fpage>43</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Ch</surname>
            . Li,
            <given-names>Kevin</given-names>
          </string-name>
          <string-name>
            <surname>Ch</surname>
          </string-name>
          .
          <article-title>-Ch.</article-title>
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>I. F.</given-names>
          </string-name>
          <string-name>
            <surname>Ilyas</surname>
            , and
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>RankSQL: query algebra and optimization for relational top-k queries</article-title>
          . In: F. Ozcan, editor,
          <source>SIGMOD Conference. ACM</source>
          ,
          <year>2005</year>
          ,
          <fpage>131</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>L. P.</given-names>
            <surname>Mahalingam</surname>
          </string-name>
          and
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Candan</surname>
          </string-name>
          <article-title>, Multi-Criteria Query Optimization in the Presence of Result Size and Quality Tradeoffs, Multimedia Tools</article-title>
          and
          <source>Applications Journal</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ) (
          <year>2004</year>
          ),
          <fpage>167</fpage>
          -
          <lpage>183</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Montesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Trombetta</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Dearnley</surname>
          </string-name>
          ,
          <article-title>A similarity based relational algebra for web and multimedia data</article-title>
          ,
          <source>Information Processing and Management</source>
          <volume>39</volume>
          (
          <issue>2</issue>
          ) (
          <year>2003</year>
          ),
          <fpage>307</fpage>
          -
          <lpage>322</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pentaris</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <article-title>Query optimization in distributed networks of autonomous database systems</article-title>
          .
          <source>ACM Transactions on Database Systems</source>
          <volume>31</volume>
          (
          <issue>2</issue>
          ) (
          <year>2006</year>
          ),
          <fpage>537</fpage>
          -
          <lpage>583</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Koudas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ch. Ooi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Streaming multiple aggregations using phantoms</article-title>
          ,
          <source>The VLDB Journal</source>
          <volume>19</volume>
          (
          <issue>4</issue>
          ) (
          <year>2010</year>
          ),
          <fpage>557</fpage>
          -
          <lpage>583</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>