-

Algorithms for Rewriting Aggregate Queries Using Views

Sara Cohen

sarina@cs.huji.ac.il 0

Werner Nutt

Werner.Nutt@dfki.de 1

Alexander Serebrenik

alicser@cs.huji.ac.il 0 0 Computer Science Dept., The Hebrew University 1 German Research Center for , Artificial Intelligence GmbH

9 9

Typical queries over data warehouses perform aggregation. One of the main ideas to optimize the execution of an aggregate query is to reuse results of previously answered queries. This leads to the problem of rewriting aggregate queries using views. More precisely, given a set of queries, called “views,” and a new query, the task is to reformulate the new query with the help of the views in such a way that executing the reformulated query over the views yields the same result as executing the original query over the base relations. Due to a lack of theory, so far algorithms for this problem were rather ad-hoc. They were sound, but were not proven to be complete. In earlier work we have given syntactic characterizations for the equivalence of aggregate queries, and applied them decide when there exist rewritings. However, these decision procedures are highly nondeterministic and do not lend themselves immediately to an implementation.

In the current paper, we refine those procedures by eliminating the nondeterminism as much as possible, thus obtaining practical algorithms for rewriting queries with the operators count and sum. It can be proved that our algorithms are complete for queries where each relation occurs only once and for queries without comparisons. We also show how algorithms for rewriting nonaggregate The copyright of this paper belongs to the paper’s authors. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage.

Proceedings of the International Workshop on Design and Management of Data Warehouses (DMDW’99)

Heidelberg, Germany, 14. - 15.6. 1999 (S. Gatziu, M. Jeusfeld, M. Staudt, Y. Vassiliou, eds.) http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-19/ queries can be modified to obtain rewriting algorithms for queries with min and max.

These algorithms are a basis for realizing optimizers that rewrite aggregate queries using views.

Introduction

Typically, queries over data warehouses are aggregate queries. Aggregate queries occur in different places in a data warehouse (DW). The ultimate purpose of a DW is to support queries by end users that want to analyze a business. They want to have a comprehensive picture of their company and ask for the number of customers, maximum sales, total profits, etc. However, aggregate queries are not only processed at the data warehouse back end. But also loading a data warehouse often requires computation of aggregates. Aggregation is a means of reducing the granularity of data. Data in a data warehouse is often coarser than the data in the operational systems that it is based on. For instance, while the operational data of a supermarket may contain each single item of a customers purchase, the DW may contain only the total sales of each item by day and store. In such a case, loading can be viewed as executing an aggregate query.

The execution of aggregate queries tends to be time consuming and costly. Computing one aggregate value often requires to scan many data items. This makes query optimization a necessity. A promising technique to speed up the execution of aggregate queries is to reuse the answers to previous queries to answer new queries. We call a reformulation of a query that uses other queries a rewriting.

Finding such rewritings is known in the literature as the problem of rewriting queries using views. In this phrasing of the problem, it is assumed that there is a set of views, whose answers have been stored, or materialized. Then, given a query, the problem is to find a rewriting, which is formulated in terms of the views and some database relations, such that evaluating the original query yields the same answers as evaluating first the views and then, based on that result, evaluating the rewriting.

Rewriting techniques are not only applicable to query

In the following query we calculate the total amount of money spent on each job position:

An intelligent query optimizer could now reason that for each type of job we can calculate the total amount of money spent on it if we multiply the salary that one TA receives for such a job by the number of positions of that type. The two materialized views contain information that can be combined to yield an answer to our query. The optimizer can formulate a new query that only accesses the views and does not touch the tables in the database:

Motivation sum ators min, max, count, and [NSS98]. We have apoptimization, but also to data warehouse design [TS97], which in a sense is the inverse of the optimization problem. Here, one assumes that a set of queries over base relations is given, together with some additional information about the frequency with which the queries are posed and the frequency with which changes to the base relations occur. One is then interested in a set of intermediate views from which the queries can be computed such that the total cost of query execution and view maintenance is minimal.

In order to find good views, one has to understand how queries can be rewritten using views.

Rewriting queries using views has been studied extensively for non-aggregate queries [LMSS95], and algorithms have been devised and implemented [LSK95, Qia96]. For aggregate queries, the problem has been investigated mainly in the special case of datacubes (see e.g., [HRU96, Dyr96]. However, there is little theory for general aggregate queries, and the rewriting algorithms that appear in the literature are by and large ad hoc. These algorithms are sound, that is, the reformulated queries they produce are in fact rewritings, but there is neither a guarantee that they output a rewriting whenever one exists, nor that they generate all existing rewritings [SDJL96, GHQ95].

Recently, we have given syntactic characterizations for the equivalence of SQL queries with the aggregate operplied them to decide, given an aggregate query and a set of views, whether there exists a rewriting, and whether a new query over views and base relations is a rewriting [CNS99].

These characterizations, however, do not immediately yield practical algorithms.

In this paper, we show how to derive such algorithms.

The algorithms are sound, i.e., they output rewritings. We can also show that they are complete for important cases, which are relevant in practice. In Section 2 we present a motivating example. A formal framework for rewritings of aggregate queries is presented in Section 3. In Section 4 we give algorithms for rewriting aggregate queries and in

Section 5 we conclude. job type sistants in a course. Each TA has a in the course At the Hebrew University, there may be many teaching ashe assists. For example, he may give lectures or grade exercises. Teaching assistants are financed by different sources, like science foundations and the university itself. For each 2 We discuss an example that illustrates the rewriting problem for aggregate queries. This example models exactly the payment policy for teaching assistants at the Hebrew University in Jerusalem. There are two tables with relations pertaining to salaries of teaching assistants: In order to evaluate the new query, we no more need to look up all the teaching assistants nor all the financing sources.

Thus, probably, the new query can be executed more efficiently.

In this example, we used our common sense in two occasions. First, we gave an argument why evaluating the original query yields the same result as evaluating the new query that uses the views. Second, because we understood the semantics of the original query and the views, we were able to come up with a reformulation of the query over the views.

Since we cannot expect a query optimizer to have common sense, we will only be able to build an optimizer that can rewrite aggregate queries, if we can provide answers to the 3.1

Non-Aggregate Queries

following two questions. How can we prove that a new query, which uses views, produces the same results as the original query? How can we devise an algorithm that systematically and efficiently finds all rewritings? If efficiency and completeness cannot be achieved at the same time, we may have to find a good trade-off between the two requirements.

A Formal Framework 3

UNION alization to queries with the constructor is possiHAVING nonnested queries without a clause and with the

In this section we define the formal framework in which we study rewritings of aggregate queries. First, we extend the well-known Datalog syntax for non-aggregate queries [Ull89] so that it covers also aggregates. This syntax is more abstract and concise than SQL. It is not only better suited for a theoretical investigation, but it is also a better basis for implementing algorithms that reason about queries, in particular for implementations in a logic programming language. Based on this syntax, we define the semantics of queries and give a precise formulation of the rewriting problem.

Through the syntax we implicitly define the set of SQLqueries to which our techniques apply. They are essentially aggregate operators min, max, count, and sum. A generble, but beyond the scope of this paper. For queries with arbitrary nesting and negation no rewriting algorithms are feasible, since equivalence of such queries is undecidable. q(x) predicate symbol. We abbreviate a query as B(s), q textbook like [Ull89]). Under bag-set semantics, defines D produces over (for a precise definition consult a standard q(x) R & the body, or, shortly, as C. The variables B(s) s where stands for the body and for the terms occurq(x) or simply by the predicate of its head q. q a new relation qD which consists of all the answers that q(x) R(s) & query as C(t), in case we want to distinq tion. Under set semantics, a conjunctive query defines D q with (for a precise definition see [CV93]). q Under set-semantics, two queries and q0 are equivaD A database contains for each predicate a finite relarelational or comparisons. If the body contains no comparisons, then the query is relational. A query is linear if it does not contain two relational atoms with the same ring in the body. Similarly, we may write a conjunctive guish between the relational atoms and the comparisons in appearing in the head are called distinguished variables, while those appearing only in the body are called nondistinguished variables. Atoms containing at least one nondistinguished variable are called nondistinguished atoms. By abuse of notation, we will often refer to a query by its head a multiset or bag ffqggD of tuples for each database D. The bag ffqggD contains the same tuples as the relation qD, but each tuple occurs as many times as it can be derived over lent if for every database, they return the same set as a result. Analogously, we define equivalence under bag-setsemantics.

In the example below, we show how to transform a query in SQL notation into one in Datalog notation. Example 3.1 Consider a query that finds the teaching assistants who have a job for which they receive more then $500 from the government: M fsum(z1 z2) that maps any multiset of pairs of numbers (m1; m2) to P(m1; m2). We call such a function an aggregation function.

An aggregate query is a conjunctive query augmented by an aggregate term in its head. Thus, it has the form SELECT clause. We simply add it to the terms in the head SELECT attributes are identical to the grouping attributes,

Example 3.2 Recall the query in Section 2 that calculates the total amount of money spent on each job type. The extension of the Datalog syntax is straightforward. Since the there is no need to single them out by a special notation.

Hence, the only new feature is the aggregate term in the of the query, after replacing the attributes with corresponding variables. Thus, the above SQL query is transformed into the following Datalog query: q It can be easily checked that the Datalog query above is equivalent to our SQL query. Obviously, one notation can be transformed into the other, back and forth, completely automatically.

GROUP BY queries with and aggregation. We restrict GROUP BY in the clause. Also, we assume that queries SELECT identical to those in the statement, although SQL

We now extend the Datalog syntax so as to capture also ourselves to SQL queries where the group by attributes are only requires that the latter be a subset of those appearing have only one aggregate term. The general case can easily be reduced to this one. q(x; (y)) gregate queries. Consider the query B(s). q(x; y) core of q, which is defined as B(s). The core is d x of constants of the same length as let x We call the grouping variables of the query. Queries with q We associate to a non-aggregate query, q, called the D (d; core returns over a bag ffqggD of tuples e). For a tuple elementary aggregate terms are elementary queries. If the aggregation term in the head of a query has the form (y), we call the query an -query (e.g., a max-query).

In this paper we are interested in rewriting elementary queries using elementary views. However, as the example in Section 2 shows, even under this restriction the rewritings may not be elementary.

We now give a formal definition of the semantics of agFor a database D, the query yields a new relation qD. To define the relation qD, we proceed in two steps. the query that returns all the values that are amalgamated in the aggregate. Recall that under bag-set-semantics, the

Equivalence Modulo a Set of Views

2This definition blurs the distinction between the function as a mathematical object and the symbol denoting the function. However, a notation that takes this difference into account would be cumbersome. 9-4 & : : : & v1( 1x1) vn( nxn); q(x) for a rewriting of has the form only considered conjunctive rewritings.3 Thus, a candidate where the i’s are substitutions that instantiate the view predicates vi(xi).

The second question is whether we can reduce reasoning about the query r, which contains view predicates, to reasoning about a query that has only base predicates. To this end, we unfold r. That is, we replace each view atom vi( ixi), with the instantiation iBi of the body of vi, where vi is defined as vi(xi) Bi. We assume that the nondistinguished variables in different bodies are distinct.

We thus obtain the unfolding ru of r, for which the Unfolding Theorem holds: q queries using predicates from R[V, we define that and q0 D v 2 V by interpreting every view predicate as the relaq V = are equivalent modulo V, written q0, if qDV q0DV q tion vD . If is a query that contains also predicates from q V, then qDV is the relation that results from evaluating

We consider aggregate queries that use predicates both from R, a set of base relations, and V, a set of view definitions. We want to define the result of evaluating such a query over a database D. We assume that a database contains only facts about the base relations.

For a database D, let DV be the database that extends over the extended database DV . If q, q0 are two aggregate for all databases D. q want to know whether there is a rewriting of using the

We review the questions related to rewriting relational conjunctive queries. Suppose, we are given a set of conjunctive queries V, the views, and another conjunctive query q. We views in V.

The first question that arises is, what is the language for expressing rewritings? Do we consider arbitrary first order formulas over the view predicates as candidates, or even recursive queries, or do we restrict ourselves to conjunctive queries over the views? Since reasoning about queries in the first two languages is undecidable, researchers have We now present techniques for rewriting aggregate queries.

Our approach will be a to generalize the known techniques for conjunctive queries. Therefore, we first give a short review of the conjunctive case and then discuss in how far aggregates give rise to more complications.

4.2.1

Rewritings of Relational Count-queries

When rewriting count-queries, we must deal with the same questions that arose when rewriting conjunctive queries. Thus, we first define the language for expressing rewritings. Even if we restrict the language to conjunctive aggregate queries over the views, we still must decide on two additional issues. First, which types of aggregate views are useful for a rewriting? Second, what will be the aggregation term in the head of the rewriting? A count-query is sensitive to multiplicities, and count-views are the only 3It is an interesting theoretical question, which as yet has not been resolved, whether more expressive languages give more possibilities for rewritings. It is easy to show, at least, that in the case at hand allowing also disjunctions of conjunctive queries as candidates does not give more possibilities than allowing only conjunctive queries. (1) 4.2

Rewritings of Count-queries

We consider the problem of rewriting count-queries. As a first step, we consider rewriting relational count-queries. We then extend our technique in order to rewrite countqueries with comparisons. 3.4

General Definition of Rewriting

ru(x) nBn: 4

Rewritings of Aggregate Queries

4.1

Reminder: Rewritings of Relational Conjunctive Queries

n ing candidates, rn, for 1:

Example 4.2 Consider aggregate queries q, v, and rewrittype of aggregate views that do not lose multiplicities4.

Thus, the natural answer to the first question is to use only count-views when rewriting count-queries. We show in the following example that there are an infinite number of aggregate terms that can be usable in rewriting a countquery. q fvg Then it is easy to see that rn is a rewriting of modulo for all n. It is natural to create only r1 as a rewriting of q. In fact, only for r1 will the Unfolding Theorem hold.

However, when creating rewritings automatically we must restrict the aggregate term in the head of the rewriting in order to prevent deriving infinitely many rewritings. count aggregate term in the rewriting with the term. This r r. We call a count-rewriting candidate. Note that in some r Thus, we obtain the unfolding ru of defined as count) where vic are count-views defined as vic (xi; Bi count is a natural choice and is also necessary since is the and zi are variables not appearing elsewhere in the body of cases, it is possible to omit the summation. This is true if the values of zi are functionally dependent on the value of x. In such a case, the summation is always over a singleton.

After presenting our rewriting candidates we now show how we can reduce reasoning about rewriting candidates, to reasoning about conjunctive aggregate queries. We use a similar technique to that shown in Subsection 4.1. We replace view atoms with the appropriate instantiations of their bodies. As in the case for relational queries, unfolding should preserve equivalence. Otherwise reduction about the reasoning is not possible. We choose to replace the only aggregate term which will preserve this characteristic. r tic. Now, instead of checking whether is a rewriting v replacing R0 with , variables that appeared in R0 and do ’ can assume, w.l.o.g. that is the identity mapping on the v q with in the body of in order to derive a partial rewrit’ R-usable(v, , w.r.t. to rewrite q. We denote this fact as v(u; count) We start by discussing when a view, Rv, v v replaced by , using ’, we say that is R-usable under u not appear in are not accessible anymore. Thus, we can v q R. Therefore, is usable for rewriting only if there exv R v order for , to be usable, must “cover” some part of q(x; count) q r R. Recall that a rewriting of is a query defined is the only candidate preserving this characterisof q, we can verify if ru is equivalent to r. It has been shown [CV93, NSS98] that two relational count-queries are equivalent if and only if they are isomorphic.

We present an algorithm that finds a rewriting for a query using views. Our approach can be thought of as reverse engineering. We have characterized the “product” that we must create, i.e., a rewriting, and we now present an automatic technique for producing it. instantiated by , is usable in order to rewrite a query, that when unfolded yields a query isomorphic to q. Thus, in ists an isomorphism, ’, from Rv to R0 R. Note that we distinguished variables of v. We would like to replace R0 ing of q. This cannot always be done. Observe that after only perform the replacement if these variables do not appear anywhere else in q, in q’s head or body. If R0 can be ’).

Example 4.3 Consider the following

p1(x0; u0; y0) 4Although sum-views are sensitive to multiplicities (i.e., are calculated under bag-set-semantics), they lose these values. For example, sumviews ignore occurrences of zero values. 9-6 q rewrite the updated version of (line 6). Thus, in each iteration of the loop, additional atoms are covered. In line 10 the algorithm checks if a nondistinguished atom is already covered. If so, then the algorithm must fail, i.e., backtrack, as explained above. The algorithm is sound and complete as stated below.

R & C and a set of views, V, a rewriting We extend the technique presented in the previous section in order to rewrite queries with comparisons. Thus, we consider the problem of rewriting queries having comparisons with views having comparisons. We augment the rewriting candidate form with comparisons. Thus given a query

Our algorithm runs in nondeterministic polynomial time. The algorithm guesses views and instantiations and then verifies that the obtained result is a rewriting in a polynomial time. This is an optimal algorithm, since the view usability problem for relational conjunctive count-queries is NP-complete [CNS99]. 4.2.2

Rewritings of Count-Queries with Comparisons

Relational Count Rewriting q(x; count) R V A query and a set of views r A rewriting of q.

Algorithm Input Output

V tional rewriting. As above, it holds that ru r. Thus, r We can unfold in the same fashion as unfolding a relar q In order to verify that is a rewriting of , we have to once again we can reduce reasoning about queries with views to reasoning about equivalent queries without views. verify that ru q.

In [NSS98], we gave a sound and complete characterization of equivalence of conjunctive count-queries. The only known algorithm that checks equivalence of conjunctive count-queries creates an exponential blowup of the queries. Thus, it is difficult to present a tractable algorithm for computing rewritings. However, equivalence of linear count-queries with comparisons is isomorphism of the queries [NSS98]. Thus, we will give a sound, complete, and tractable algorithm for computing rewritings of linear count-queries. This algorithm is also sound and tractable for the general case, but is not complete. In the sequel, in this section we assume that the comparisons in the queries we are rewriting are deductively closed. Computing a deductive closure is a well-known polynomial procedure [Klu88]. that are sensitive to multiplicities, they are useful for rewritings. However, sum-views may lose multiplicities and make the aggregation variable inaccessible. Thus, at most one sum-view should be used in the rewriting of a query. The following are rewriting candidates of the query (2) (3) C C modify in line 12. We remove from its comparisons C-usable(v, , this fact as ’).

R the appropriate subset of by the appropriate instantiation

We present an algorithm for computing rewritings of conjunctive count-queries in Figure 4.2.2. Note that we containing a variable that is not accessible after replacing of v. Thus, this step is necessary in order for the resulting query to be safe [Ull88].

The given algorithm runs in nondeterministic polynomial time. The following theorem states that it is both sound and complete for linear queries and is sound, but not complete, for arbitrary queries. sum queries. Since and count-views are the only views

Rewriting sum-queries is similar to rewriting countqueries. When rewriting sum-queries we must also take the aggregation variable into consideration. We present an algorithm for rewriting sum-queries that is similar to the algorithm for count-queries.

We define the form of rewriting candidates for sumcount) where vic is a count-view of the form vic (xi; Bi sum(y)) and vs is a sum-view of the form vs (xs; Bs . y Note that the variable in the head of the query in Equation 3 must appear among ixi for some i. In [CNS99] we showed that if a rewriting candidate is equivalent to its unfolding then it must be one of the above forms. As in the case of count-query rewritings, in some cases the rewriting may be optimized by dropping the summation.

Once again, we reduce reasoning about rewriting candidates to reasoning about conjunctive aggregate queries. For this purpose we extend the unfolding technique introduced in Subsection 4.2. Thus, the unfoldings of the candidates presented are: Com- may be chosen as well. We call this algorithm pute Rewriting. We derive an algorithm for rewriting r q Now, instead of checking whether is a rewriting of we can verify if ru is equivalent to r. However, the only known algorithm for checking equivalence of sum-queries, presented in [NSS98], requires an exponential blowup of the queries. Thus, it might be very difficult to provide a tractable algorithm that is both sound and complete for arbitrary sum-queries. However, relational sum-queries and linear sum-queries are equivalent if and only if they are isomorphic. Thus, we can extend the algorithm presented in the Figure 4.2.2 for sum-queries.

As a preliminary step for our algorithm we extend the algorithm in Figure 4.2.2, such that in line 5 sum-views sum-queries, presented in Figure 4.3. The algorithm runs in nondeterministic polynomial time and the following holds:

Rewritings of Sum-Queries

Example 4.6 This example shows the incompleteness of the algorithm for the general case. Consider the query q, view v, rewriting r, and unfolding ru

Algorithm Input Output

Count Rewriting q(x; count) A query r A rewriting of q. http://www.cs.huji.ac.il/ versity and it is located at HAVING clauses, negation and functional dependencies, Aggregate queries are increasingly prevalent due to the widespread use of data warehousing for decision support.

They are generally computationally expensive since they scan many data items, while retrieving few. Thus, the computation time of aggregate queries is generally orders of magnitude larger than the result size of the query.

This makes query optimization a necessity. Optimizing aggregate queries was studied in the context of datacubes [HRU96, Dyr96]. However, there was little theory for general aggregate queries, beyond this context. In this paper, based on previous results in [NSS98, CNS99], we presented algorithms that enable reuse of precomputed queries in answering new ones.

The algorithms presented were implemented in SICStus Prolog. The system was developed at Hebrew Uni~alicser/aggrq/.

Topics for future research include rewriting queries with and enriching the class of aggregate functions with statistical functions. (4) (5) queries can be modified to find rewritings of max-queries.

Rewriting nonaggregate queries is a well known problem [LMSS95]. Thus, we do not present algorithms for finding rewritings of max-queries in this paper. 5 Conclusion y Note that the variable in the head of the query in Equawhere vi is a nonaggregate view and vm is a max-view. tion 5 must appear among ixi for some i. In [CNS99] we showed that if a rewriting candidate is equivalent to its unfolding then it must have one of the above forms.

Once again, reasoning about rewriting candidates can be reduced to reasoning about max-queries, using an appropriate extension of the unfolding technique. We have shown [NSS98] that equivalence of relational max-queries is equivalence of their cores. There is a similar reduction for the general case. Thus, algorithms developed for checking set-equivalence of queries can easily be converted to algorithms for checking equivalence of max-queries.

Similarly, algorithms that find rewritings of nonaggregate 4.4

Rewritings of Max-Queries

We consider the problem of rewriting max-queries. Note that max-queries are insensitive to multiplicities. Thus, it is natural to use nonaggregate views and max-views when rewriting a max-query. When using a max-view the aggregation variable becomes inaccessible. Thus, we use at most one max-view. The following are rewriting candidates of the query q:

Algorithm Input Output

count) Let q0 be the query q0(x;

Let r0=Compute Rewriting(q0; V).

If r0 is of the form sum(y r0(x; & : : : & & r0(x; sum(Qin=1 zi)) v1c( 1x1; z1) vnc( nxn; zn) C0 y and appears among ixi

Then return r0

If r0 is of the form

Then return

[CM77]

S. Cohen, W. Nutt, and A. Serebrenik. Rewriting aggregate queries using views. In Ch. Papadimitriou, editor, Proc. 18th Symposium on Principles of Database Systems, Philadelphia (Pennsylvania, USA), May 1999. ACM Press.

To appear.

S. Chaudhuri and M. Vardi. Optimization of real conjunctive queries. In Proc. 12th Symposium on Principles of Database Systems, Washington (D.C., USA), May 1993. ACM Press.

C. Dyreson. Information retrieval from an incomplete datacube. In Proc. 22nd International Conference on Very Large Data Bases, Bombay (In dia), September 1996 . Morgan

Kaufmann Publishers. [GHQ95] A. Gupta, V. Harinarayan, and D. Quass. Aggregate query processing in data warehouses.

In Proc. 21st International Conference on Very Large Data Bases. Morgan Kaufmann Publishers, August 1995.

V. Harinarayan, A. Rajaraman, and J. Ullman.

Implementing data cubes efficiently. In Proc. 1996 ACM SIGMOD International Conference on Management of Data, pages 205–227, Montreal (Canada), June 1996. [JK83] [Klu88]

D.S. Johnson and A. Klug. Optimizing conjunctive queries that contain untyped variables.

SIAM Journal on Computing, 12(4):616–640, 1983.

A. Klug. On conjunctive queries containing inequalities. J. ACM, 35(1):146–160, 1988. [LMSS95] A.Y. Levy, A.O. Mendelzon, Y. Sagiv, and D. Srivastava. Answering queries using views.

In Proc. 14th Symposium on Principles of Database Systems, pages 95–104, San Jose (California, USA), May 1995. ACM Press. [LSK95] [NSS98] [Qia96]

A.Y. Levy, D. Srivastava, and T. Kirk. Data

model and query evaluation in global information systems. Journal of Intelligent Information Systems, 5(2):121–143, 1995.

W. Nutt, Y. Sagiv, and S. Shurin. Deciding equivalences among aggregate queries. In

J. Paredaens, editor, Proc. 17th Symposium on Principles of Database Systems, pages 214– 223, Seattle (Washington, USA), June 1998. ACM Press. Long version as Report of Esprit LTR DWQ.

X. Qian. Query folding. In Stanley Y. W. Su,

editor, Proc. 12th International Conference on Data Engineering, pages 48–55, New Orleans (Louisiana, USA), March 1996. IEEE Computer Society. [SDJL96] D. Srivastava, Sh. Dar, H.V. Jagadish, and A.Y.

Levy. Answering queries with aggregation using views. In Proc. 22nd International Conference on Very Large Data Bases, Bombay (In9-10 [TS97] [Ull88] [Ull89]

dia) , September 1996 . Morgan Kaufmann Publishers.

Theodoratos and

T.K.

Sellis . Data warehouse configuration . In Proc. 23nd International Conference on Very Large Data Bases , pages 126 - 135 , Athens (Greece), August 1997 . Morgan Kaufmann Publishers.

Ullman . Principles of Database and Knowledge-Base Systems , Vol. I: Classical Database Systems . Computer Science Press, New York (New York, USA), 1988 .

Ullman . Principles of Database and Knowledge-Base Systems , Vol. II: The New Technologies . Computer Science Press, New York (New York, USA), 1989 .