Logical Multidimensional Database Design for Ragged and Unbalanced Aggregation Hierarchies Tapio Niemi Jyrki Nummenmaa Department of Computer and Information Department of Computer and Information Sciences, University of Tampere Sciences, University of Tampere FIN-33014 University of Tampere, FIN-33014 University of Tampere, Finland Finland tapio@cs.uta.fi jyrki@cs.uta.fi Peter Thanisch Department of Computer Science, University of Edinburgh Edinburgh, EH9 3JZ, Scotland pt@dcs.ed.ac.uk Abstract modelling methods of these hierarchies differ from the methods used in database design. Thus, dependencies Research on logical design of OLAP cubes has used in database design are not enough for modelling tended to assume that the rollup hierarchy in a OLAP hierarchies. In this work our aim is to study which cube dimension takes the form of a balanced kinds of dependencies would be needed to get aggrega- tree. However, experience from industry tion hierarchies more desirable for OLAP. indicates that a much broader class of rollup Much of the research on logical design for OLAP hierarchies is of interest in practice. For cubes has concentrated on the class of aggregation example, the hierarchy tree might be unbalanced hierarchies that arise from star and snowflake schemata. or ragged (or both), or indeed the hierarchy In such aggregation hierarchies: might not be a tree at all, but a more general (1) the attribute labelling a column in a dimension table acyclic directed graph structure such as a lattice. will correspond to a level in the aggregation hierarchy We demonstrate how dependency information of the corresponding cube dimension, and can assist in the design of aggregation (2) data values occurring in that column of the dimension hierarchies. In addition to dependencies familiar table will become the members of the corresponding from relational database theory, we use new level. classes of dependencies to extend logical design OLAP practitioners need a much more general notion of principles so that they include this more general aggregation hierarchies than this. For example, an notion of cube dimension hierarchies. aggregation hierarchy in one of the cube’s dimensions might be built from a hierarchical relationship that is 1 Introduction modelled in one of the database tables. Example 1 illustrates this. Hierarchical dimensions are an essential part of On-Line Analytical Processing (OLAP). The hierarchical structure Example 1 Consider a management hierarchy in an allows the user to study facts in different levels of details. Employee table in which each row contains the Good hierarchical structures ensure correct and efficient information pertaining to a particular employee and one calculation of aggregation functions: consistent and column in the table contains the employee number coherent structure of hierarchies decreases logical errors (attribute EmpID) and another contains the number of while efficiency of calculation is increased, for example, that employee’s manager (attribute MgrID). In the OLAP by decreasing redundancy in rollup paths. Logical cube, we have Salary as the measure and an Employee dimension. The aggregation hierarchy in the Employee The copyright of this paper belongs to the paper’s authors. Permission to copy dimension is the management hierarchy modelled by the without fee all or part of this material is granted provided that the copies are not EmpID and MgrID columns. This aggregation hierarchy made or distributed for direct commercial advantage. allows us to query the total salary bill for all employees Proceedings of the International Workshop on Design and below any given manager. Management of Data Warehouses (DMDW’2001) Interlaken, Switzerland, June 4, 2001 We note that in Example 1, we do not have levels (D. Theodoratos, J. Hammer, M. Jeusfeld, M. Staudt, eds.) corresponding to attributes. Rather, levels correspond to data values occurring in the table. Thus two quite http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-39/ T. Niemi, J. Nummenmaa, P. Thanisch 7-1 distinct kinds of aggregation hierarchies can arise in dimension can have hierarchies like day-week, and practice and we refer to these as “attribute hierarchies” day-month-year. The third item, correct aggregation of and “data hierarchies”. data, ensures that aggregated values are not misleading. With regard to the logical design of cubes, much Good structure for dimension hierarchies is essential in research has been published on the role of dependencies, guaranteeing this. The idea of the fourth item, support for such as functional dependencies, on the design of non-strict hierarchies, means that many to many attribute hierarchies. In the present paper, we demon- relationships between different levels in a dimension are strate that dependency theory is also relevant to cube allowed. Finally, the last item, data with different levels design for data hierarchies. In addition to the functional of granularity should be allowed, means that it should be dependencies, we give three new dependency classes for possible to store data in different levels of hierarchy, not data hierarchies. The anti-closure dependency prevents only to the lowest level. calculating redundant aggregation values by disallowing Jagadish et al. [JLS99] argue that balanced and tree “shortcuts” in hierarchies, while non-raggedness and structured hierarchies are not enough for many real balance dependencies ensure complete aggregations on situations but more flexible structures are needed. Zurek each level in a hierarchy. and Sinnwell [ZuSi99] list some non-standard require- Correct summarizability is the most essential design ments for data warehouses. Especially, realignment of a goal for dimension hierarchies in OLAP. There are three data warehouse schema is needed quite often since necessary conditions for summarizability and these dimension hierarchies can change even regularly. conditions are also assumed to be sufficient [LeSh97]: Hurtado et al. [HMV99] study how data cubes can be 1. disjointness of category attributes, maintained under dimension updates. They have noticed 2. completeness, and that dimensions are not so static than it is often assumed, 3. correct use of measure (summary) attributes with thus efficient maintaining of dimensions can remarkably statistical functions. increase the efficiency of the system. A formal model for Disjointness requires that attributes in dimensions form dimension updates is presented. The model covers both disjoint subsets over the elements. Completeness means domain updates and structural updates of dimensions. An that all elements exist and every element is assigned to algorithm to maintain a relational stored data cube under some category on the level above it in the hierarchy. dimension updates is also presented. Correct use of measure attributes with statistical functions Lehner et al. [LAW98] define the dimensional normal is the most complex of these requirements since it form, which is a normal form for OLAP hierarchies. A depends on the type of the attribute and the type of the dimension is in dimensional normal form if there is a statistical function. Without any prior knowledge about functional dependency from each lower level to the level the semantics of the attributes, we can verify the first two above it. In many real situations, however, the dimen- conditions by using information on dependencies among sional normal form may be too strict. The authors also attributes in a dimension. study a problem related to attributes that are not We have organised the rest of the paper as follows. applicable for all instances, for example, all products do Next a review of the related work is given in Section 2. not have a colour. After that we present the basis of hierarchy types in Khardon et al. [KMR99] have studied Boolean Section 3. In Section 4 the new dependencies are studied. dependencies in a context of example based reasoning. Finally, the conclusions are presented in Section 5. Boolean dependencies are a generalisation to the functional dependencies [Cod72]. While the functional 2 Related Work dependency has a form X→Y, where X and Y are sets of attributes, the Boolean dependencies allow us to use Pedersen and Jensen [PeJe99] discuss requirements for Boolean operators between attributes. Boolean dependen- multidimensional data models. Five of their nine cies can be extended from simple equality comparison of requirements are for dimension hierarchies: values to compare whether the values belong to the same 1) explicit hierarchies in dimensions, equivalence class. This makes it possible to define 2) support for multiple hierarchies in a dimension, constraints between value groups, e.g. age → age group. 3) correct aggregation of data, Wijsen et al. [WNC99] study generalising temporal 4) support for non-strict hierarchies, and dependencies for non-temporal dimensions. They 5) data with different levels of granularity should be generalise temporal functional dependencies into rollup allowed. dependencies and illustrate their usability in conceptual Explicit hierarchies in dimensions means that hierarchies modelling and data mining. The rollup dependencies can should be explicitly shown in the schema to help user's be used in modelling generalisation hierarchies if understanding of hierarchical structures in dimensions. hierarchical levels are ordered by the finer than relation. By multiple hierarchies in a dimension, the authors mean The rollup dependencies ensure comparing equality of that a dimension hierarchy is allowed to be an acyclic values in a specific level. In this way, they are a graph, that is, there can be several paths from the top generalisation of the functional dependencies. As an level to the most detailed level. For example, the time example, consider a relation on hotel room reservations T. Niemi, J. Nummenmaa, P. Thanisch 7-2 having attributes (ROOM_NUMBER, DATE, PRICE). relationship between just two columns, for example An example of a rollup dependency in this relation is Employee ID and Manager ID. In addition to this, a data ROOM_NUMBER DATEWEEK → PRICE, meaning that hierarchy can have additional attributes to indicate the the price of the room cannot change during a week. An level of the members, for example Job Grade for axiomatisation for rollup dependencies is also presented employees. This level does not necessarily correspond and negation with the rollup dependencies is studied. with the actual level of the member in the parent-child hierarchy. 3 OLAP Hierarchies Hierarchies can be classified according to their generality. Figure 1 presents a hierarchy of different data In this section we first formalise our model for the OLAP hierarchies. The most general is the acyclic digraph and cube and then study different kinds of data hierarchies the strictest is the balanced and non-ragged tree. and their properties. Our formalisation is based on relational database theory and we assume that the reader acyclic digraph knows the basics of it. 3.1 OLAP Model transitive anti-closed To be able to operate with dependencies, we must digraph formalise our notion of the OLAP cube. Generally, we assume that the cube consists of dimensions (measures can also be handled as dimensions). We use a relational tree approach [Cod70] that is close to the often-used star schema. However, we do not locate dimension hierar- chies and the fact table in different relations but assume balanced, non-ragged, that the fact table also contains the dimension hierarchies. ragged unbalanced It is important to notice that this is only an abstract model used as a base of the logical design and totally independ- ent of the implementation. Definition 1 The OLAP cube is a relation over the balanced, relation schema H = D1 ∪ D2 ∪ ...Dn, where each Dk is a non-ragged set of attributes and it is called a dimension schema. If D is a dimension schema, then a relation d over D is Figure 1: Hierarchy of dimension hierarchies called a dimension. The hierarchy representing the dimension is called the dimension hierarchy. The nodes The transitive anti-closed graph is the complement of the in the dimension schema are called attribute hierarchy transitive closure graph. The transitive closure can be levels. The tuples in the dimension relation are members constructed as follows. If there is a path from node A to of the attribute hierarchy levels. Instead, in data node B of length 2 or more, then add an edge from A to hierarchies the levels and members of the levels are B. Instead, the anti-closure can be produced as follows. If implied by the actual data. there is a path from node A to node B of length 2 or In what follows we mostly use the following example more, then remove an edge from A to B. of a dimension. These balance and raggedness properties require that the level code is attached to each node. A graph is ragged Example 2 We have a dimension of sales employees. The if each node does not have a level code corresponding to schema has three attributes: its actual level in the hierarchy, whereas a graph is E: Employee ID balanced if all leaf nodes have the same level code. M: Manager ID As Figure 1 indicates, we have six different kinds of G: Job grade hierarchies but two of them have similar properties Each tuple, t, in a relation over {E, M, G} contains details related to OLAP hierarchies: about an employee, namely t[E], the employee’s (1) Acyclic. This is the most general class. It allows the Employee ID, t[G], his or her Job grade, and t[M], the day-month-quarter-year; day-year structure, where Employee ID of his or her manager. the day-year could be called a "direct shortcut". A direct shortcut means redundant aggregation paths. 3.2 Data Hierarchies (2) Transitive anti-closed digraph. This is an acyclic graph with no direct shortcuts. No redundant aggre- In the attribute hierarchy, the levels correspond to gation paths are possible. attributes and the members of the level have the names of (3) Tree. Unique aggregation paths are guaranteed. the values that occur in the corresponding attribute's column. In a data hierarchy we have a parent-child T. Niemi, J. Nummenmaa, P. Thanisch 7-3 (4) Unbalanced but non-ragged or balanced but if there exists a path from t[X] to t’[Y], then a path from ragged. The available levels of aggregation are not t’[Y] to t[X] does not exist. equal. For simplicity we assume that the left part of the (5) Balanced and non-ragged. Equal levels of hierarchy cannot contain null values. We use the example aggregation are available. data presented in Table 1 to illustrate different classes of In the next section, we study dependencies that guarantee hierarchies. particular hierarchies and show examples about the hierarchies. Table 1: An example relation on management hierarchy 4 Dependencies for Hierarchies empID mgrID grade In this section we study relational dependencies in the 1 NULL A context of OLAP hierarchies. We mainly concentrate on 2 1 B dependencies for data hierarchies. We may consider two 3 1 B types of dependencies: 4 1 C (1) dependencies that must always hold because they 5 2 C say something about the semantics of the data, and 5 1 C (2) dependencies that might hold in the database at the point that we want to construct the parent/child The graph representing the relation of Table 1 is given in aggregation hierarchy, but which do not always Figure 2. We use the relation of Table 1 as a running hold. example. Typically, stricter hierarchies imply deletion of For instance, it may be that at some moment each some rows from the relation. employee has a unique manager. This might be because there is a dependency, which always holds, or, alterna- tively, it just happens to be so at that moment but 1 generally it is possible that an employee has several managers. 3 2 We note that, in practice, a number of other constraints on the data are needed. For example, it may be 5 4 that the data hierarchy should have a unique root element for aggregation purposes. We ignore these considerations in the present paper Figure 2: Directed acyclic graph representing the relation in Table 1 4.1 Data Hierarchy 4.2 Anti-Closure Dependency For data hierarchies, we need the inclusion dependencies, but only in a simplified form where only a single relation Unbalanced hierarchies often exist in applications and the is considered. vendors of OLAP products have started support them. For example, in the latest release of Microsoft Analysis Definition 2 Let r be a relation over R and X, Y ⊆ R. Services [Mic00], it is possible to define an "unbalanced" There is an inclusion dependency [X] ⊆ [Y] in r, if X(r) tree. This allows us to have an EMPLOYEES relation ⊆ Y(r). which includes the columns EmpID and MgrID, where For the data hierarchy we first define a path in a relation. (as usual) the MgrID entry in a row identifies the employee's manager. Suppose that all employees are Definition 3 Let r be a relation over R and X, Y ⊆ R. A Salesmen or their managers. We want to aggregate sales sequence of tuples t0, t1, ..., tn∈r forms a path from t0[X] by manager, by aggregating the sales of all salesmen to tn[Y] with length n, if t0[Y] = t1[X], t1[Y] = t2[X], ..., below a manager. This is tricky because the leaf nodes tn-1[Y] = tn[X]. can have different depths. For example, in Table 1 (see also Figure 2) leaf nodes ‘3’ and ‘4’ have different levels We write for a path from the tuple t0 to the in the tree. tuple tn. A relation may contain null values but a null In a different context, Gottlob et al. defined the value cannot exist in a valid path. If the designer is closure dependency [Got89]: considering the use of some attributes as a data hierarchy, the conditions described in the following definition must Definition 5 Let r be a relation over R and X, Y ⊆ R. The be met. relation r obeys the closure dependency X@Y if for each pair of tuples t and t’ in r it holds: if t[Y] = t’[X], then Definition 4 A relation h over the relation schema H(XY) there exists a tuple t’’ such that t’’[X] = t[X] and forms a data hierarchy if [Y] ⊆ [X] and for each t, t’ ∈ h t’’[Y]=t’[Y]. T. Niemi, J. Nummenmaa, P. Thanisch 7-4 With regard to OLAP hierarchies, the complement of the year closure dependency, which we refer to as the anti-closure dependency, is of relevance. We need the anti-closure dependency in order to quarter week prevent a hierarchy in which a rollup path “by-passes” intermediate nodes. The transitive anti-closure of the hierarchy in Figure 2 is given in Figure 3. There are two month rollup paths from ‘4’ to ‘1’ in Figure 2. We want to eliminate the direct path, so that rollup never by-passes ‘2’. day 1 Figure 5: A dimension schema hierarchy that obeys ACD 3 2 Multiple paths between nodes are allowed but if there is a longer path between two nodes, the direct path is not 5 4 allowed. Or, the same can be said in the opposite way: if there is a direct path between two nodes, the longer path Figure 3: A transitive anti-closed graph cannot exist. The hierarchy in Figure 4 does not obey the ACD while the hierarchy in Figure 5 obeys it. For the hierarchy of Example 2, we want to ensure Theorem 1 Let r be a relation over R and X, Y, Z, W ⊆ that the transitive anti-closure of the graph, i.e. if t[M] = R. If X #> Y, then XZ #> YW. t’[E], there should not be some longer “path” from t’ to t using E and M values to connect tuples. This can be done Proof. Let t, t’∈r. Since X #> Y, there is not a direct and by the anti-closure dependency. It guarantees that the indirect path from t[X] to t’[Y]. Adding more attributes corresponding hierarchy graph is transitively anti-closed, cannot increase the number of matching values, thus there but this does not disallow the possibility that a node can cannot exist more paths from t[XZ] to t’[YW] than from have multiple parents, i.e. an employee reports to more t[X] to t’[Y]. than one manager. The anti-closure dependency is not transitive. A relation Definition 6 Let r be a relation over R and X, Y ⊆ R. The presented in Table 2 obeys dependencies A#>B and relation r obeys the anti-closure dependency (ACD) B#>C but not A#>C. X#>Y, if for any path , n ≥ 2, there does not exist a direct path in r such that t1 [X] = v1[X] Table 2: Anti-closure is not transitive and tn[Y] = vm[Y]. A B C 1 4 2 year 2 5 3 1 6 3 quarter 4.3 Functional Dependencies Although more general rollup hierarchies are useful, often month stricter hierarchies, such as trees, are needed. The well-known functional dependencies [Cod72] can be used to force a hierarchy to a tree. In Figure 3 there is also a day functional dependency between nodes. The functional dependencies can be used in both attribute and data Figure 4: A dimension schema hierarchy violating hierarchies. ACD In data hierarchies, we may want to restrict our attention to Parent/Child dimensions in which each node in the dimension has just one parent. In Example 2, we can enforce this rule with the functional dependency E M. The functional dependency E  * PXVW DOVR KROG LQ our example. T. Niemi, J. Nummenmaa, P. Thanisch 7-5 Definition 7 A relation r over R obeys a functional this a subtler dependency than a conventional FD is dependency X→Y, if for each pair of tuples t, t’ ∈ r, needed as Example 3 shows. whenever t[X] = t’[X], then t[Y] = t’[Y]. Example 3 We have the following rule: OLAP hierarchies often contain non-applicable If two rows agree on City and Neighbourhood then NULLs. With regard to functional dependencies, suppose If State is NULL in both rows then two of the columns in the relation are ProductID (which They must disagree on Country will be used as a member in the cube) and Colour (which else will be incorporated as a member property). ProductId → Either One has a NULL in State Colour holds. We assume that two products are given Or they are both non-NULL in State and the State different ProductIds if their colours differ, even if they entries are different. are identical in every other respect. However, it could be that some products do not have a colour, so in a relation, The Boolean dependency corresponding to the example the entry in the Colour column for such products will be would be: City AND Neighbourhood → State OR NULL. The functional dependency means that two rows Country. Next we give a more formal definition for that agree on ProductId cannot disagree on Colour. In Boolean dependencies. particular, if one such row has a non-NULL entry in the Definition 8 Let R be a relation schema. The following Colour column, the other row must have an identical expressions are valid Boolean dependencies in R: entry. 1. A, if A∈R, The functional dependency ProductId → Colour must 1. !A, if A is a valid expression, hold before we can consider including the Colour column 2. (A), if A is a valid expression, and as a member property of ProductId. If we want to add Colour as a level above the ProductId level, we would 3. A o B, if A and B are valid expressions and o∈{AND, need to include a special member in the Colour level for OR, XOR, →, ↔}. products that have NULL entries in the Colour column. Definition 9 Let r be a relation over R. The relation r Lemma 1 Let r be a relation over R and X, Y ⊆ R. If the obeys a Boolean dependency X, if r has only one tuple or for each pair of distinct tuples t1 and t2 in r obeys the functional dependency X→Y holds in r, then for all two dependency X’ that is constructed from X by replacing tuples t, t’∈ r there is at most one path between t[X] and each term A in X to t1[A] = t2[A]. t’[Y]. In Definition 9, we assume for the negation that !(t1[A] = Proof. (By contradiction) Assume there exist two paths t2[A]) ⇔ t1[A] ≠ t2[A]. The tuples t1 and t2 have to be from t[X] to t’[Y]. Then there must exist an X value distinct because of problems with negation: the related to two different Y values, that is, the functional dependency does not hold. dependency A → !B would be never valid if t1 = t2. Although a Boolean dependency can be an arbitrary Theorem 2 Let r be a relation over R and X, Y ⊆ R. If Boolean expression, we restrict our study to ‘implication’ X→Y, then X #> Y. dependencies. Example 4 illustrates such a Boolean dependency. Proof. Let t, t’∈r. According to Lemma 1 there can be at most one path between t[X] and t’[Y]. Thus, X#>Y holds. Example 4 Table 3 obeys a Boolean dependency A→B OR C, but it does not obey A→B OR A→C. 4.4 Boolean Dependencies Table 3: A→B OR C The following example illustrates the use of the Boolean dependencies [KRM99]. Consider the Geography dimension: Country – State – City – Neighbourhood. A B C With ragged hierarchies, we allow NULLs in the column a1 b1 c1 corresponding to potentially-missing parents (e.g., the a1 b1 c2 State column for the row which has "Washington DC" in a1 b2 c2 its City column). But we could insist that if we allow such raggedness, there must be a "partial" functional 4.5 Enforcing Non-Raggedness dependency in force which means that we must restrict In our example non-raggedness means that all employees the numbers of cities called "Washington DC" to one per reporting to the same manager have the same job grade. country. There are several different cities in the USA We use the terms of graph theory. If t[X, Y] ∈ h, then called "Springfield", but that is acceptable, because each t[X] is called an underling of t[Y]. Springfield is located in a different State. If there were two "Washington DC"s, it would not be a problem, so Definition 10 Let h be a data hierarchy over a relation long as at least one is in a State. But if neither Washing- schema H and X, Y ⊆ H. The set of direct underlings of ton DC is in a State, then we have ambiguity. To forbid t[Y] is Underlings(t) = {t’ ∈ h | t’[Y] = t[X]}. T. Niemi, J. Nummenmaa, P. Thanisch 7-6 Definition 11 Let r be a relation over R and X, Y, W ⊆ an unbalanced tree can be seen in Figure 7. We define the R. The relation r obeys a non-raggedness dependency set of the leaf nodes of a hierarchy as follows: (NRD) XY →NRD : LI (XY)r is a data hierarchy and for Definition 12 Let h be a data hierarchy H. The set of the all tuples t ∈ r, for all tuples t’, t” ∈ Underlings(t), t’[W] leaf nodes of h is Leaves(h) = {t ∈ h | Underlings(t) is = t”[W]. empty}. In the definition, the role of W is to be a level identifier. Definition 13 Let r be a relation over R and X, Y, W ⊆ There can be two types of hierarchies according to the R. The relation r obeys a balance dependency (BD) values of the level identifier: 1) strict and 2) monotonic. XY→BD:LI (XY)r is a data hierarchy and for all tuples t, Strict means that the child must have a different and strictly lower level than the parent, whereas monotonic t’∈ Leaves(h), t[W] = t’[W]. means that the child must not have a higher level value than its parent has. Moreover, we say that a level 1 identifier is consistent if there is a one-to-one mapping between the values of the level identifier and the actual levels of the nodes. Consistent level identifiers and 3 2 non-raggedness dependency guarantee that no levels can be by-passed in rollup paths. Figure 8: A non-ragged and balanced tree 1 Non-raggedness with consistent level identifiers and balance guarantee that aggregations are complete on every level in the hierarchy. In the balanced hierarchy all 2 leaf nodes are on the same level, and in the non-ragged hierarchy with consistent level identifiers there must exist 5 4 a node on every hierarchy level in each rollup path. An example about non-ragged and balanced hierarchy can be seen in Figure 8. Figure 6: Ragged but balanced tree Figure 6 shows that the completeness condition of 5 Conclusions aggregations can easily be violated in ragged hierarchies. We have studied different aggregation hierarchies and For example, if we summarize the members of the second dependencies that hold in a particular hierarchy. We have level, we lost the value of node ‘5’, because it does not also classified the hierarchies: the most general is the rollup to any member on the second level. However, the acyclic digraph and the strictest is the balanced and non-raggedness with consistent level identifiers is not a non-ragged tree. The structure of the hierarchy has sufficient condition for complete aggregations. For significant effect on correctness of aggregations and the example, the hierarchy in Figure 7 is non-ragged but storage space needed to store pre-calculated aggregation unbalanced, i.e. the leaf nodes are not on the same level. values. In general it can be said that a stricter hierarchy Now, if we summarize the members of the lowest level, makes it easier to get correct aggregations. When a data we will lose the member ‘3’. hierarchy is balanced and non-ragged with consistent level identifiers, aggregations are complete on every level 1 in the hierarchy. On the other hand, a hierarchy, which is too strict, reduces the number of possible rollup paths and thus decreases the expressive power of the OLAP cube. 3 2 6 References 4 [Cod70] Codd, E. F.: A relational model for large shared data banks, Communications of the ACM, Figure 7: Non-ragged but unbalanced tree 13(6), 1970. [Cod72] Codd, E. F.: Further normalization of the data base relational model, Data Base Systems, Courant 4.6 Enforcing Balance Computer Science Symposia Series 6, R. Rustin (ed.), In a balanced hierarchy, all employees who are at the Prentice-Hall, New Jersey, 1972. leaves of the organisation hierarchy have the same job [Got89] Gottlob, G., Schrefl, M., and Stumptner, M.: grade. A hierarchy in Figure 6 is balanced, since all leaf On the interaction between transitive closure and nodes, ‘5’ and ‘4’ are on the same level. An example of functional dependencies, MFDBS 89, the 2nd Sym- posium on Mathematical Fundamentals of Database T. Niemi, J. Nummenmaa, P. Thanisch 7-7 Systems, J. Demetrovics et al. (eds), LNCS, Vol. 364, Springer, 1989. [HMV99] Hurtado, A., Mendelson, A., and Vaisman A.: Maintaining data cubes under dimension updates, Proceedings of the 15th International Conference on Data Engineering, IEEE Computer Society Press, 1999. [JLS99] Jagadish, H., Lakshmanan, L., and Srivastava D.: What can hierarchies do for data warehouses?, Proceedings of the 25th International Conference on Very Large Data Bases, M. Atkinson et al. (eds), Morgan Kaufmann, 1999. [KMR99] Khardon, R., Mannila, H., Roth, D.: Reasoning with examples: propositional formulae and database dependencies, Acta Informatica, 36(4): 267-286, 1999. [LAW98] Lehner, W., Albrecht, J., and Wedekind, H.: Normal forms for multidimensional databases, Pro- ceedings of the 10th International Conference on Scientific and Statistical Data Management (SSDBM’98), 1998. [LeSh97] Lenz, H.-J. and Shoshani, A.: Summarizability in OLAP and statistical data bases, Ninth Interna- tional Conference On Scientific And Statistical Database Management (SSDBM), 1997. [Mic00] Jacobson, R.: Microsoft SQL Server 2000 Analysis Services Step By Step, Microsoft Press, 2000. [PeJe99] Pedersen, T. and Jensen, C.: Multidimensional data modeling for complex data, Proceedings of the International Conference on Data Engineering (ICDE’99), 1999. [WNC99] Wijsen, J., Ng, R., and Calders, T.: Discover- ing roll-up dependencies, The Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 1999, ACM, 1999. [ZuSi99] Zurek, T. and Sinnwell, M.: Data warehousing has more colours than just black & white, VLDB’99, Proceedings of 25th International Conference on Very Large Data Bases, M. Atkinson et al. (eds.), Morgan Kaufmann, 1999. T. Niemi, J. Nummenmaa, P. Thanisch 7-8