=Paper= {{Paper |id=Vol-1/paper-3 |storemode=property |title=Formalization of OODB models |pdfUrl=https://ceur-ws.org/Vol-1/vossen-long.pdf |volume=Vol-1 |authors=G. Vossen }} ==Formalization of OODB models== https://ceur-ws.org/Vol-1/vossen-long.pdf
                            Formalization of OODB Models
                                            Gottfried Vossen
                      Institut fur Wirtschaftsinformatik, Universitat Munster
                                  Grevenerstrae 91, 48159 Munster


1 Introduction                                                             Class P
Object-oriented data models represent a current end-                                PPPmessg
point in the evolution of data models [23]. Their for-
                                                                  type                    PPPP
malization has been attempted in a variety of papers,           Type
                                                                                                 Pq
                                                                                   Methods  impl Messages
including [5; 6; 19]. This short paper indicates what
                                                                                         6

                                                             '                                   $
we consider the common intersection of these (and
other) approaches; we list the relevant features and              dom
components, and give an idea of how to formalize
the notion of an object-oriented database schema.                ?            inst
   An object-oriented data model has to capture a               Values
variety of requirements [8; 27], which di er consid-             6            ?


                                                             &                                   %
erably from those that traditional data models have               val      Object
to meet. However, many system developers seem
not to care about formal models as a solid foun-                               @
dation of their system, but simply design a \data                             ? @@R ?
de nition language" in which the relevant features             State         oid      Behavior
can be coded. In our opinion, a formal model for
object-oriented databases basically has to capture
the same intuitions as models for other types of da-
tabases, which are the following:                             Figure 1: Core Aspects of an Object Model.
  1. It has to provide an adequate linguistic abstrac-
     tion for certain database applications.              Sbehav ); in what follows, we rst consider each com-
  2. It should provide a precise semantics for a data     ponent in isolation and then indicate how the two in-
     de nition language.                                  teract. We mention, however, that while it is gener-
  3. It has to be composed of both a speci cation         ally agreed that an object-oriented data model has to
     and an operational part.                             capture both structure and behavior, the former can
                                                          be obtained by using the experience from the rela-
  4. It represents a computational paradigm as a ba-      tional, nested relational and complex-object models,
     sis for formal investigations.                       but the latter represents a completely new challenge
In this short note, we do not present a comprehensive     to database researchers. Consequently, a consensus
survey of formal models for object-oriented databa-       seems achieved for structure, but not for behavior.
ses which have been proposed in the literature, but          The core aspects of formal models for object-
instead try to point out the fundamentals of how          oriented databases are summarized in Figure 1, in
such models are obtained. The result can be con-          which labels of arrows represent function names. In
sidered as a framework in which the essentials of         brief, the only structuring mechanism is the class
the object-oriented paradigm can be expressed con-        which describes both structure and behavior for its
cisely and further studied. Indeed, we give hints to      instances, the objects. Structure is captured as a
various such investigations that have recently been       type for a class (in our notation, a function \type"
undertaken.                                               associates a type with each class; the other function
                                                          names shown above are to be interpreted similarly,
2 Core Aspects of Formal OO                               see below). A type is nothing but a description of
                                                          a domain, i.e., a set of values, and may or may not
  Models                                                  be named (in the former case, type names distinct
In this section, we describe what we perceive as the      from class names and attribute names must be pro-
core aspects of various proposals for object mod-         vided). Values comprise the state of an object and
els, and we do so by distinguishing structural from       can be as complex as the type system allows (i.e.,
behavioral aspects. Thus, we generally consider           depending on the availability of base types and con-
schemas, the central notion of any conceptual data-       structors like tuple, set, bag, list, etc.). Behavior is
base description, to be pairs of the form S = (Sstruc ,   manifested in a set of messages associated with each
class (its external interface), which are internally im-      (v) C  T, where C is a nite set of class names.
plemented using methods that are executable on ob-           This states nothing but the fact that class names
jects. Hence, objects have a state and a behavior; in        are allowed as types (below we will complement this
addition, they are uniquely identi ed. Messages are          with the requirement that classes themselves have
speci ed by providing a signature, and by associating        types).
several signatures with the same message name, the               The intuition behind this new condition is that ob-
latter gets overloaded. Not shown in Figure 1 is the         jects from the underlying application all are distin-
possibility to organize classes in an inheritance hier-      guished by their identity, get collected into classes,
archy; also not shown is the fact that class attributes      and can reference other objects (share subobjects).
are allowed to reference other classes, thereby form-        To provide for this at the level of domains, let us
ing an aggregation lattice.                                    rst assume the availability of a nite set OID of
    We next look at structural as well as behavioral         object identi ers which includes the special identi-
aspects in more detail. Regarding the modeling of              er nil (to capture \empty" references); next, ob-
structure, more precisely highly-structured informa-         ject domains, i.e., sets of possible values for objects
tion, complex data types are all that is basically           are complex values as above with the following ad-
needed, since they serve as descriptions for domains         ditional condition:
of complex values. One way to introduce such types,
i.e., to de ne a type system T, is the following:             (e) dom(c) = OID for each c 2 C.
  (i) integer, string, float, boolean  T;                   Thus, classes are assumed to be instantiated by ob-
 (ii) if Ai are distinct attributes and ti 2 T, 1  i       jects (class-name types take object identi ers as val-
      n; then                                                ues, in the same way as, say, the integer type takes
      [A1 : t1 ; : : :; An : tn ] 2 T (\tuple type");        integer numbers as values). Clearly, this alone is not
                                                             enough, since class instances commonlyhave distinct
(iii) if t 2 T; then ftg 2 T (\set type");                   sets of object identi ers associated with them. We
(iv) if t 2 T; then < t >2 T (\list type").                  will show below how that (and, for example, the fact
In other words, a type system is made up of base             that sometimes inclusion dependencies need to hold
types, from which complex types may be derived us-           between sets of class instances) is captured at the
ing (eventually attributes and) constructors. Note           instance level.
that this requires nothing additional but the avail-             The object-oriented paradigm has another dimen-
ability of attribute names. Clearly, other base              sion for organizing information besides aggregation,
types as well as additional or alternative construc-         which is inheritance, or the possibility to de ne a
tors could straightforwardly be included. Notice also        class as a specialization of one or more other classes.
that here types are not named; for practical reasons,        To this end, a subtyping relation is needed through
the use of type names may be desirable (e.g., in order       which it can be expressed that a subclass inherits
to be able to reuse type de nitions in various places        the structure of a superclass. Such a relation can be
throughout a schema), and if it is, it can easily be         de ned in various ways; for example, it can be de-
added to the above in the way indicated earlier.               ned semantically by requiring that the sets of values
    The notion of a domain as a \reservoir" of possible      or instances of types, where one is a subtype of the
values can be de ned as follows; it just has to obey         other, are in a subset relationship. We prefer a sim-
constructor applications:                                    pler, syntactical approach, which has, for example,
                                                             the advantage that checking subtype relationships
 (a) dom(integer) is the set of all integers; dom            can be automated:
      is analogously de ned for string, float,                   Let T be a set of object types. A subtyping rela-
      boolean;                                               tion   T  T is de ned as follows:
 (b) dom([A1 : t1; : : :; An : tn]) :=                         (i) t  t for each t 2 T,
      f[A1 : v1 ; : : :; An : vn ] j (8 i; 1  i  n) vi 2
      dom(ti)g;                                               (ii) [A1 : t1; : : :; An : tn]  [A1 : t1 ; : : :; Am : tm ]
                                                                                                     0   0       0     0



 (c) dom(ftg) :=                                                   if
      ffv1 ; : : :; vng j (8 i; 1  i  n) vi 2 dom(t)g;            (a) (8 Aj ; 1  j  m)(9 Ai ; 1  i  n) Ai =
                                                                                 0



 (d) dom(< t >) :=                                                      Aj ^ ti  tj ,
                                                                         0               0


      f< v1 ; : : :; vn > j (8 i; 1  i  n) vi 2                   (b) n  m,
      dom(t)g.                                               (iii) ftg  ft g if t  t ,
                                                                             0               0

In a structurally object-oriented context, the rst           (iv) < t >  < t > if t  t .
thing that needs to be introduced beyond complex
                                                                                     0           0


types and domains as de ned above is the possibil-           With these preparations, we arrive at the follow-
ity to share pieces of information between distinct          ing de nition for objectbase schemas that can de-
types, or to aggregate objects from simpler ones. At         scribe structure of arbitrary complexity: A struc-
the level of type declarations, an easy way to model         tural schema is a named quadruple of the form
this is the introduction of another reservoir of names,      Sstruc = (C; T; type; isa) where
this time called class names, which are additionally           (i) C is a ( nite) set of class names,
allowed as types. In other words, object types are
complex types as above with the following new con-            (ii) T is a ( nite) set of types which uses as class
dition:                                                            names only elements from C,
(iii) type : C ! T is a total function associating a               In combining structural and behavioral schemas,
      type with each class name,                               we nally obtain an objectbase schema of the form
(iv) isa  C  C is a partial order on C which is               S = (C; (T; type; isa;); (M; P; isa; messg; impl)).
      consistent w.r.t. subtyping, i.e.,                       S is called consistent if the following conditions are
      c isa c ) type(c)  type(c ) for all c; c 2 C.
            0                       0             0
                                                               satis ed:
    This de nition resembles what can be found in                (i) c isa c implies messg(c )  messg(c) for all
a variety of models proposed in the literature, in-
                                                                                 0                           0



cluding [17; 19; 20; 25] and others. Notice that it                   c; c 2 C,
                                                                             0



still leaves several aspects open, like single vs. mul-         (ii) if c isa c and s; s 2 sign(m) for m 2 M such
                                                                                     0           0


tiple inheritance; if the latter is desired, a condition              that s : ct1 : : :tn ! t, s : c t1 : : :tn !
                                                                                                                     0       0   0               0


needs to be added stating how to con icts should                     0
                                                                      t , then ti  ti for each i, 1  i  n, and t  t ,
                                                                                             0                                                       0


be resolved. Also, implementations typically add               (iii) for each m 2 messg(c) there exists a c 2 C)                     0

a number of additional features, like attributes as                   s.t. c isa c and impl(m; c ) is de ned.
                                                                                         0                       0

functions [22; 29], a distinction of class attributes          Condition (i) just says that subclasses inherit the be-
from instance attributes (the latter are shared by             havior of their superclasses. Condition (ii) says that
all objects associated with a class, while the for-            message-name overloading is done with compatible
mer represent, for example, aggregate information              signatures, and is called the covariance condition in
like an average salary only relevant to the class as a         [20; 9]. The covariance condition is a signi cant dif-
whole) [7], a unique root of the class hierarchy from          ference from what is used at a corresponding point in
which every class inherits [20], a distinction between         programming languages, and which is known as the
private and public attributes [12], a di erent set of          contravariance condition; for a detailed explanation,
constructors (like one with an additional array con-           see [9]. Finally, Condition (iii) states that for each
structor to describe matrices), an explicit inclusion          message associated with a class, its implementation
of distinct types of relationships between classes and         must at least be available in some superclass.
their objects (in particular various forms of composi-             It is interesting to note that various natural con-
tion, see [18]), integrity constraints which represent         ditions can be imposed on the programs that are
semantic information on the set of valid databases             used as implementations of messages. We now sketch
instances (a proposal in that direction appears in [3;         one of them, which is based on the view that pro-
4], where object constraints, class constraints, and           grams are functions on domains [20]. More formally,
database constraints are distinguished). For another           if m 2 M and s : c  t1  : : :  tn ! t 2 sign(m),
example, the ODMG-93 proposal for a standardized               then impl(m; c), if de ned, is a program p 2 P of
model [10] contains explicit keys, (binary) relation-          the form
ships, and inverse attributes. None of these features
appear in our model, the reason being that these are            p : dom(c)  dom(t1)  : : :  dom(tn) ! dom(t)
not speci c to object-orientation.                             The condition in question informally states that if
    The second important aspect of an object-oriented          message overloading appears in isa-related classes
database is that it is intended to capture behavior,           (so that the corresponding signatures satisfy the co-
besides structure. To this end, the relevant intu-             variance condition), then the associated programs
ition is that classes have attached to them a set of           coincide (as functions) on the subclass. More for-
messages, which are speci ed in the schema via sig-            mally, we have: If jsign(m)j > 1 for some m 2 M,
natures, and which are implemented as methods. In              then the following holds: If s; s 2 sign(m) such that
                                                                                                                 0



addition, behavior can be inherited by subclasses,             s : c  t1  : : :  tn ! t, s : c  t1  : : :  tn ! t ,
                                                                                                         0   0           0               0           0



and message names can be overloaded, i.e., re-used             c isa c , ti  ti for each i, 1  i  n, t  t , and
                                                                         0               0                                                   0



in various contexts.                                           impl(m; c) = p, impl(m; c ) = p , then p and p agree
                                                                                                     0           0                           0



    So a behavioral schema is a named ve-tuple of              on dom(c)  dom(t1)  : : :  dom(tn).
the form Sbehav = (C; M; P; messg; impl) where                     A variety of formal investigations for behavioral
  (i) C is a ( nite) set of class names as above (again        schemata in the sense de ned above can already
      needed here since references to it have to be            be found in the literature, which investigate ques-
      made),                                                   tions including termination of method executions,
 (ii) M is a ( nite) set of message names, where each          limited depth of method-call nestings (an issue re-
      m 2 M has associated with it a nonempty set              lated to precompilation of method executions), well-
      sign(m) = fs1 ; : : :sl g, l  1, of signatures; each    de nedness of method calls, i.e., consistency as well
      sh , 1  h  l, has the form sh : c  t1  : : : tp !   as reachability considerations (issues related to type
      t for c 2 C, t1 ; : : :; tp ; t 2 T                      inference and schema evolution), expressiveness of
      (each signature has the receiver of the message          method implementation languages (relative to some
      as its rst component),                                   notion of completeness), complexity of method exe-
(iii) P is a ( nite) set of methods or programs,               cutions, or potential parallelism of method evalua-
                                                               tions. To investigate such issues, our general notion
(iv) messg : C ! 2M is a mapping s.t. for each                 of schema is made precise in various ways. For ex-
      c 2 C and for each m 2 messg(c) there exists             ample, [15] xes a simple imperative language for
      a signature s 2 sign(m) satisfying s[1] = c,             implementing methods as retrieval programs, con-
 (v) impl: f(m; c) j m 2 messg (c)g ! P is a                   trasts them with update programs and shows un-
      partial function.                                        decidability results for the latter. [1; 2] as well as
[11] introduce distinct notions of a method schema            persons living in a large country are repre-
to study behavioral issues of OODBS; for example,             sented. In this context, so many combi-
[2] investigates implications of the covariance condi-        nations of meaningful properties have to be
tion using the formalism of program schemas, while            distinguished that it might become necessary
[11] looks at tractability guarantees corresponding           to introduce arti cial name constructions for
to those known for relational query languages. Also,          classes, like unmarried-nonstudent-autoOwner-
it is pretty straightforward to de ne an object alge-         renter-taxpayer [26], and each such class has
bra for a model like the one sketched in the previous         only very few instances. More generally, the
section; see, for example, the papers in [13]. That           name space available for classes might not be
carries over to issues like query optimization, imple-        sucient.
mentation of operations, and query processing. A           5. Objects and their classes might come into ex-
survey of other recent investigations that have simi-         istence in reverse order. A database user in
lar bases or origins can be found in [28].                    a  design environment like CAD creates objects
    We emphasize again that the model just sketched           in  the rst place, not type de nitions or even
can be seen as description of the core of vastly any          classes.  The usage of databases thus di ers con-
object-oriented model; however, this is valid only rel-       siderably    from traditional applications where
ative to the fact that many specialities, which have          schema    design  has to be completed prior to in-
been proposed in the literature, or which are being           stance  creation.
built into commercial systems, are neglected here.
    We conclude this section with a brief indication of We mention that one issue or the other from this
how object databases, i.e., sets of class instances or list is sometimes re ected already in existing mod-
extensions, can be de ned over a given schema: For els, but never as a basic design target. Alternative
a given objectbase schema S, an objectbase over S is approaches, which takes these issues into considera-
a triple d(S) = (O, inst, val) s.t.                      tion right from the start, appear, for example, in [21;
  (i) O  OID is a nite set of object identi ers,        24; 16]. A possible general concept for the solution
                                                         of these problems seems the exploitation of proto-
 (ii) inst: C ! 2O is a total function satisfying the type languages, which suggest to model applications
      following conditions:                              without a classi cation that partitions the world into
       (a) if c; c 2 C are not (direct or indirect) sub- entity sets. A prototype represents default behavior
               0


            classes of each other,                       for some concept, and new objects can re-use part
            then inst(c) \ inst(c ) = ;,
                              0                          of the knowledge stored in a prototype by saying
       (b) if c isa c , then inst(c)  inst(c ),
                   0                     0               how they di er from it. Upon receiving a message
(iii) val: O            !      V is a function s.t. an       object does not understand, it can forward (del-
      (8 c 2 C) (8 o 2 inst(c)) val(o) 2 dom(type(c)). havior.itIntothe
                                                         egate)        its prototype to invoke more general be-
                                                                           area of object-oriented programming
Notice that this de nition closes the problem left languages, many people believe that this approach
open earlier, namely that class domains originally has advantages over the class-based one with inher-
were simply the set OID.                                 itance, with respect to the representation of default
                                                         knowledge and incrementally and dynamically modi-
3 Open Issues                                            fying concepts. The investigation of classless models
                                                         in the context of object-oriented databases has only
We next survey several modeling issues in object- recently         been proposed in [26], and a concrete model
oriented databases which have not yet received is reported            in [14].
enough research attention:
   1. Entities can have roles that vary over time. For 4 Conclusions
      example, some person object may at one point
      be a student, at another an employee, and at a In this short paper we have tried to give a rough
      third a club member; while the person's identity personal account of recent work on formal models
      never changes, its type changes several times.     for object-oriented databases. Although there is not
   2. Entities can have multiple types at the same a single uniform such model, the foundations on
      time. For example, a person may be a stu- which such models have to be built seem understood,
      dent, an employee, and a club member simul- and even standardization             e orts have recently been
      taneously. So far the only way to represent this launched [10]. On the other hand, a number of in-
      in an object-oriented database is by multiple teresting research issues still deserve further investi-
      inheritance, but this might not be appropriate gation. In particular, formal models as they are cur-
      since it can result in a combinatorial explosion rently available seem hardly suited for the nonstan-
      of sparsely populated classes [21].                dard applications which initiated the consideration
                                                         of object-orientation in the context of databases. A
   3. Objects can be in various stages of development. reason seems to be that many researchers have too
      For example, in a design environment it is usu- much of a relational background, and try to exploit
      ally necessary to maintain incomplete designs, that as long as possible; this is more than con rmed
      i.e., objects whose types get completed in the by the ODMG-93 proposal. As was done a number
      course of time.                                    of years ago, when database people discovered what
   4. Classes may contain \too few" instances. For programming-language or knowledge-representation
      example, consider a database in which all people had been studying for years already, it seems
again necessary to take recent developments in these    [17] A. Kemper et al.: GOM: A Strongly Typed Per-
areas into account, and to adopt them for solving the        sistent Object Model with Polymorphism; Proc.
problems database applications have.                         German GI Conference on \Datenbanken fur
                                                             Buro, Technik und Wissenschaft" (BTW) 1991,
References                                                   Springer Informatik-Fachbericht 270, 198{217
[1] S. Abiteboul, P.C. Kanellakis: The Two Facets       [18] W. Kim: Introduction to Object-Oriented Data-
    of Object-Oriented Data Models; IEEE Data                bases ; MIT Press 1990
    Engineering Bulletin 14 (2) 1991, 3{7               [19] C. Lecluse et al.: O2, an Object-Oriented
[2] S. Abiteboul, P.C. Kanellakis, E. Waller:                Data Model; Proc. ACM SIGMOD Interna-
    Method Schemas; Proc. 9th ACM Symposium                  tional Conference on Management of Data 1988,
    on Principles of Database Systems 1990, 16{27            424{433
[3] P.M.G. Apers et al.: Inheritance in an Object-      [20] C. Lecluse, P. Richard: Foundations of the O2
    Oriented Data Model; Memoranda Informatica               Database System; IEEE Data Engineering Bul-
    90-77, University of Twente 1990                         letin 14 (2) 1991, 28{32
[4] H. Balsters et al.: Sets and Constraints in an      [21] J. Richardson, P. Schwarz: Aspects: Extend-
    Object-Oriented Data Model; Memoranda In-                ing Objects to Support Multiple, Independent
    formatica 90-75, University of Twente 1990               Roles; Proc. ACM SIGMOD International Con-
[5] F. Bancilhon, C. Delobel, P. Kanellakis (eds.):          ference on Management of Data 1991, 298{307
    Building an Object-Oriented Database System         [22] M.H. Scholl, H.J. Schek: A Relational Object
    | The Story of O2. Morgan-Kaufmann 1992                  Model; Proc. 3rd International Conference on
[6] C. Beeri: A Formal Approach to Object-                   Database Theory 1990, Springer LNCS 470, 89{
     Oriented Databases; Data & Knowledge Engi-              105
     neering 5, 1990, 353{382                           [23] H.J. Schek, M.H. Scholl: Evolution of Data
[7] E. Bertino et al.: An Object-Oriented Data               Models; Proc. Database Systems of the 90s,
     Model for Distributed Oce Applications;                November 1990, Springer LNCS 466, 135{153
     Proc. ACM Conference on Oce Information           [24] E. Sciore: Object Specialization; ACM Transac-
     Systems 1990, 216{226                                   tions on Information Systems 7, 1989, 103{122
[8] E. Bertino, L. Martino: Object-oriented Data-       [25] D.D. Straube, M.T. O zsu: Queries and Query
     base Management Systems: Concepts and Is-               Processing in Object-Oriented Database Sys-
     sues; IEEE Computer 24 (4) 1991, 33{47                  tems; ACM Transactions on Information Sys-
[9] E. Bertino, L. Martino: Object-Oriented Data-            tems 8, 1990, 387{430
     base Systems; Addison-Wesley 1993                  [26] J.D. Ullman: A Comparison of Deductive and
[10] R.G.G. Cattell (ed.): The Object Database               Object-Oriented Database Systems; Proc. 2nd
     Standard: ODMG-93. Morgan-Kaufmann 1994                 DOOD Conference, Springer LNCS 566, 1991,
[11] K. Denningho , V. Vianu: The Power of Meth-             263{277
     ods with Parallel Semantics; UCSD Technical        [27] G. Vossen: Datenmodelle, Datenbanksprachen
     Report No. CS91{184, University of California,          und Datenbankmanagement-Systeme; 2. Au-
     San Diego, February 1991; extended abstract in            age, Addison-Wesley 1994
     Proc. 17th Int. Conference on Very Large Data
     Bases 1991, 221{232                                [28] G. Vossen: Database Theory: An Introduction;
[12] O. Deux et al.: The Story of O2; IEEE Trans-            Technical Report, University of Munster, June
     actions on Knowledge and Data Engineering 2,            1994
     1990, 91{108                                       [29] K. Wilkinson et al.: The Iris Architecture and
[13] J.C. Freytag, D. Maier, G. Vossen: Query                Implementation; IEEE Transactions on Knowl-
     Processing for Advanced Database Systems;               edge and Data Engineering 2, 1990, 63{75
     Morgan-Kaufmann 1994
[14] M. Gro-Hardt, G. Vossen: Towards Class-
    less Object Models for Engineering Design Ap-
    plications ; Proc. 4th International Conference
     on Database and Expert Systems Applications
     (DEXA) 1993, Prag, Springer LNCS 720, 36{47
[15] R. Hull, K. Tanaka, M. Yoshikawa: Behav-
     ior Analysis of Object-Oriented Databases:
     Method Structure, Execution Trees, and Reach-
     ability; Proc. 3rd FODO Conference, Springer
     LNCS 367, 1989, 372{388
[16] T. Imielinski et al.: Incomplete Objects | A
     Data Model for Design and Planning Applica-
     tions; Proc. ACM SIGMOD International Con-
     ference on Management of Data 1991, 288{297