=Paper=
{{Paper
|id=Vol-1/paper-3
|storemode=property
|title=Formalization of OODB models
|pdfUrl=https://ceur-ws.org/Vol-1/vossen-long.pdf
|volume=Vol-1
|authors=G. Vossen
}}
==Formalization of OODB models==
Formalization of OODB Models
Gottfried Vossen
Institut fur Wirtschaftsinformatik, Universitat Munster
Grevenerstrae 91, 48159 Munster
1 Introduction Class P
Object-oriented data models represent a current end- PPPmessg
point in the evolution of data models [23]. Their for-
type PPPP
malization has been attempted in a variety of papers, Type
Pq
Methods impl Messages
including [5; 6; 19]. This short paper indicates what
6
' $
we consider the common intersection of these (and
other) approaches; we list the relevant features and dom
components, and give an idea of how to formalize
the notion of an object-oriented database schema. ? inst
An object-oriented data model has to capture a Values
variety of requirements [8; 27], which di er consid- 6 ?
& %
erably from those that traditional data models have val Object
to meet. However, many system developers seem
not to care about formal models as a solid foun- @
dation of their system, but simply design a \data ? @@R ?
de nition language" in which the relevant features State oid Behavior
can be coded. In our opinion, a formal model for
object-oriented databases basically has to capture
the same intuitions as models for other types of da-
tabases, which are the following: Figure 1: Core Aspects of an Object Model.
1. It has to provide an adequate linguistic abstrac-
tion for certain database applications. Sbehav ); in what follows, we rst consider each com-
2. It should provide a precise semantics for a data ponent in isolation and then indicate how the two in-
de nition language. teract. We mention, however, that while it is gener-
3. It has to be composed of both a speci cation ally agreed that an object-oriented data model has to
and an operational part. capture both structure and behavior, the former can
be obtained by using the experience from the rela-
4. It represents a computational paradigm as a ba- tional, nested relational and complex-object models,
sis for formal investigations. but the latter represents a completely new challenge
In this short note, we do not present a comprehensive to database researchers. Consequently, a consensus
survey of formal models for object-oriented databa- seems achieved for structure, but not for behavior.
ses which have been proposed in the literature, but The core aspects of formal models for object-
instead try to point out the fundamentals of how oriented databases are summarized in Figure 1, in
such models are obtained. The result can be con- which labels of arrows represent function names. In
sidered as a framework in which the essentials of brief, the only structuring mechanism is the class
the object-oriented paradigm can be expressed con- which describes both structure and behavior for its
cisely and further studied. Indeed, we give hints to instances, the objects. Structure is captured as a
various such investigations that have recently been type for a class (in our notation, a function \type"
undertaken. associates a type with each class; the other function
names shown above are to be interpreted similarly,
2 Core Aspects of Formal OO see below). A type is nothing but a description of
a domain, i.e., a set of values, and may or may not
Models be named (in the former case, type names distinct
In this section, we describe what we perceive as the from class names and attribute names must be pro-
core aspects of various proposals for object mod- vided). Values comprise the state of an object and
els, and we do so by distinguishing structural from can be as complex as the type system allows (i.e.,
behavioral aspects. Thus, we generally consider depending on the availability of base types and con-
schemas, the central notion of any conceptual data- structors like tuple, set, bag, list, etc.). Behavior is
base description, to be pairs of the form S = (Sstruc , manifested in a set of messages associated with each
class (its external interface), which are internally im- (v) C T, where C is a nite set of class names.
plemented using methods that are executable on ob- This states nothing but the fact that class names
jects. Hence, objects have a state and a behavior; in are allowed as types (below we will complement this
addition, they are uniquely identi ed. Messages are with the requirement that classes themselves have
speci ed by providing a signature, and by associating types).
several signatures with the same message name, the The intuition behind this new condition is that ob-
latter gets overloaded. Not shown in Figure 1 is the jects from the underlying application all are distin-
possibility to organize classes in an inheritance hier- guished by their identity, get collected into classes,
archy; also not shown is the fact that class attributes and can reference other objects (share subobjects).
are allowed to reference other classes, thereby form- To provide for this at the level of domains, let us
ing an aggregation lattice. rst assume the availability of a nite set OID of
We next look at structural as well as behavioral object identi ers which includes the special identi-
aspects in more detail. Regarding the modeling of er nil (to capture \empty" references); next, ob-
structure, more precisely highly-structured informa- ject domains, i.e., sets of possible values for objects
tion, complex data types are all that is basically are complex values as above with the following ad-
needed, since they serve as descriptions for domains ditional condition:
of complex values. One way to introduce such types,
i.e., to de ne a type system T, is the following: (e) dom(c) = OID for each c 2 C.
(i) integer, string, float, boolean T; Thus, classes are assumed to be instantiated by ob-
(ii) if Ai are distinct attributes and ti 2 T, 1 i jects (class-name types take object identi ers as val-
n; then ues, in the same way as, say, the integer type takes
[A1 : t1 ; : : :; An : tn ] 2 T (\tuple type"); integer numbers as values). Clearly, this alone is not
enough, since class instances commonlyhave distinct
(iii) if t 2 T; then ftg 2 T (\set type"); sets of object identi ers associated with them. We
(iv) if t 2 T; then < t >2 T (\list type"). will show below how that (and, for example, the fact
In other words, a type system is made up of base that sometimes inclusion dependencies need to hold
types, from which complex types may be derived us- between sets of class instances) is captured at the
ing (eventually attributes and) constructors. Note instance level.
that this requires nothing additional but the avail- The object-oriented paradigm has another dimen-
ability of attribute names. Clearly, other base sion for organizing information besides aggregation,
types as well as additional or alternative construc- which is inheritance, or the possibility to de ne a
tors could straightforwardly be included. Notice also class as a specialization of one or more other classes.
that here types are not named; for practical reasons, To this end, a subtyping relation is needed through
the use of type names may be desirable (e.g., in order which it can be expressed that a subclass inherits
to be able to reuse type de nitions in various places the structure of a superclass. Such a relation can be
throughout a schema), and if it is, it can easily be de ned in various ways; for example, it can be de-
added to the above in the way indicated earlier. ned semantically by requiring that the sets of values
The notion of a domain as a \reservoir" of possible or instances of types, where one is a subtype of the
values can be de ned as follows; it just has to obey other, are in a subset relationship. We prefer a sim-
constructor applications: pler, syntactical approach, which has, for example,
the advantage that checking subtype relationships
(a) dom(integer) is the set of all integers; dom can be automated:
is analogously de ned for string, float, Let T be a set of object types. A subtyping rela-
boolean; tion T T is de ned as follows:
(b) dom([A1 : t1; : : :; An : tn]) := (i) t t for each t 2 T,
f[A1 : v1 ; : : :; An : vn ] j (8 i; 1 i n) vi 2
dom(ti)g; (ii) [A1 : t1; : : :; An : tn] [A1 : t1 ; : : :; Am : tm ]
0 0 0 0
(c) dom(ftg) := if
ffv1 ; : : :; vng j (8 i; 1 i n) vi 2 dom(t)g; (a) (8 Aj ; 1 j m)(9 Ai ; 1 i n) Ai =
0
(d) dom(< t >) := Aj ^ ti tj ,
0 0
f< v1 ; : : :; vn > j (8 i; 1 i n) vi 2 (b) n m,
dom(t)g. (iii) ftg ft g if t t ,
0 0
In a structurally object-oriented context, the rst (iv) < t > < t > if t t .
thing that needs to be introduced beyond complex
0 0
types and domains as de ned above is the possibil- With these preparations, we arrive at the follow-
ity to share pieces of information between distinct ing de nition for objectbase schemas that can de-
types, or to aggregate objects from simpler ones. At scribe structure of arbitrary complexity: A struc-
the level of type declarations, an easy way to model tural schema is a named quadruple of the form
this is the introduction of another reservoir of names, Sstruc = (C; T; type; isa) where
this time called class names, which are additionally (i) C is a ( nite) set of class names,
allowed as types. In other words, object types are
complex types as above with the following new con- (ii) T is a ( nite) set of types which uses as class
dition: names only elements from C,
(iii) type : C ! T is a total function associating a In combining structural and behavioral schemas,
type with each class name, we nally obtain an objectbase schema of the form
(iv) isa C C is a partial order on C which is S = (C; (T; type; isa;); (M; P; isa; messg; impl)).
consistent w.r.t. subtyping, i.e., S is called consistent if the following conditions are
c isa c ) type(c) type(c ) for all c; c 2 C.
0 0 0
satis ed:
This de nition resembles what can be found in (i) c isa c implies messg(c ) messg(c) for all
a variety of models proposed in the literature, in-
0 0
cluding [17; 19; 20; 25] and others. Notice that it c; c 2 C,
0
still leaves several aspects open, like single vs. mul- (ii) if c isa c and s; s 2 sign(m) for m 2 M such
0 0
tiple inheritance; if the latter is desired, a condition that s : ct1 : : :tn ! t, s : c t1 : : :tn !
0 0 0 0
needs to be added stating how to con icts should 0
t , then ti ti for each i, 1 i n, and t t ,
0 0
be resolved. Also, implementations typically add (iii) for each m 2 messg(c) there exists a c 2 C) 0
a number of additional features, like attributes as s.t. c isa c and impl(m; c ) is de ned.
0 0
functions [22; 29], a distinction of class attributes Condition (i) just says that subclasses inherit the be-
from instance attributes (the latter are shared by havior of their superclasses. Condition (ii) says that
all objects associated with a class, while the for- message-name overloading is done with compatible
mer represent, for example, aggregate information signatures, and is called the covariance condition in
like an average salary only relevant to the class as a [20; 9]. The covariance condition is a signi cant dif-
whole) [7], a unique root of the class hierarchy from ference from what is used at a corresponding point in
which every class inherits [20], a distinction between programming languages, and which is known as the
private and public attributes [12], a di erent set of contravariance condition; for a detailed explanation,
constructors (like one with an additional array con- see [9]. Finally, Condition (iii) states that for each
structor to describe matrices), an explicit inclusion message associated with a class, its implementation
of distinct types of relationships between classes and must at least be available in some superclass.
their objects (in particular various forms of composi- It is interesting to note that various natural con-
tion, see [18]), integrity constraints which represent ditions can be imposed on the programs that are
semantic information on the set of valid databases used as implementations of messages. We now sketch
instances (a proposal in that direction appears in [3; one of them, which is based on the view that pro-
4], where object constraints, class constraints, and grams are functions on domains [20]. More formally,
database constraints are distinguished). For another if m 2 M and s : c t1 : : : tn ! t 2 sign(m),
example, the ODMG-93 proposal for a standardized then impl(m; c), if de ned, is a program p 2 P of
model [10] contains explicit keys, (binary) relation- the form
ships, and inverse attributes. None of these features
appear in our model, the reason being that these are p : dom(c) dom(t1) : : : dom(tn) ! dom(t)
not speci c to object-orientation. The condition in question informally states that if
The second important aspect of an object-oriented message overloading appears in isa-related classes
database is that it is intended to capture behavior, (so that the corresponding signatures satisfy the co-
besides structure. To this end, the relevant intu- variance condition), then the associated programs
ition is that classes have attached to them a set of coincide (as functions) on the subclass. More for-
messages, which are speci ed in the schema via sig- mally, we have: If jsign(m)j > 1 for some m 2 M,
natures, and which are implemented as methods. In then the following holds: If s; s 2 sign(m) such that
0
addition, behavior can be inherited by subclasses, s : c t1 : : : tn ! t, s : c t1 : : : tn ! t ,
0 0 0 0 0
and message names can be overloaded, i.e., re-used c isa c , ti ti for each i, 1 i n, t t , and
0 0 0
in various contexts. impl(m; c) = p, impl(m; c ) = p , then p and p agree
0 0 0
So a behavioral schema is a named ve-tuple of on dom(c) dom(t1) : : : dom(tn).
the form Sbehav = (C; M; P; messg; impl) where A variety of formal investigations for behavioral
(i) C is a ( nite) set of class names as above (again schemata in the sense de ned above can already
needed here since references to it have to be be found in the literature, which investigate ques-
made), tions including termination of method executions,
(ii) M is a ( nite) set of message names, where each limited depth of method-call nestings (an issue re-
m 2 M has associated with it a nonempty set lated to precompilation of method executions), well-
sign(m) = fs1 ; : : :sl g, l 1, of signatures; each de nedness of method calls, i.e., consistency as well
sh , 1 h l, has the form sh : c t1 : : : tp ! as reachability considerations (issues related to type
t for c 2 C, t1 ; : : :; tp ; t 2 T inference and schema evolution), expressiveness of
(each signature has the receiver of the message method implementation languages (relative to some
as its rst component), notion of completeness), complexity of method exe-
(iii) P is a ( nite) set of methods or programs, cutions, or potential parallelism of method evalua-
tions. To investigate such issues, our general notion
(iv) messg : C ! 2M is a mapping s.t. for each of schema is made precise in various ways. For ex-
c 2 C and for each m 2 messg(c) there exists ample, [15] xes a simple imperative language for
a signature s 2 sign(m) satisfying s[1] = c, implementing methods as retrieval programs, con-
(v) impl: f(m; c) j m 2 messg (c)g ! P is a trasts them with update programs and shows un-
partial function. decidability results for the latter. [1; 2] as well as
[11] introduce distinct notions of a method schema persons living in a large country are repre-
to study behavioral issues of OODBS; for example, sented. In this context, so many combi-
[2] investigates implications of the covariance condi- nations of meaningful properties have to be
tion using the formalism of program schemas, while distinguished that it might become necessary
[11] looks at tractability guarantees corresponding to introduce arti cial name constructions for
to those known for relational query languages. Also, classes, like unmarried-nonstudent-autoOwner-
it is pretty straightforward to de ne an object alge- renter-taxpayer [26], and each such class has
bra for a model like the one sketched in the previous only very few instances. More generally, the
section; see, for example, the papers in [13]. That name space available for classes might not be
carries over to issues like query optimization, imple- sucient.
mentation of operations, and query processing. A 5. Objects and their classes might come into ex-
survey of other recent investigations that have simi- istence in reverse order. A database user in
lar bases or origins can be found in [28]. a design environment like CAD creates objects
We emphasize again that the model just sketched in the rst place, not type de nitions or even
can be seen as description of the core of vastly any classes. The usage of databases thus di ers con-
object-oriented model; however, this is valid only rel- siderably from traditional applications where
ative to the fact that many specialities, which have schema design has to be completed prior to in-
been proposed in the literature, or which are being stance creation.
built into commercial systems, are neglected here.
We conclude this section with a brief indication of We mention that one issue or the other from this
how object databases, i.e., sets of class instances or list is sometimes re ected already in existing mod-
extensions, can be de ned over a given schema: For els, but never as a basic design target. Alternative
a given objectbase schema S, an objectbase over S is approaches, which takes these issues into considera-
a triple d(S) = (O, inst, val) s.t. tion right from the start, appear, for example, in [21;
(i) O OID is a nite set of object identi ers, 24; 16]. A possible general concept for the solution
of these problems seems the exploitation of proto-
(ii) inst: C ! 2O is a total function satisfying the type languages, which suggest to model applications
following conditions: without a classi cation that partitions the world into
(a) if c; c 2 C are not (direct or indirect) sub- entity sets. A prototype represents default behavior
0
classes of each other, for some concept, and new objects can re-use part
then inst(c) \ inst(c ) = ;,
0 of the knowledge stored in a prototype by saying
(b) if c isa c , then inst(c) inst(c ),
0 0 how they di er from it. Upon receiving a message
(iii) val: O ! V is a function s.t. an object does not understand, it can forward (del-
(8 c 2 C) (8 o 2 inst(c)) val(o) 2 dom(type(c)). havior.itIntothe
egate) its prototype to invoke more general be-
area of object-oriented programming
Notice that this de nition closes the problem left languages, many people believe that this approach
open earlier, namely that class domains originally has advantages over the class-based one with inher-
were simply the set OID. itance, with respect to the representation of default
knowledge and incrementally and dynamically modi-
3 Open Issues fying concepts. The investigation of classless models
in the context of object-oriented databases has only
We next survey several modeling issues in object- recently been proposed in [26], and a concrete model
oriented databases which have not yet received is reported in [14].
enough research attention:
1. Entities can have roles that vary over time. For 4 Conclusions
example, some person object may at one point
be a student, at another an employee, and at a In this short paper we have tried to give a rough
third a club member; while the person's identity personal account of recent work on formal models
never changes, its type changes several times. for object-oriented databases. Although there is not
2. Entities can have multiple types at the same a single uniform such model, the foundations on
time. For example, a person may be a stu- which such models have to be built seem understood,
dent, an employee, and a club member simul- and even standardization e orts have recently been
taneously. So far the only way to represent this launched [10]. On the other hand, a number of in-
in an object-oriented database is by multiple teresting research issues still deserve further investi-
inheritance, but this might not be appropriate gation. In particular, formal models as they are cur-
since it can result in a combinatorial explosion rently available seem hardly suited for the nonstan-
of sparsely populated classes [21]. dard applications which initiated the consideration
of object-orientation in the context of databases. A
3. Objects can be in various stages of development. reason seems to be that many researchers have too
For example, in a design environment it is usu- much of a relational background, and try to exploit
ally necessary to maintain incomplete designs, that as long as possible; this is more than con rmed
i.e., objects whose types get completed in the by the ODMG-93 proposal. As was done a number
course of time. of years ago, when database people discovered what
4. Classes may contain \too few" instances. For programming-language or knowledge-representation
example, consider a database in which all people had been studying for years already, it seems
again necessary to take recent developments in these [17] A. Kemper et al.: GOM: A Strongly Typed Per-
areas into account, and to adopt them for solving the sistent Object Model with Polymorphism; Proc.
problems database applications have. German GI Conference on \Datenbanken fur
Buro, Technik und Wissenschaft" (BTW) 1991,
References Springer Informatik-Fachbericht 270, 198{217
[1] S. Abiteboul, P.C. Kanellakis: The Two Facets [18] W. Kim: Introduction to Object-Oriented Data-
of Object-Oriented Data Models; IEEE Data bases ; MIT Press 1990
Engineering Bulletin 14 (2) 1991, 3{7 [19] C. Lecluse et al.: O2, an Object-Oriented
[2] S. Abiteboul, P.C. Kanellakis, E. Waller: Data Model; Proc. ACM SIGMOD Interna-
Method Schemas; Proc. 9th ACM Symposium tional Conference on Management of Data 1988,
on Principles of Database Systems 1990, 16{27 424{433
[3] P.M.G. Apers et al.: Inheritance in an Object- [20] C. Lecluse, P. Richard: Foundations of the O2
Oriented Data Model; Memoranda Informatica Database System; IEEE Data Engineering Bul-
90-77, University of Twente 1990 letin 14 (2) 1991, 28{32
[4] H. Balsters et al.: Sets and Constraints in an [21] J. Richardson, P. Schwarz: Aspects: Extend-
Object-Oriented Data Model; Memoranda In- ing Objects to Support Multiple, Independent
formatica 90-75, University of Twente 1990 Roles; Proc. ACM SIGMOD International Con-
[5] F. Bancilhon, C. Delobel, P. Kanellakis (eds.): ference on Management of Data 1991, 298{307
Building an Object-Oriented Database System [22] M.H. Scholl, H.J. Schek: A Relational Object
| The Story of O2. Morgan-Kaufmann 1992 Model; Proc. 3rd International Conference on
[6] C. Beeri: A Formal Approach to Object- Database Theory 1990, Springer LNCS 470, 89{
Oriented Databases; Data & Knowledge Engi- 105
neering 5, 1990, 353{382 [23] H.J. Schek, M.H. Scholl: Evolution of Data
[7] E. Bertino et al.: An Object-Oriented Data Models; Proc. Database Systems of the 90s,
Model for Distributed Oce Applications; November 1990, Springer LNCS 466, 135{153
Proc. ACM Conference on Oce Information [24] E. Sciore: Object Specialization; ACM Transac-
Systems 1990, 216{226 tions on Information Systems 7, 1989, 103{122
[8] E. Bertino, L. Martino: Object-oriented Data- [25] D.D. Straube, M.T. O zsu: Queries and Query
base Management Systems: Concepts and Is- Processing in Object-Oriented Database Sys-
sues; IEEE Computer 24 (4) 1991, 33{47 tems; ACM Transactions on Information Sys-
[9] E. Bertino, L. Martino: Object-Oriented Data- tems 8, 1990, 387{430
base Systems; Addison-Wesley 1993 [26] J.D. Ullman: A Comparison of Deductive and
[10] R.G.G. Cattell (ed.): The Object Database Object-Oriented Database Systems; Proc. 2nd
Standard: ODMG-93. Morgan-Kaufmann 1994 DOOD Conference, Springer LNCS 566, 1991,
[11] K. Denningho , V. Vianu: The Power of Meth- 263{277
ods with Parallel Semantics; UCSD Technical [27] G. Vossen: Datenmodelle, Datenbanksprachen
Report No. CS91{184, University of California, und Datenbankmanagement-Systeme; 2. Au-
San Diego, February 1991; extended abstract in age, Addison-Wesley 1994
Proc. 17th Int. Conference on Very Large Data
Bases 1991, 221{232 [28] G. Vossen: Database Theory: An Introduction;
[12] O. Deux et al.: The Story of O2; IEEE Trans- Technical Report, University of Munster, June
actions on Knowledge and Data Engineering 2, 1994
1990, 91{108 [29] K. Wilkinson et al.: The Iris Architecture and
[13] J.C. Freytag, D. Maier, G. Vossen: Query Implementation; IEEE Transactions on Knowl-
Processing for Advanced Database Systems; edge and Data Engineering 2, 1990, 63{75
Morgan-Kaufmann 1994
[14] M. Gro-Hardt, G. Vossen: Towards Class-
less Object Models for Engineering Design Ap-
plications ; Proc. 4th International Conference
on Database and Expert Systems Applications
(DEXA) 1993, Prag, Springer LNCS 720, 36{47
[15] R. Hull, K. Tanaka, M. Yoshikawa: Behav-
ior Analysis of Object-Oriented Databases:
Method Structure, Execution Trees, and Reach-
ability; Proc. 3rd FODO Conference, Springer
LNCS 367, 1989, 372{388
[16] T. Imielinski et al.: Incomplete Objects | A
Data Model for Design and Planning Applica-
tions; Proc. ACM SIGMOD International Con-
ference on Management of Data 1991, 288{297