=Paper= {{Paper |id=Vol-1/paper-4 |storemode=property |title=Terminological systems revisited: terminology = schema + views |pdfUrl=https://ceur-ws.org/Vol-1/buchheit-et-al-long.pdf |volume=Vol-1 |authors=M. Buchheit,F.M. Donini,W. Nutt,A. Schaerf }} ==Terminological systems revisited: terminology = schema + views== https://ceur-ws.org/Vol-1/buchheit-et-al-long.pdf
                          Terminological Systems Revisited:
                           Terminology = Schema + Views
          M. Buchheit1 and F. M. Donini2 and W. Nutt1 and A. Schaerf2
   1. German Research Center for Arti cial Intelligence (DFKI), Saarbrucken, Germany
                             fbuchheit,nuttg@dfki.uni-sb.de
  2. Dipartimento di Informatica e Sistemistica, Universita di Roma \La Sapienza", Italy
                          fdonini,aschaerfg@assi.dis.uniroma1.it

                    Abstract                             relationship. This abstract architecture has been the
    Traditionally, the core of a Termino-                basis for the design of systems, the development of
    logical Knowledge Representation System              algorithms, and the investigation of the computa-
    (TKRS) consists of a so-called TBox, where           tional properties of inferences.
    concepts are introduced, and an ABox,                   Given this setting, there are three parameters that
    where facts about individuals are stated             characterize a terminological system: (i) the lan-
    in terms of these concepts. This design              guage for concept descriptions, (ii) the form of the
    has a drawback because in most applica-              statements allowed, and (iii) the semantics given to
    tions the TBox has to meet two functions             concepts and statements. Research tried to improve
    at a time: on the one hand, similar to a             systems by modifying these three parameters. But in
    database schema, framelike structures with           all existing systems and almost all theoretical studies
    typing information are introduced through            language and semantics have been kept uniform.1
    primitive concepts and primitive roles; on              The results of these studies were unsatisfactory in
    the other hand, views on the objects in the          at least two respects. First, it seems that tractable
    knowledge base are provided through de-              inferences are only possible for languages with lit-
      ned concepts.                                      tle expressivity. Second, no consensus has been
                                                         reached about the semantics of terminological cycles,
    We propose to account for this conceptual            although in applications the need to model cyclic
    separation by partitioning the TBox into             dependencies between classes of objects arises con-
    two components for primitive and de ned              stantly.
    concepts, which we call the schema and the              Based on an ongoing study of applications of ter-
    view part. We envision the two parts to              minological systems, we suggest to re ne the two-
    di er with respect to the language for con-          layered architecture consisting of TBox and ABox.
    cepts, the statements allowed, and the se-           Our goal is twofold: on the one hand we want to
    mantics.                                             achieve more conceptual clarity about the role of
    We argue that by this separation we                  primitive and de ned concepts and the semantics of
    achieve more conceptual clarity about the            terminological cycles; on the other hand, we want to
    role of primitive and de ned concepts                improve the tradeo between expressivity and worst
    and the semantics of terminological cycles.          case complexity. Since our changes are not primar-
    Moreover, three case studies show the com-           ily motivated by mathematical considerations but by
    putational bene ts to be gained from the             the way systems are used, we expect to come up with
    re ned architecture.                                 a more practical system design.
                                                            In the applications studied we found that the
1 Introduction                                           TBox has to meet two functions at a time. One is to
                                                         declare frame-like structures by introducing primi-
Research on terminological reasoning usually pre-        tive concepts and roles together with typing infor-
supposes the following abstract architecture, which      mation like isa-relationships between concepts, or
re ects quite well the structure of existing systems.    range restrictions and number restrictions of roles.
There is a logical representation language that allows   E.g., suppose we want to model a company environ-
for two kinds of statements: in the TBox or termi-       ment. Then we may introduce the concept Employee
nology , concept descriptions are introduced, and in     as a specialization of Person, having exactly one
the ABox or world description , individuals are char-    name of type Name and at least one aliation of
acterized in terms of concept membership and role        type Department. This is similar to class declara-
    This work was partly supported by the Commis-       tions in object-oriented systems. For this purpose, a
sion of the European Union under ESPRIT BRA 6810         simple language is sucient. Cycles occur naturally
(Compulog 2), by the German Ministry of Research and     in modeling tasks, e.g., the boss of an Employee is
Technology under grant ITW 92-01 (TACOS), and by            1 In [Lenzerini and Schaerf, 1991] a combination of
the CNR (Italian Research Council) under Progetto Fi-
nalizzato Sistemi Informatici e Calcolo Parallelo, LdR   a weak language for ABoxes and a strong language for
\Ibridi."                                                queries has been investigated.
also an Employee. Such declarations have no de ni-         ing.
tional import, they just restrict the set of possible
interpretations.                                           2 The Re ned Architecture
   The second function of a TBox is to de ne new           We start this section by a short reminder on concept
concepts in terms of primitive ones by specifying          languages. Then we discuss the form of statements
necessary and sucient conditions for concept mem-         and their semantics in the di erent components of
bership. This can be seen as de ning abstractions or       a TKRS. Finally, we specify the reasoning services
views on the objects in the knowledge base. De ned         provided by each component and introduce di erent
concepts are important for querying the knowledge          complexity measures for analyzing them.
base and as left-hand sides of trigger rules. For this
purpose we need more expressive languages. If cy-          2.1 Concept Languages
cles occur in this part they must have de nitional         In concept languages, complex concepts (ranged over
import.                                                    by C , D) and complex roles (ranged over by Q, R)
   As a consequence of our analysis we propose to          can be built up from simpler ones using concept and
split the TBox into two components: one for declar-        role forming constructs (see Tables 1 and 2 a set
ing frame structures and one for de ning views. By         of common constructs). The basic syntactic sym-
analogy to the structure of databases we call the          bols are (i) concept names, which are divided into
  rst component the schema and the second the view         schema names (ranged over by A) and view names
part. We envision the two parts to di er with re-          (ranged over by V ), (ii) role names (ranged over by
spect to the language, the form of statements, and         P ), and (iii) individual names (ranged over by a, b).
the semantics of cycles.                                   An interpretation I = (I ; I ) consists of the do-
   The schema consists of a set of primitive concept       main I and the interpretation function I , which
introductions, formulated in the schema language,          maps every concept to a subset of I , every role
and the view part by a set of concept de nitions, for-     to a subset of I  I , and every individual to an
mulated in the view language. In general, the schema       element of I such that aI 6= bI for di erent indi-
language will be less expressive than the view lan-        viduals a, b (Unique Name Assumption). Complex
guage. Since the role of statements in the schema          concepts and roles are interpreted according to the
is to restrict the interpretations we want to admit,       semantics given in Tables 1 and 2, respectively.
  rst order semantics, which is also called descriptive       In our architecture, there are two di erent con-
semantics in this context (see Nebel 1991), is ade-        cept languages in a TKRS, a schema language for
quate for cycles occurring in the schema. For cycles       expressing schema statements and a view language
in the view part, we propose to choose a semantics         for formulating views and queries to the system.
that de nes concepts uniquely, e.g., least or greatest
  xpoint semantics.                                        2.2 The Three Components
   The purpose of this work is not to present the          We rst focus our attention to the schema. The
full- edged design of a new system but to explore          schema introduces concept and role names and states
the options that arise from the separation of TBoxes       elementary type constraints. This can be achieved
into schema and views. Among the bene ts to be             by inclusion axioms having one of the forms:
gained from this re nement are the following three.                       A v D; P v A1  A2 ;
First, the new architecture has more parameters for
improving systems, since language, form of state-          where A, A1 , A2 are schema names, P is a role name,
ments, and semantics can be speci ed di erently for        and D is a concept of the schema language. Intu-
schema and views. So we found a combination of             itively, the rst axiom states that all instances of A
schema and view language with polynomialinference          are also instances of D. The second axiom states
procedures whereas merging the two languages into          that the role P has domain A1 and range A2 . A
one would have led to intractability. Second, we be-       schema S consists of a nite set of schema axioms.
lieve that one of the obstacles to a consensus about          Inclusion axioms impose only necessary conditions
the semantics of terminological cycles has been pre-       for being an instance of the schema name on the
cisely the fact that no distinction has been made          left-hand side. For example, the axiom \Employee v
between primitive and de ned concepts. Moreover,           Person" declares that every employee is a person,
intractability results for cycles mostly refer to infer-   but does not give a sucient condition for being an
ences with de ned concepts. We proved that rea-            employee.
soning with cycles is easier when only primitive con-         A schema may contain cycles through inclusion
cepts are considered. Third, the re ned architecture       axioms (see Nebel 1991 for a formal de nition).
allows for more di erentiated complexity measures,         So one may state that the bosses of an employee
as shown later in the paper.                               are themselves employees, writing \Employee v
   In the following section we outline our re ned ar-      8boss.Employee." In general, existing systems do
chitecture for a TKRS, which comprises three parts:        not allow for terminological cycles, which is a seri-
the schema, the view taxonomy, and the world de-           ous restriction, since cycles are ubiquitous in domain
scription, which comprise primitive concepts, de-          models.
  ned concepts and assertions in traditional systems.         There are two questions related to cycles: the rst
In the third section we show by three case studies         is to x the semantics and the second, based on
that adding a simple schema with cycles to existing        this, to come up with a proper inference procedure.
systems does not increase the complexity of reason-        As to the semantics, we argue that axioms in the
              Construct Name                   Syntax                   Semantics
              top                                 >                        I
              singleton set                      fag                      faI g
              intersection                      C uD                   C \ DI
                                                                          I
              union                             C tD                   C I [ DI
              negation                           :C                     I n C I
              universal quanti cation           8R.C fd1 j 8d2 : (d1 ; d2) 2 RI ! d2 2 C I g
              existential quanti cation         9R.:C   fd1 j 9d2 : (d1 ; d2) 2 RI ^ d2 2 C I g
              existential agreement            9Q = R fd1 j 9d2.(d1 ; d2) 2 QI ^ (d1; d2) 2 RI g
              number restrictions              ( n R)    fd1 j ]fd2 j (d1; d2) 2 RI g  ng
                                               ( n R)    fd1 j ]fd2 j (d1; d2) 2 RI g  ng
                       Table 1: Syntax and semantics of concept forming constructs.
               Construct Name             Syntax                    Semantics
               inverse role                P 1             f(d1; d2) j (d2 ; d1) 2 P I g
               role restriction           (R: C )    f(d1; d2) j (d1 ; d2) 2 RI ^ d2 2 C I g
               role chain                 Q  R f(d1; d3) j 9d2 .(d1; d2) 2 QI ^ (d2; d3) 2 RI g
               self                                          f(d1; d1) j d1 2 I g
                         Table 2: Syntax and semantics of role forming constructs.

schema have the role of narrowing down the mod-           9boss-of.BillsSuperBosses." But note that this does
els we consider possible. Therefore, they should be       not yield a de nition if we assume descriptive se-
interpreted under descriptive semantics, i.e., like in    mantics because for a xed interpretation of BILL
  rst order logic: an interpretation I satis es an ax-    and of the role boss-of there may be several ways
iom A v D if AI  DI , and it satis es P v A1  A2        to interpret BillsSuperBosses in such a way that the
if P I  AI1  AI2 . The interpretation I is a model      above equality holds. In this example, we only ob-
of the schema S if it satis es all axioms in S . The      tain the intended meaning if we assume least xpoint
problem of inferences will be dealt with in the next      semantics. This observation holds more generally: if
section.                                                  cycles are intended to uniquely de ne concepts then
   The view part contains view de nitions of the form     descriptive semantics is not suitable. However, least
                                                          or greatest xpoint semantics or, more generally, a
                        V =: C;                           semantics based on the -calculus yield unique de -
where V is a view name and C is a concept in the          nitions (see Schild 1994). Unfortunately, algorithms
view language. Views provide abstractions by de n-        for subsumption of views under such semantics are
ing new classes of objects in terms of the concept        known only for fragments of the concept language
and role: names introduced in the schema. We refer        de ned in Tables 1 and 2.
to \V = C " as the de nition of V . The distinc-             In this paper, we only deal with acyclic view tax-
tion between schema and view names is crucial for         onomies. In this case, the semantics of view de ni-
                                                          tions is straightforward. An interpretation I satis es
our architecture. It ensures the separation between
schema and views.                                         the de nition V =: C if V I = C I , and it is a model
   A view taxonomy V is a nite set of view de ni-         for a view taxonomy V if I satis es all de nitions in
tions such that (i) for each view name there is at        V.
most one de nition, and (ii) each view name oc-              A state of a airs in the world is described by as-
curring on the right hand side of a de nition has a       sertions of the form
de nition in V .                                                               C (a); R(a; b);
   Di erently from schema axioms, view de nitions
give necessary and sucient conditions. As an ex-         where C and R are concept and role descriptions in
ample of a view, one can describe the bosses of           the view language. Assertions of the form A(a) or
the employee Bill as the instances of \BillsBosses =:     P (a; b), where A and P are names in the schema,
9boss-of.fBILLg."                                         resemble basic facts in a database. Assertions in-
   Whether or not to allow cycles in view de ni-          volving complex concepts are comparable to view
tions is a delicate design decision. Di erently from      updates.
the schema, the role of cycles in the view part              A world description W is a nite set of asser-
is to state recursive de nitions. For example, if         tions. The semantics is as usual: an interpretation
we want to describe the group of individuals that         I IsatisI es CI(a) if aI 2 AI and it satis es R(a; b) if
are above Bill in the hierarchy of bosses we can          (a ; b ) 2 R ; it is a model of W if it satis es every
use the de nition \BillsSuperBosses =: BillsBosses t      assertion in W .
  Summarizing, a knowledge base is a triple  =            because usually the schema is much bigger than the
hS ; V ; Wi, where S is a schema, V a view taxonomy,       two views which are compared. Similarly, one might
and W a world description. An interpretation I is          be interested in the world description complexity of
a model of a knowledge base if it is a model of all        instance checking whenever one can expect W to be
three components.                                          much larger than the schema and the view part.
                                                              It is worth noticing that for every problem com-
2.3 Reasoning Services                                     bined complexity, taking into account the whole in-
For each component, there is a prototypical reason-        put, is at least as high as the other three. For exam-
ing service to which the other services can be re-         ple, if the complexity of a problem is O(jSjjVjjWj),
duced.                                                     its combined complexity is cubic, whereas the other
Schema Validation : Given a schema S , check               ones are linear. Similarly, if the complexity of a given
     whether there exists a model of S that inter-         problem is O(jSjjVj), both its combined complexity
     prets every schema name as a nonempty set.            and its view complexity are exponential, its schema
                                                           complexity is polynomial, and its world description
View Subsumption : Given a schema S , a view tax-          complexity is constant.
     onomy V , and view names V1 and V2 , check               In this paper, we use combined complexity to com-
     whether V1I  V2I for every model I of S and          pare the complexity of reasoning in our architec-
     V;                                                    ture with the traditional one. Moreover, we use
Instance Checking : Given a knowledge base , an           schema complexity to show how the presence of a
     individual a, and a view name V , check whether       large schema a ects the complexity of the reason-
     aI 2 V I holds in every model I of .                 ing services previously de ned. View and world de-
Schema validation supports the knowledge engineer          scription complexity have been investigated (under
by checking whether the skeleton of his domain             di erent names) in [Nebel, 1990, Baader, 1990] and
                                                           [Schaerf, 1993, Donini et al., 1994], respectively.
model is consistent. Instance checking is the basic
operation in querying a knowledge base. View sub-
sumption helps in organizing and optimizing queries        3 The Case Studies
(see e.g. Buchheit et al. 1994). Note that the schema      We studied some illustrative examples that show the
S has to be taken into account in all three services       advantages of the architecture we propose. We ex-
and that the view taxonomy V is relevant not only          tended three systems by a simple cyclic schema lan-
for view subsumption, but also for instance checking.      guage and analyzed their computational properties.
In systems that forbid cycles, one can get rid of S           As argued before, a schema language should at
and V by expanding de nitions. This is not possible        least be expressive enough for declaring subconcept
when S and V are cyclic.                                   relationships, restricting the range of roles, and spec-
2.4 Complexity Measures                                    ifying roles to be necessary (at least one value) or sin-
                                                           gle valued (at most one value). These requirements
The separation of the core of a TKRS into three            are met by the language SL, which was introduced
components allows us to introduce re ned complex-          in [Buchheit et al., 1994] and that is de ned by the
ity measures for analyzing the diculty of inferences.     following syntax rule:
   The complexity of a problem is generally measured
with respect to the size of the whole input. However,             C; D ! A j 8P .A j ( 1 P ) j ( 1 P ):
with regard to our setting, three di erent pieces of
input are given, namely the schema, the view tax-          Obviously, it is impossible to express in SL that a
onomy, and the world description. For this reason,         concept is empty. Therefore, schema validation in
di erent kinds of complexity measures may be de-           SL is trivial. Also, subsumption of concept names
  ned, similarly to what has been suggested in [Vardi,     is polynomially decidable.
1982] for queries over relational databases. We con-          We proved that inferences become harder for ex-
sider the following measures (where jX j denotes the       tensions of SL. If we add inverse roles, schema val-
size of X ):                                               idation remains trivial, but subsumption of schema
Schema Complexity : the complexity as a function           names becomes NP-hard. If we add any construct by
      of jSj;                                              which one can express the empty concept|like dis-
                                                           jointness axioms|schema validation becomes NP-
View Complexity : the complexity as a function of          hard. However, in our opinion this does not mean
      jVj;                                                 that extensions of SL are not feasible. For some ex-
World Description Complexity : the complexity as a         tensions, there are natural restrictions on the form
      function of jWj;                                     of schemas that decrease the complexity. Also, it
                                                           is not clear whether realistic schemas will contain
Combined Complexity : the complexity as a function         structures that require complex computations.
      of jSj + jVj + jWj.                                     In all the three cases studied, the schema lan-
   Combined complexity takes into account the              guage is SL. For the view language, we propose
whole input. The other three instead consider only a       three di erent languages derived from three actual
part of the input, so they are meaningful only when        systems described in the literature, namely the de-
it is reasonable to suppose that the size of the other     ductive object-oriented database system Concept-
parts is negligible. For instance, it is sensible to an-   Base [Jarke, 1992], and the terminological systems
alyze the schema complexity of view subsumption            kris [Baader and Hollunder, 1991] and classic
[Borgida et al., 1989]. We investigated the com-         in the knowledge base. The second alternative can
putational properties of the reasoning services with     be given a logical semantics in terms of epistemic
respect to SL-schemas. We aimed at showing two           operators (see Donini et al. 1992).
results: (i) reasoning w.r.t. schema complexity is al-
ways tractable, (ii) combined complexity is not in-
creased by the presence of terminological cycles in
                                                         References
the schema.                                              [Baader and Hollunder, 1991] Franz Baader and
   In all three cases, we assume that view names            Bernhard Hollunder. A terminological knowledge
are allowed in membership assertions and that the           representation system with complete inference al-
view taxonomy is acyclic. In this setting, every view       gorithm. In Proc. PDK-91, LNAI, pages 67{86,
name can be substituted with its de nition. For this        1991.
reason, from this point on, we suppose that view         [Baader, 1990] Franz Baader. Terminological cycles
concepts are completely expanded. Therefore, when           in KL-ONE-based knowledge representation lan-
evaluating the complexity, we replace the size of the       guages. In Proc. AAAI-90, pages 621{626, 1990.
view part by the size of the concept representing the    [Borgida et al., 1989] Alexander Borgida, Ronald J.
view.                                                       Brachman, Deborah L. McGuinness, and Lori
   We have found the following results for the three        Alperin Resnick. CLASSIC: A structural data
systems in which SL is the schema language and the          model for objects. In Proc. ACM SIGMOD, pages
concept language the abstraction of the query lan-          59{67, 1989.
guage of ConceptBase introduced in [Buchheit et          [Buchheit et al., 1994] Martin
al., 1994], or the language o ered by kris or clas-         Buchheit, Manfred A. Jeusfeld, Werner Nutt, and
sic, respectively.
                                                            Martin Staudt. Subsumption between queries to
ConceptBase: instance checking is in PTIME                  object-oriented databases. Information Systems,
      w.r.t. combined complexity (view subsumption          19(1):33{54, 1994.
      has been proved in PTIME in [Buchheit et al.,      [Donini et al., 1992] Francesco M. Donini, Maurizio
      1994]).                                               Lenzerini, Daniele Nardi, Werner Nutt, and An-
kris: view subsumption and instance checking are            drea Schaerf. Adding epistemic operators to con-
      PSPACE-complete problems w.r.t. combined              cept languages. In Proc. KR-92, pages 342{353,
      complexity and PTIME problems w.r.t. schema           1992.
      complexity.                                        [Donini et al., 1994] Francesco M. Donini, Maurizio
classic: view subsumption and instance checking             Lenzerini, Daniele Nardi, and Andrea Schaerf. De-
      are problems in PTIME w.r.t. combined com-            duction in concept languages: From subsumption
      plexity.                                              to instance checking. Journal of Logic and Com-
   We conclude that adding (possibly cyclic) schema         putation, 4(92{93):1{30, 1994.
information does not change the complexity of rea-       [Jarke, 1992] M. Jarke. ConceptBase V3.1 User
soning within the systems taken into account.               Manual. Aachener Informatik-Berichte 92-17,
                                                            RWTH Aachen, 1992.
4 Conclusion                                             [Lenzerini and Schaerf, 1991]
                                                            Maurizio Lenzerini and Andrea Schaerf. Concept
We have proposed to replace the traditional TBox            languages as query languages. In Proc. AAAI-91,
in a terminological system by two components: a             pages 471{476, 1991.
schema, where primitive concepts describing frame-
like structures are introduced, and a view part that     [Nebel, 1990] Bernhard Nebel. Terminological rea-
contains de ned concepts. We feel that this architec-       soning is inherently intractable. Arti cial Intelli-
ture re ects adequately the way terminological sys-         gence, 43:235{249, 1990.
tems are used in most applications.                      [Nebel, 1991] Bernhard Nebel. Terminological cy-
   We also think that this distinction can clarify the      cles: Semantics and computational properties. In
discussion about the semantics of cycles. Given the         John F. Sowa, editor, Principles of Semantic Net-
di erent functionalities of the schema and view part,       works, pages 331{361. Morgan Kaufmann, Los Al-
we propose that cycles in the schema are interpreted        tos, 1991.
with descriptive semantics while for cycles in the       [Schaerf, 1993] Andrea Schaerf. On the complexity
view part a de nitional semantics should be adopted.        of the instance checking problem in concept lan-
   In three case studies we have shown that the re-         guages with existential quanti cation. Journal of
vised architecture yields a better tradeo between           Intelligent Information Systems, 2:265{278, 1993.
expressivity and the complexity of reasoning.
   The schema language we have introduced might          [Schild, 1994] Klaus Schild. Terminological cycles
be sucient in many cases. Sometimes, however,              and the propositional -calculus. In Proc. KR-94,
one might want to impose more integrity constraints         1994.
on primitive concepts than those which can be ex-        [Vardi, 1982] M. Vardi. The complexity of relational
pressed in it. We see two solutions to this problem:        query languages. In Proc. STOC-82, pages 137{
either enrich the language and have to pay by a more        146, 1982.
costly reasoning process, or treat such constraints in
a passive way by only verifying them for the objects