=Paper= {{Paper |id=Vol-40/paper-8 |storemode=property |title=Using RDF(S) to provide multiple views into a single ontology |pdfUrl=https://ceur-ws.org/Vol-40/Toivonen.pdf |volume=Vol-40 }} ==Using RDF(S) to provide multiple views into a single ontology== https://ceur-ws.org/Vol-40/Toivonen.pdf
                                         Using RDF(S) to provide multiple
                                           views into a single ontology
                                                                           Santtu Toivonen
                                                                      Sonera Corp., Research
                                                                           P.O. Box 145
                                                                     FIN-00051 Sonera, Finland
                                                                        (+358) 40 723 7217
                                                               santtu.toivonen@sonera.com

ABSTRACT
This paper deals with RDF (Resource Description Framework). The main point is to present
a general model describing when and how to exploit RDF technology. It is suggested that
RDF(S)¹ functions best as a means to provide mechanisms for expressing contextual and
case-specific information. In other words, RDF(S) is suitable for providing different
views into a single extensive ontology, rather than specifying the actual ontology. The
ontology "behind" the case-specific RDF(S) is likely to be expressed using some other
mechanism than RDF(S).

Keywords
Ontologies, concepts, domain-specificity, resources, properties, classes.

1. INTRODUCTION
1.1 Nature and scope of the paper
This paper is theoretical and methodological in nature. It is theoretical since
applications and implementation-specific details are excluded. It is methodological
since it concentrates on the proper usage of RDF(S).

This paper introduces a simple model of how to exploit RDF(S) in large and heterogeneous
environments that include several different applications. The opinion is that RDF(S) has
a lot of useful features in describing resources, but also some drawbacks. After the
model is presented, the possibilities as well as the limitations of RDF(S) are discussed.

1.2 Technologies with significance to the model proposed here
RDF(S) technology aims at describing web resources. It is under development and
standardization in the World Wide Web Consortium. RDF is specified in two separate
documents, one about the model and syntax of RDF [15] and the other about RDF
schemas [4].

XML is one proposed representation format for RDF statements. Of the large number of
technologies in the XML family, at least XML Namespaces [3] and XML Schema [1, 9 and 20]
are relevant with respect to RDF. Namespaces are needed in RDF(S) because they help
identify the particular domains and modeling layers [19]. Furthermore, the particular
RDF schema that is used for validating different RDF documents is identified using
namespace notation. XML schema technology is needed for syntactic validation of RDF
documents that are in XML format.

There are some differences between validation in XML and RDF [5]. Validation through RDF
schemas is grounded mainly in semantics, i.e. the meaning-based hierarchy and relations
among the concepts to be defined. XML schemas perform syntactic validation instead; they
concentrate on the grammar of the XML documents [7]. There is some semantics in XML
schema technology, like the usage of datatypes, but compared with RDF schemas it is best
thought of as a syntactic validation mechanism.

2. OVERVIEW OF THE MODEL
This chapter presents the general structure of the proposed model. The motivation is to
familiarize the reader with the different parts of the model.

Figure 1 presents the overview of the model and illustrates the role of RDF(S). Unlike
in [5, 7, and 19], RDF(S) is not intended to cover the whole semantic categorization in
the environment². It is rather intended as a mechanism to provide domain-specific data
related to small-scale tasks. An individual RDF document as

    Permission to make digital or hard copies of all or part of this work for
    personal or classroom use is granted without fee provided that copies are
    not made or distributed for profit or commercial advantage and that copies
    bear this notice and the full citation on the first page. To copy otherwise, to
    republish, to post on servers or to redistribute to lists, requires prior specific
    permission by the author.
    Semantic Web Workshop 2001 Hong Kong, China
    Copyright by the author.

¹ RDF(S) refers to the combined technologies of RDF and RDF-Schema. Cf. [19].

² Note that in [5, 7, and 19] standard RDF(S) is extended with a language called OIL
  (Ontology Inference Layer). Also DAML (DARPA Agent Markup Language) [14] extends
  RDF(S). What is proposed here is different. Here the basic mechanisms of RDF(S) are
  thought to be such that RDF(S) (or any system with similar internal structure) is not
  suitable for describing a potentially large ontology.
well as an RDF schema document consists of a set of concepts that is likely to be a
subset of the concepts in the ontology.

          Figure 1. Overview of the proposed model.

Figure 1 is now examined from right to left. The rightmost section of the picture
denotes the ontology, the most general description of the environment in question. The
next two sections are the core focus of this paper. RDF schemas are seen as
domain-specific validating filters. RDF documents are relatively small pieces of
information that are validated against RDF schemas. And finally, users are the ones that
utilize RDF(S) as their source of knowledge when working in the environment. Users can
be software agents as well as human beings.

2.1 Ontology
Details of the ontology are outside the scope of this particular paper; the ontology is
treated here as a "black box". It could be implemented for example as a semantic network
or a tree structure.

The approach in this paper favors the adoption of one big shared ontology as opposed to
several smaller ones. It is acknowledged that this shared ontology might expand and
become too slow and complicated to use. Additional limitations include the complexity
and slowness of defining the standards needed for one large heterogeneous ontology [6].

The first of these problems of the shared ontology approach, the slowness of using a
large ontology, can be eliminated with RDF schemas. With RDF schemas it is possible to
specialize the users of the ontology to be task-specific experts; they do not have to
know every bit of information about the environment. The problems with defining
standards for large ontologies are outside the scope of this paper.

The size and magnitude of the environment is a relevant question within the limits of
this paper; how large and heterogeneous is the environment supposed to be? Instead of
concentrating on domains, disciplines, business branches, etc., the concept of
environment is used here along the following guideline: if two applications share one or
more concepts, they belong in the same environment. And there should be only one
ontology in one environment.

One important property of an ontology is extensibility [11]. It should be possible to
introduce new concepts into an existing ontology so that the applications utilizing the
ontology stay unbroken. In this model it is entirely possible to extend the ontology
with new concepts since the users of the ontology operate using the RDF(S) that they
themselves have defined. In the ontology all the concepts have similar ontological
statuses³. RDF schemas and RDF documents together provide different views into the
ontology.

2.2 RDF Schemas
RDF schemas are intended to function in a role roughly similar to the one DTDs have for
XML documents. Individual RDF documents are validated against some RDF schema. The RDF
Schema specification [4] has defined a number of worthwhile concepts to be used when
validating RDF documents. They are now presented briefly, since understanding their
hierarchy and interrelations is important for the model presented in this paper.

At the topmost level the concepts are divided into three categories: rdfs:Resource,
rdfs:Class and rdf:Property. Two important properties, rdf:type and rdfs:subClassOf,
are needed in order to express the relationships among these concepts. Resource is the
topmost class of the RDF system. Everything else is describable as a subclass of
resource. The type property is needed in order to express that each resource is a
member of a class. A property is a specific aspect, characteristic, attribute, or
relation used to describe a resource [15]. With respect to this paper, the division
between properties and other resources is crucial⁴.

The RDF schema specification [4] defines two important constraint properties:
rdfs:range and rdfs:domain. These constraints are used only within RDF schemas; they do
not appear in other RDF documents. The domain constraint indicates that a property may
be used along with the resources of a certain class. For example, author is a property
that could originate from a resource that is an instance of class book. A property may
have zero, one, or more than one class as its domain.

Range, on the other hand, is something more rigorous; it specifies the class that the
value of the property in question should be a resource of [4]. For example, a range
constraint applying to the author property might express that the value of an author
must be a resource of class person. A property can have at most one range constraint.

³ This does not necessarily mean that the ontology is totally flat; there can naturally
  be some very general hierarchies among the concepts in the ontology. For example, the
  subclass-superclass relation is something that can be said to hold between certain
  concepts regardless of the case-specific details.

⁴ Every property is a resource and also a member of some class. In this paper
  properties are nevertheless often contrasted with classes and resources. The reason
  for this is to differentiate the concepts that get defined from those that participate
  in defining them.
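The domain and range constraints of section 2.2 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the validation procedure of the RDF Schema specification [4] or of any RDF library: the names SCHEMA, TYPES and violations are hypothetical, the director example is borrowed from the video store case used elsewhere in this paper, and class membership is simplified to one class per resource without any subclass reasoning.

```python
# Hypothetical schema: each property lists its domain classes
# (zero, one, or more) and at most one range class.
SCHEMA = {
    "director": {"domain": {"movie"}, "range": "person"},
}

# Hypothetical instance data: resource -> the class it is a member of.
TYPES = {
    "taxi_driver": "movie",
    "martin_scorsese": "person",
}


def violations(statement, schema=SCHEMA, types=TYPES):
    """Return the constraint violations of one (subject, property, value) statement."""
    subject, prop, value = statement
    if prop not in schema:
        return [f"unknown property: {prop}"]
    problems = []
    constraint = schema[prop]
    domain = constraint.get("domain", set())
    # An empty domain is read here as "usable with any resource".
    if domain and types.get(subject) not in domain:
        problems.append(f"{subject} is not in the domain of {prop}")
    range_class = constraint.get("range")
    if range_class is not None and types.get(value) != range_class:
        problems.append(f"{value} is not in the range of {prop}")
    return problems


# A statement that satisfies both constraints, then one violating each.
print(violations(("taxi_driver", "director", "martin_scorsese")))
print(violations(("martin_scorsese", "director", "martin_scorsese")))
print(violations(("taxi_driver", "director", "taxi_driver")))
```

The schema here acts as the "validating filter" of the model: it knows only the handful of concepts one task needs, not the whole ontology.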
2.3 RDF Documents
Individual RDF documents are validated against RDF schemas. RDF documents consist of
descriptions, which in turn consist of statements. Each description represents some
resource. Each statement represents some feature of the resource that is being
described.

In RDF schemas the interrelations among selected resources and properties are defined.
RDF documents naturally contain more specific and case-related data than RDF schemas; in
RDF documents properties and resources are given values and thereby the ontological
system is tied to actual instances of the resources. Following is a simple example of
RDF in an XML syntax:

    [RDF/XML listing; the markup is not recoverable from the source. Surviving
    element values: Martin Scorsese, Robert De Niro, 114]

Here the mpg-version of Taxi Driver is presented as an RDF resource. It has three
properties: director, starring and length. These are specified in individual statements.
This document is validated against the RDF schema of an imaginary web video service
called Santtu's Videos. Here all the statements refer to the same schema but this is not
necessary. Features of a given resource could be defined in separate schema documents.

The shared ontology approach is favored in this paper over the multiple ontologies
approach [6]. Nonetheless the usage of RDF(S) adopts some features from the multiple
ontologies point of view. RDF documents and schemas are often organized into a hierarchy
and descriptions might specialize other descriptions defined in other RDF(S)'s using
rdfs:subClassOf and rdfs:subPropertyOf properties. It is good to keep in mind, however,
that RDF(S) is treated as a means to provide views into ontologies, rather than
specifying the actual ontologies.

2.4 Users
RDF(S) is intended to provide metadata about web resources that is both human-readable
and machine-understandable [15]. Hence the users of the model proposed here can consist
of software agents in addition to human beings. For example in FIPA (Foundation for
Intelligent Physical Agents) there is already work done around the usage of RDF as a
content language for software agents [10].

The semantic information that software agents use should be mainly external to the
agents themselves [21]. Agents should have access to an external ontology describing the
general structure of the environment. The ontology would constitute an independent
repository of information. This way the agents themselves would not become "walking
encyclopaedias" (cf. [8]) but remain relatively simple.

This semantic information external to the agents is distributed to [...] utilize
individual RDF documents as pieces of case-specific [...] their internal inference rules
and the RDF schema/schemas they [...]

Also human users of the environment benefit from this model. The model helps people to
understand different parts of the environment and applications appearing in it. For
example, if someone decides to introduce a new service or resource into the [...]
resources (assuming that the schemas are publicly available for examination). If he
finds a suitable schema, he can utilize it as a means to [...] developer, there still
might be some guidelines or pieces of information in existing schemas that help the
developer to get started. Either way, people working in an environment with a shared
ontology can reduce the amount of work with public RDF schemas and this way avoid
re-inventing the wheel.

3. USAGE OF RDF(S)
3.1 RDF and web resources
Again: RDF is intended to provide metadata about web resources. Different web resources
naturally have different means of categorization. For example, libraries, video stores
and digital phone books use different concepts as metadata [2]. There are nevertheless
some common aspects among all of these. First, they all have resources, be they books,
movies or phone book entries. Second, they all have properties characterizing the
resources. Movies, for example, have actors, directors, length of the movie, etc. Third,
the resources may be grouped into classes. There might be a class called movies and it
might have a subclass called horror movies.

3.2 Properties in RDF(S)
RDF(S) is in this paper proposed as a means to provide case-specific information rather
than a means to constitute the whole ontology. The reason for this reduces to the
question concerning the ontological status of properties. In [15] properties are
described the following way: "A property is a specific aspect, characteristic,
attribute, or relation used to describe a resource".
RDF(S) properties are thus qualifiers that characterize some resources. They have a
clearly different ontological status than classes, for example. Classes are something
that are defined (definienda, sing. definiendum), properties are something that
participate in defining them (definientia, sing. definiens). This is fully acceptable as
long as the case-specificity and contextuality of the model is kept in mind. Depending
on the context, concept c1 can be an attribute of c2 and vice versa [17].

In other words: Concepts that are definienda in some case or application form the basic
level of concepts for that particular case. In RDF(S) terminology these would be classes
to be defined. There are two other levels in addition: the subordinate and the
superordinate level. Definiens (property in RDF(S) terminology) is at the subordinate
level when compared with definiendum. Depending on the domain, however, relations
between these levels vary (cf. [18 and 16]).

3.3 An example characterizing the case-specificity of RDF(S)
From the electronic video store's point of view director is a property that
characterizes the resources of a class called movie. This is fine as long as it is clear
that somewhere else director could appear also as a class that gets defined by some
other properties. An electronic catalog of artists might have director as a class⁵. Now
directors could have movies that they have directed as their properties. This is just
the other way around from the video store⁶. This is illustrated in Figure 2.

     Figure 2. An example about the domain-specificity.

When representing the world, the structure of concepts should be as analogous to reality
as possible [17]. And there is no a priori way to declare that some concept functions
always as definiens while some other is always definiendum. That is why concepts should
not be universally placed in either of these categories. In the end all concepts are
similar with respect to their ontological statuses. RDF(S) is a technology with no good
conventions that help in coping with this matter.

Of course it is possible to introduce all (or at least the majority of) concepts twice;
once with the status of definiendum and again with the status of definiens. However,
this is not a desirable solution. It leads to compatibility problems and violates the
simplicity principle of ontologies. The principle states that there should be as few
ontological commitments in an ontology as possible [11]. Introducing all concepts twice
would cause the situation found in Figure 3.

     Figure 3. Usage of concepts in different cases.

First the RDF documents column of Figure 3 is examined. It tells us that in the video
store the movie Taxi Driver has a property named director with the value "Scorsese". The
artist catalog, on the other hand, has the director Martin Scorsese with a property
named director that has the value "Taxi Driver". So Taxi Driver and Scorsese both appear
once as resources to be defined and once as properties. The same thing concerns the RDF
schemas column of the picture. In the video store director is a property that belongs in
the domain⁷ of movies. In the artist catalog the situation is the contrary.

⁵ The artist catalog would probably have artist as a basic level concept and director as
  a subordinate level concept. However, in RDF(S) terminology these would both be
  classes, not properties.

⁶ In principle even two different video stores could interpret the hierarchy of some set
  of concepts differently.

⁷ For the sake of simplicity only one constraint property is presented here. Besides
  domain, range is also a useful property to be exploited in RDF schemas. In the schema
  of the video store, for example, there could be a range constraint property named
  person attached to the director property. This would mean that the value of the
  director property is always a member of the class person. Furthermore, only one domain
  for each property is presented. If necessary, though, director could have other
  domains besides movies. It could be attached to TV-series, theatre plays, etc.
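The role reversal between the video store and the artist catalog in section 3.3 can be made concrete with a small Python sketch. It is an illustration under stated assumptions only: the names ONTOLOGY, VIDEO_STORE, ARTIST_CATALOG, is_view_of and role_differences are hypothetical, the ontology is reduced to a flat set of concepts, and a "view" is reduced to a mapping from a concept to the role (class or property) it plays in one schema.

```python
# Hypothetical shared ontology: concepts with equal ontological status.
ONTOLOGY = {"movie", "director", "actor", "length"}

# Each view assigns a role to a subset of the ontology's concepts.
VIDEO_STORE = {"movie": "class", "director": "property", "length": "property"}
ARTIST_CATALOG = {"director": "class", "movie": "property"}


def is_view_of(view, ontology):
    """A view may only use concepts that exist in the shared ontology."""
    return set(view) <= ontology


def role_differences(view_a, view_b):
    """Return the concepts both views use, but in different roles."""
    shared = set(view_a) & set(view_b)
    return {
        concept: (view_a[concept], view_b[concept])
        for concept in sorted(shared)
        if view_a[concept] != view_b[concept]
    }


print(is_view_of(VIDEO_STORE, ONTOLOGY), is_view_of(ARTIST_CATALOG, ONTOLOGY))
print(role_differences(VIDEO_STORE, ARTIST_CATALOG))
```

Both schemas remain valid views of the same ontology even though movie and director swap roles between them; the roles belong to the views, not to the ontology.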
A phenomenon closely related to this is observed in [13];              director as its value. Director would have has_directed property
different RDF schemas can specialize some class defined in             that would have a movie as its value.
another       schema        with      rdfs:subClassOf            and   At first sight this might seem wise. More carefully examined,
rdfs:subPropertyOf properties. They can use the same name              however, this leads to a situation not preferable to defining every
but different definitions for that class in their own                  concept twice; once as a property and once as a class. The
specializations. So there could be an upper RDF schema that has        number of properties would be doubled as shown in figure 4;
movies, directors, etc. all as classes. However, when electronic       has_directed and directed_by would both exist between a movie
video store and electronic artist catalog specialize the classes in    and its director even though they have the same information
their own unique ways, the system as a whole becomes                   content. This would again violate the simplicity principle of
incoherent. This is one of the basic drawbacks of the multiple         ontologies [11].
ontologies approach.
One remark here could be that since RDF is intended for
describing web resources, the movie Taxi Driver (at least in
mpg-format as in the code example presented earlier) is more
appropriate candidate for a web resource than the director Martin
Scorsese. That is because Martin Scorsese can not appear in a
format distributed in the Internet unlike Taxi Driver.
Ontologically speaking, however, the movie Taxi Driver is not
the same entity as the mpg-version of it distributed in the net. It
is rather an abstract thing that has different instances. Compared
with object-oriented programming, the movie Taxi Driver would
be a class and the copies of that movie (for example the mpg-
version distributed in the electronic video store) in turn instances
of the class.                                                                       Figure 4. Duplicating the properties
4. CONCLUSION
The expressive power of RDF(S) does not necessarily complete           Yet another attempt to overcome the problem of classes versus
all the parts that are needed for expressing a semantic                properties is reacting to it at levels residing on top of RDF. OIL
description of some system8. An ontology independent of the            (see [5, 7, and 19]) and DAML (see [14]) are examples of
domain-specific details of its usage is needed. There should be        languages that are on a higher level than RDF.
an "isolated basic backbone" of ontology that is independent of        However, introducing rules and restrictions that cope with
any case-specific details [12]. And based on the arguments and         limitations of RDF at a higher level does not seem feasible. For
examples presented here, it should be clear that RDF(S) alone          one thing, this again violates the simplicity principle of
does not fit together with this requirement.                           ontologies [11]; for each ambiguous class-property distinction at
What RDF(S) technology can do, however, is to provide means to         the RDF level there would exist a fixing principle at a higher
access an ontology characterizing some environment – no matter         level. Secondly, the whole idea of coping with problems of some
how large or heterogeneous – in many ways.                             level at another is not desirable; each level should be clear
                                                                       enough not to require fixing or configuring at other levels.
5. DISCUSSION
                                                                       5.2 Deducing the ontology from RDF(S)
5.1 Rethinking the properties
The main problem of RDF(S) presented in this paper is the
division of the concepts into properties and classes. The answer
proposed to this problem is the usage of an external ontology in
addition to the RDF(S). From the ontology's point of view, the
usage of concepts in different RDF(S) is based on roles [12];
rdfs:Resource, rdfs:Class and rdf:Property are
different roles of some concept defined in the ontology.

Another attempt to resolve this would be to reformulate the
properties. Earlier, an example was presented in which director
was used as a property belonging to the domain of movie in one
place, and movie as a property belonging to the domain of
director in another. Why not always use directors and movies as
classes? Movie would have a property directed_by that would have a

8
    See nevertheless "Discussion" for possibilities to deduce the
    ontology from RDF(S).

Here the ontology "behind" the RDF(S) is treated as a "black
box". Its detailed structure is not discussed. It could, however, be
possible (even in the model proposed in this paper) that the
whole ontology is deducible from the total amount of RDF
documents and schemas in a given environment. This depends on
the interpretation of domain-specificity.

If all the concepts and their interrelations are such that they are
found in the ontology, it could be possible to make that deduction.
Possible conflicts should, however, be avoided. If some
concept is a property (definiens) in one schema and a resource to
be defined (definiendum) in another, does that have any impact
on the ontology? If it does, which one of the schemas determines
the "ontological" location of the concept in question? If it does
not, how is it possible to construct any hierarchy in the ontology
(since there would be nothing in addition to the schemas)? On
the other hand, if there are some general relations or attributes at
the ontological level that are not visible in RDF(S), the deduction
is not possible.
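
The conflict raised in this discussion, where a concept is a property (definiens) in one schema and a resource to be defined (definiendum) in another, can be illustrated with a small sketch. It is not part of the proposed model. The schema names, the dictionary layout and the function names are hypothetical and chosen only for illustration, and real RDF(S) schemas would have to be parsed first.

```python
from collections import defaultdict

# Hypothetical toy schemas (illustrative only): each schema lists the
# concepts it uses as a resource to be defined and as a property.
SCHEMAS = {
    "movie_schema": {"resource": {"movie"}, "property": {"director"}},
    "director_schema": {"resource": {"director"}, "property": {"movie"}},
}


def roles_by_concept(schemas):
    """Collect, for every concept, the roles it plays and in which schemas."""
    roles = defaultdict(lambda: defaultdict(set))
    for schema_name, concepts_by_role in schemas.items():
        for role, concepts in concepts_by_role.items():
            for concept in concepts:
                roles[concept][role].add(schema_name)
    return roles


def conflicting_concepts(schemas):
    """Return the concepts that occur both as a resource and as a property."""
    roles = roles_by_concept(schemas)
    return sorted(
        concept
        for concept, by_role in roles.items()
        if "resource" in by_role and "property" in by_role
    )


if __name__ == "__main__":
    for concept in conflicting_concepts(SCHEMAS):
        print(concept, "plays both roles across the schemas")
```

With the two toy schemas above, both director and movie are reported. This is the situation in which it is unclear which schema would determine the "ontological" location of the concept.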
Clearly a deduction in the other direction is not possible. There
is no way of knowing how the concepts in the ontologies are used
and grouped in different RDF(S). This means that it is not
possible to deduce all imaginable RDF(S) just by examining the
ontology. And this is due to the proposed case-specific nature of
RDF(S).

Acknowledgements. Thanks to Joose Niemistö and Johannes
Gröhn for their help on this article.

6. REFERENCES
[1] Biron, P.V. and Malhotra, A. XML Schema Part 2:
     Datatypes. Technical Report, W3C, 2001. W3C Proposed
     Recommendation. http://www.w3.org/TR/xmlschema-2/.
[2] Bray, T. RDF and Metadata. 1998.
     http://www.xml.com/xml/pub/98/06/rdf.html.
[3] Bray, T., Hollander, D., and Layman, A. Namespaces in
     XML. Technical report, W3C, 1999. W3C
     Recommendation. http://www.w3.org/TR/REC-xml-names.
[4] Brickley, D., and Guha, R.V. Resource description
     framework (RDF) schema specification. Technical report,
     W3C, 2000. W3C Candidate Recommendation.
     http://www.w3.org/TR/rdf-schema.
[5] Broekstra, J., Klein, M., Decker, S., Fensel, D., van
     Harmelen, F., and Horrocks, I. Enabling knowledge
     representation on the web by extending RDF Schema. 2000.
[6] Cui, Z., Tamma, V. and Bellifemine, F. Ontology
     Management in Enterprises, British Telecommunications
     Technology Journal, October, 1999.
[7] Decker, S., van Harmelen, F., Broekstra, J., Erdmann, M.,
     Fensel, D., Horrocks, I., Klein, M., and Melnik, S. The
     Semantic Web - on the Roles of XML and RDF. In IEEE
     Internet Computing. September/October 2000.
[8] Dennett, D.C. When Philosophers Encounter Artificial
     Intelligence. In Daedalus, Proceedings of the American
     Academy of Arts and Sciences, 117, 283-295, 1988.
[9] Fallside, D.C. XML Schema Part 0: Primer. Technical
     Report, W3C, 2001. W3C Proposed Recommendation.
     http://www.w3.org/TR/xmlschema-0/.
[10] FIPA RDF Content Language Specification. 2000.
     http://www.fipa.org/specs/fipa00011/XC00011A.pdf.
[11] Gruber, T.R. Towards Principles for the Design of
     Ontologies Used for Knowledge Sharing. In Nicola Guarino
     & Roberto Poli (eds.): Formal Ontology in Conceptual
     Analysis and Knowledge Representation. Padova, Italy:
     Kluwer Academic Publishers, 1993.
[12] Guarino, N. Some Ontological Principles for Designing
     Upper Level Lexical Resources. Proceedings of First
     International Conference on Language Resources and
     Evaluation, 527-534, Granada, Spain, 1998. ELRA -
     European Language Resources Association.
[13] Heflin, J., and Hendler, J. Semantic Interoperability on the
     Web. In Proceedings of Extreme Markup Languages 2000.
[14] Hendler, J. and McGuinness, D.L. The DARPA Agent
     Markup Language. In IEEE Intelligent Systems, Vol. 15,
     No. 6, November/December 2000, 67-73.
[15] Lassila, O., and Swick, R.R. Resource description
     framework (RDF) model and syntax specification. Technical
     report, W3C, 1999. W3C Recommendation.
     http://www.w3.org/TR/REC-rdf-syntax.
[16] Murphy, G.L., and Lassaline, M.E. Hierarchical Structure in
     Concepts and the Basic Level of Categorization. In Koen
     Lamberts and David Shanks (eds.): Knowledge, Concepts,
     and Categories, 93-131. Hove: Psychology Press, 1997.
[17] Saariluoma, P. Foundational analysis: Presuppositions in
     experimental psychology. London: Routledge, 1997.
[18] Saeed, J.I. Semantics. Oxford: Blackwell Publishers Ltd,
     1997.
[19] Staab, S., Erdmann, M., Maedche, A., and Decker, S. An
     Extensible Approach for Modeling Ontologies in RDF(S). In
     First Workshop on the Semantic Web at the Fourth
     European Conference on Digital Libraries, Lisbon, Portugal,
     2000.
[20] Thompson, H.S., Beech, D., Maloney, M., and Mendelsohn,
     M. XML Schema Part 1: Structure. Technical Report, W3C,
     2001. W3C Proposed Recommendation.
     http://www.w3.org/TR/xmlschema-1/.
[21] Toivonen, S. Definition and usage of a software agent. In
     Arpakannus (2/2000), 9-13, 2000.