=Paper= {{Paper |id=Vol-1080/owled2013_6 |storemode=property |title=Metamodeling-Based Coherence Checking of OWL Vocabulary Background Models |pdfUrl=https://ceur-ws.org/Vol-1080/owled2013_6.pdf |volume=Vol-1080 |dblpUrl=https://dblp.org/rec/conf/owled/SvatekHKV13 }} ==Metamodeling-Based Coherence Checking of OWL Vocabulary Background Models== https://ceur-ws.org/Vol-1080/owled2013_6.pdf
          Metamodeling-Based Coherence Checking
          of OWL Vocabulary Background Models

        Vojtěch Svátek1 , Martin Homola2 , Ján Kl’uka2 , and Miroslav Vacura1
                           1
                             University of Economics, Prague
                    W. Churchill Sq.4, 130 67 Prague 3, Czech Republic
                              {svatek,vacuram}@vse.cz
                          2
                             Comenius University in Bratislava
                       Mlynská dolina, 842 48 Bratislava, Slovakia
                          {homola,kluka}@fmph.uniba.sk




      Abstract. The surface (or, foreground) structure of linked data and their associ-
      ated OWL vocabularies can be complemented by background models expressing
      valid ontological distinctions that may have become obscured by the modeling
      style chosen by the vocabulary designer. Background models can generally serve
      for debugging, visualization, matching, or even pattern-based design of opera-
      tional ontologies such as linked data vocabularies. An example of a well-known
      background model language, primarily suited for taxonomic ontologies, is the
      system of OntoClean meta-properties. We present an alternative type of back-
      ground model language, dubbed PURO, which is oriented towards linked data on-
      tologies, and relies on particular–universal and relationship–object dichotomies.
      Typical ‘foreground’ manifestations of background language terms are then dis-
      cussed. We demonstrate how a PURO background structure of OWL vocabularies
      can be itself modeled in OWL using a dedicated PURO ontology, and how rea-
      soning over such a metamodel can verify ontological coherence of the original
      OWL vocabularies.

      Keywords: Background model, ontology coherence, OWL.



1   Introduction

The Linked Data (LD) initiative represents one of the first truly functional and pro-
liferating incarnations of the generic concept of the Semantic Web, as a network of
machine-processable data/knowledge. LD vocabularies typically come in the form of
lightweight ontologies expressed in OWL or some of its sub-languages, and their de-
velopment has rarely been guided by rigorous ontology design methodologies; it was
more often a result of a consensus of a group of domain practitioners, perhaps with a
few academicians involved.
     Although we agree that such spontaneous development has its positive side, as it
subscribes to the open nature of the Web environment, the resulting vocabularies may
not always be well-aligned with the exact ontological nature of things that are being
modeled. As an example consider the Music Ontology (MO) [8] which features a prop-
erty mo:primary_instrument3 and introduces individuals such as mo-mit:Cello and mo-
mit:Violin which are expected as its values. Such individuals, however are not respective
to particular physical instruments, but instead refer to instrument types. They are, in
fact, a form of classes modeled by ontology individuals. This does not cause any ap-
parent problems in the MO vocabulary by itself. However, when the vocabulary is then
linked to multitude of data sets by a number of users, it may well happen that one of the
datasets records that, e.g., a particular musician “Yo-Yo Ma” uses the “1712 Davydov
Stradivari” as his primary instrument. An incoherence thus arises between the intended
usage of the mo:primary_instrument property and its actual usage in the data set. Such
an incoherence may later cause problems especially when several data sets are com-
bined, or a data set is transformed from one vocabulary to another.
     We hence feel that some aspects of solid ontological modeling deserve to be ‘in-
jected’ back into linked data vocabularies, as they could have positive effect on subse-
quent handling of such vocabularies as well as underlying data. However, the structure
of popular vocabularies is already followed by masses of data. There might even be
good reasons for this structure being as it is with respect to efficient handling of this
data in data management tools. Hence, direct refactoring of these vocabularies is usu-
ally unfeasible. We thus choose to rely on the annotation paradigm: assigning labels to
the entities from linked data vocabularies that indicate useful ontological distinctions.
     A well-known approach in this respect is OntoClean [3], which allows to assign a
set of predefined meta-properties (e.g., rigidity, anti-rigidity, identity, and unity) to on-
tological classes. The meta-properties are then used in taxonomy refactoring, according
to certain integrity constraints (e.g., a rigid class cannot be subclass of an anti-rigid
class). In the rest of this paper, we denote the syntactical, visible model of a domain (an
OWL vocabulary) as an ontological foreground model (OFM), while the structure of on-
tological distinctions behind that model (such as the associated structure of OntoClean
meta-properties) as the corresponding ontological background model (OBM).
     While OntoClean can be seen as a suitable OBM language for large taxonomy-
oriented ontologies (e.g, such as those found in medical domains), we feel that a differ-
ent approach is needed in case of LD vocabularies. These vocabularies are more fact-
oriented (their taxonomies are often simplistic) and OntoClean labels such as rigidity
or anti-rigidity are not easy to understand by their creators who are often domain prac-
titioners and not ontology experts. In this paper we propose a novel OBM language,
dubbed PURO, whose focus is especially on the distinction between particulars vs. uni-
versals and relationships vs. objects (hence its name). PURO introduces a terminology
of labels such as object, relationship, and type, which should be intuitively understand-
able for users already acquainted with OWL.
     We anticipate several kinds of benefits gained from associating appropriate PURO
OBMs to linked data vocabularies (as OFMs): detection of conceptual incoherence
within individual vocabularies; more principled design of new vocabularies; faster adop-
tion of existing vocabularies thanks to insight into their inherent structure (otherwise
obscured by syntactical details of the OFMs); and prevention of mismatches when us-
ing multiple vocabularies in combination.
 3
     mo=http://purl.org/ontology/mo/
     mo-mit=http://purl.org/ontology/mo/mit#
                                 Object Relationship Valuation
                      Particular B-object B-relationship B-valuation
                                 (individual) (3 options)   (data prop. assert.)
                      Universal B-type       B-relation     B-attribute
                                 (class)     (3 options)    (data property)

      Fig. 1. Basic terms of the PURO OBML (and their corresponding foreground notions)


    The contribution of the paper is as follows: Section 2 introduces the PURO OBM
language. Section 3 concentrates on checking coherence of OFMs w.r.t. a PURO OBM.
PURO coherence constraints are defined here and the PURO model is formalized as an
OWL ontology (a meta-ontology, in fact). The ontology can be associated with an OFM
through annotation of OFM entities with a distinguished set of labels. The coherence
constraints are formalized as OWL axioms. A metamodeling-based coherence checking
procedure is then shown including a short example. Assumptions of the approach and its
overall feasibility are discussed in Sect. 4. A short related work survey and concluding
remarks follow in Sects. 5 and 6.


2     The Language of PURO Ontological Background Models

A PURO ontological background model makes two basic distinctions: the one between
ontological particulars and universals, and the one between objects and relationships.
Particular and universals are distinguished by the possibility of instantiation: A partic-
ular cannot have instances, while a universal possibly can. As for the R-O distinction,
objects are singular entities with their own identity, while a relationship cannot be con-
sidered (and talked about) without also considering the entities on which it depends.
LD vocabularies prominently feature assignments of (quantitative) data values to indi-
viduals. Such assignments are not proper relationships, and thus we distinguish them as
a third option within the R-O distinction: valuation.
    The P-U and R-O(-V) distinctions are orthogonal, thus creating a two-dimensional
space of the size 2 × 3. The language of PURO ontology background models (PURO
OBML) contains six basic terms (primitives) in each point of this space (cf. Fig. 1).
Each of the PURO terms can be associated with an OFM entity in order to clarify its
ontological background.


2.1   Basic Terms of the PURO OBML

We now describe PURO terms in detail. To avoid confusion with OFM terms, all OBM
terms are prefixed with B- and we intentionally use different words (object, type, rela-
tion, etc.) than OWL (which uses individual, class, property, etc., respectively). We start
with the P-U distinction applied to objects:
    B-object refers to a particular object, typically a real-world one, which can be tan-
gible (such as people, animals, things, etc.) or intangible (such as topics, events, pro-
cesses, abstract collections of information, etc.). It is analogous to the notion of indi-
vidual in the foreground model.
    mo:Instrument                  mo:MusicArtist         mo:Performance         mo:Release      mo:ReleaseType

       rdf:type      rdf:type         rdf:type                    mo:Signal         rdf:type
                                                                                           mo:release_ rdf:type
                                                                                              type
    mo-mit:Violins                  ex:YoYo_Ma rdf:type                       ex:CBS_D3_37867       mo:album
                                                             rdf:type
    skos:narrower                 mo:performer                 mo:published_as  frbr:exemplar_of
                                                 mo:record-                            rdf:type
                        mo:primary_                 ed_as
    mo-mit:Cello         instrument    ex:pSCS83           ex:sSCS83      ex:myLP0047          mo:MusicalItem
                                     mo:performance_of
                        rdf:type                                            mo:composer rdf:type
                                                         ex:composition_
   mo:MusicalWork                   ex:SixCelloSuites                           ex:JSBach        mo:Composer
                                                               of_SCS
                                        mo:produced_work
                                                       event:time event:place
                                                    ex:int1717–1723     ex:Köthen


                                   Fig. 2. Music Ontology data about myLP0047


    B-type refers to a universal, a set of background entities instantiating the same con-
cept or sharing a common property. For simplicity, it also covers the notion of quality
(e.g., red color), which is ontologically slightly different but plays the same role in the
model structure. B-types generalize the foreground notion of classes. A B-type is ho-
mogeneous in the sense that its instances are PURO model entities of the same kind.
Homogeneity leads to stratification of B-types: instances of 1st -order B-types are B-
objects, instances of nth -order B-types are (n − 1)th -order B-types for n > 1. Thus,
unlike most description logics, the PURO admits also higher-order entities.
Example 1. Our music collection includes an LP from the CBS 1983 release of the
same-year recording of YoYo Ma’s performance of J. S. Bach’s ‘Six Cello Suites’. A
possible Music Ontology foreground model of this situation is depicted in Fig. 2.
The foreground individual ex:myLP0047 is an instance of the class mo:MusicalItem. In
the PURO background model of MO, ex:myLP0047 is a B-object, and mo:MusicalItem
is a 1st -order B-type. The individual ex:CBS_D3_37867 is an instance of mo:Release.
As such, it describes common properties of all physical LPs, mo:MusicalItems, from the
release. Furthermore, this particular release is assigned a mo:ReleaseType of mo:album,
which is a foreground individual. Thus, mo:MusicalItem and ex:CBS _ D3 _ 37867 are
both 1st -order B-types with different foreground representations; similarly, mo:Release
and mo:album are 2nd -order B-types; and mo:ReleaseType is a 3rd -order B-type.
    Let us now apply the P-U distinction to relationships:
    B-relationship refers to a particular relationship between two or more background
entities (e.g., an object is produced by some producer; an object is of a certain type, one
type of goods is a special case of another; or a vendor exclusively supplies a customer
with some type of goods). B-relationship generalizes several foreground notions: object
property assertion, instantiation, and inter-class axioms. B-relationships can also be n-
ary for n > 2. We further discuss various kinds of B-relationships below in Section 2.2.
    B-relation refers to a conceptual relation, the universal counterpart of B-relation-
ship. It is analogous to the foreground notion of object property without limiting the
arity to two, and the domain and range(s) to B-objects. Set-theoretically, a B-relation is
a subset of a Cartesian product of two or more sets of entities. Each of these sets must
be homogeneous just like a B-type.
    Finally, applying the P-U distinction to valuations yields two more terms:
    B-valuation refers to a particular assignment of a quantitative data value to an entity
(most often, but not exclusively, to a B-object). B-valuation is similar to B-relationship,
but it has a data value on the right hand side (e.g., the duration of a recorded signal is
7806 sec). B-valuation is analogous to the foreground notion of data property assertion.
    B-attribute refers to a universal consisting of valuations of the same quantitative
property. The analogous notion in the foreground model is data property.
    Data values in a PURO background model are analogous to data values in fore-
ground models. However, we only consider quantitative data values in the background
model. Qualitative foreground data values may usually be reduced to B-types as hinted
above, e.g., a B-type of B-objects having red color.

2.2     Relationships in PURO OBMs
B-relationships are the least uniform kind of entities in PURO background models, and
require further discussion. We distinguish the following kinds of B-relationships:
    B-instantiation is a relationship between an entity and a B-type that the entity
instantiates. A B-instantiation intuitively means that the entity belongs to the given
type, and it is a background analogue of an rdf:type statement. Unlike in foreground
models either a B-object, a B-type, or sometimes even a B-relationship can appear as
the entity on the left-hand side. In the latter case, the right-hand-side entity is a higher-
order B-type.
    A canonical foreground manifestation of B-instantiation is an rdf:type statement.
When a B-type is represented by a foreground individual (e.g., ex:CBS _ D3 _ 37867
and mo:album in Fig. 2), B-instantiation into this B-type is manifested as an assertion
of an object property (e.g., frbr:exemplar _ of and mo:release _ type). We remark that
while in the former case the OFM and the OBM are analogous, in the latter case a
clear dichotomy between the two models is apparent. Finally, some LD vocabularies
allow, perhaps for increased flexibility, to assign categories to various objects via data
properties with string or literal ranges (e.g., gr:category4 in the GoodRelations (GR)
vocabulary [4]). From the point of view of the PURO model, categories are B-types,
and so, in this case, B-instantiation is manifested as a data property assertion.
    B-axiom is a B-relationship between two B-types, expressing a set-theoretic re-
lationship between their extensions (such as subsumption or disjointness). A B-axiom
is canonically manifested in the foreground model using the corresponding foreground
axiom, e.g., class subsumption expressed by an rdfs:subclassOf statement between two
classes, each corresponding to a B-type. If B-types are represented as foreground indi-
viduals, their mutual relationship can be expressed using object properties. An example
is found in the MO-recommended SKOS5 taxonomy of instruments. Its fragment is
shown in the left part of Fig. 2. The terms of this taxonomy represent B-types of musi-
cal instruments. Hence, in this case,6 the skos:narrower property is a manifestation of a
subsumption B-axiom.
 4
   gr=http://purl.org/goodrelations/v1#
 5
    skos=http://www.w3.org/2008/05/skos#
 6
    Note that skos:narrower and skos:broader properties do not necessarily correspond to class
     subsumption since their meaning, as defined in the SKOS vocabulary, is slightly broader. How-
     B-fact is a B-relationship that cannot be classified as either a B-instantiation or a
B-axiom. One participant in a B-fact is typically a B-object. If the other is a B-type, the
B-fact is called heterogeneous. B-facts are canonically manifested in the foreground
model as object property assertions. In Fig. 2, ex:pSCS83 mo:performer ex:YoYo _ Ma
represents a homogeneous B-fact, while assertions ex:YoYo_Ma mo:primary_instrument
mo-mit:Cello, and ex:sSCS83 mo:published_as ex:CBS_D3_37867 represent heteroge-
neous B-facts. In some cases, B-objects are represented as data values (e.g., regions
can be encoded as strings ”DE”, ”US-CA”, etc.). A B-fact involving such an object is
manifested as an assertion of a data property (e.g., gr:eligibleRegions).
     The most complex manifestation of a B-fact is the reified relationship pattern. Here,
a foreground individual reifies the B-fact, and it is connected to the participants in the
B-fact by additional (object or data) property assertions. Reification is the only way of
expressing n-ary relationships in LD vocabularies, since OWL does not support native
n-ary constructs [12,6]. Figure 2 depicts an mo:Composition individual ex:composition_
of _ SCS, which can be seen as a reified B-fact among a composer, his musical work,
place, and time of composition. It should be noted that the boundary between (complex)
relationships and (intangible) objects may not be sharp in some situations. For instance,
a composition relationship can be also viewed as a composition event. Indeed, mo:
Composition is a subclass of event:Event.7 We discuss this issue in Section 4 below.
     Due to the focus of this paper and space constraints, we have only described the
various kinds of B-relationships briefly. A detailed discussion with examples from LD
vocabularies is available in [10].


3     PURO Model Coherence Checking

Clearly, the combination of OBM distinctions underlying a particular OFM cannot be
arbitrary. For example, as mentioned in the introduction, if a class is labelled as rigid
according to OntoClean, it cannot be subclass of a class labelled as anti-rigid. Violating
such assumptions can be viewed as OBM-level incoherence. Note that the notion of
OBM incoherence is only very loosely associated with that of logical inconsistency in
principle. Yet, as we demonstrate later, the task of OBM coherence checking can be
transformed into DL consistency checking, through metamodeling.
    We will now define when a PURO OMB is coherent (Sect. 3.1), and formalize the
PURO language and OBM coherence constraints in an OWL DL ontology (Sect. 3.2).
That will enable us to define and demonstrate a method of automatic background model
coherence checking of OFMs annotated with PURO OBML constructs (Sects. 3.3, 3.4).


3.1   Background Model Coherence

A PURO OBM model is said to be coherent if it satisfies all of the following three
constraints:
   ever in some LD vocabularies (such as mo-mit) which contain classes reifed as objects they are
   used to indicate subsumption.
 7
   event=http://purl.org/NET/c4dm/event.owl#
Background model module              Foreground model module            Constraints module
Class hierarchy                      Class hierarchy                    Axioms
      B-entity                             F-entity                          F-instanceOf v B-instanceOf       (FBi)
        w B-particular                       w F-individual                  F-subclassOf v B-subtypeOf        (FBs)
          w B-object w · · ·                 w F-class                       B-type1 v ∀B-instanceOf .B-object (Hi1)
                                                                                                     −

          w B-relationship w · · ·           w F-obj-prop-assertion          B-type1 v ∀B-subtypeOf.B-type1 (Hs1)
          w B-valuation                      w F-obj-prop                    B-type1 v ∀B-subtypeOf− .B-type1 (Hs− 1)
        w B-universal                        w F-data-prop-assertion         B-type2 v ∀B-instanceOf− .B-type1 (Hi2)
          w B-type                           w F-data-prop                   B-type2 v ∀B-subtypeOf.B-type2 (Hs2)
              w B-type1              Properties                              B-type2 v ∀B-subtypeOf− .B-type2 (Hs− 2)
              w B-h-o-type                F-subclassOf, F-instanceOf,        ···
                w B-type2                 F-domain, F-range, . . .
                w ···
           w B-relation         Labels module
             w B-fact           Class
             w B-axiom w · · ·        Label ≡ {CO, COi, CT, CTO, . . .}
             w B-instantiation  Property
           w B-attribute              hasLabel (domain: F-entity, range: Label)
Properties                      Axioms
     B-subtypeOf, B-instanceOf,       ∃hasLabel.{CO} v F-class u B-type1
     B-domain, B-range, . . .         ∃hasLabel.{CT} v F-class u B-h-o-type
                                      ∃hasLabel.{CTO} v F-class u B-type2
                                      ···

                             Fig. 3. Fragments of the PURO ontology modules.


entity coherence: the sets of B-objects, B-valuations, B-relationships, nth -order B-
    types for each n ≥ 1, B-relations, and B-attributes are pairwise disjoint;
type homogeneity: each B-type is a homogeneous nth -order B-type, for some n ≥ 1,
    as defined in Sect. 2.1;
relation homogeneity: the domain (the range) of each B-relation is a homogeneous
    nth -order (mth -order) B-type for some n ≥ 1 (m ≥ 1).

3.2    PURO Ontology
The language of PURO background models can be expressed as an OWL DL ontol-
ogy. As this ontology talks about entities from different ontologies (i.e., the OFMs) it
is in fact a meta-ontology. Within it, we can formally postulate coherence constraints
introduced in the previous section. The main advantage of the PURO ontology lies
in automation of incoherence detection: The ontology can be populated with entities
resulting from metamodeling of some LD vocabulary and with annotations of those
entities with PURO constructs. We are then able to detect incoherences in the vocabu-
lary at the PURO background level using a regular ontology reasoner. The process of
coherence checking is described in Sect. 3.3.
    The ontology consists of four partially dependent modules: 1) background model,
2) foreground model, 3) labels, 4) constraints. Fragments of all modules are shown in
Fig. 3 (we rely on the description logic syntax in order to improve readability).
    The background model module defines the hierarchy of classes and properties rep-
resenting the terms of the PURO OBML (e.g., B-object, B-type, etc.). Similarly, the
foreground model module defines a much simpler hierarchy of concepts and properties
representing the language terms of foreground models (e.g., F-individual, F-class, etc.).
The labels module defines a set of annotation labels, represented as individuals grouped
in the Label class. The labels have acronyms consisting of an initial letter denoting the
syntactic entity that is annotated (C for class, I for individual, Pr for property range, Pd
for property domain) and of a sequence of letters determining the actual PURO concept.
    Frequently used labels for classes thus are CO (Class whose instances correspond
to B-Objects), CR (Class whose instances correspond to B-Relationships), CV (Class
whose instances correspond to B-Valuations), or CT (Class whose instances correspond
to B-Types), further refined to CTO (Class whose instances correspond to B-Types of
Objects). Analogously, property ranges can be labeled by PrO, PrR, PrV or PrT, and
individuals by IO, IR, IV or IT.8 The labels module also expresses the semantics of the
labels in terms of axioms on top of the languages of background and foreground models.
For instance, an entity annotated with the label CTO is a foreground class, and a B-type
of B-types of B-objects, i.e., a 2nd -order B-type.
    Finally the constraints module expresses conditions that hold in a background model.
For instance, axioms (Hi1, Hs1, Hs− 1, . . . ) assert that B-types are homogeneous. This
module also connects properties conjoint in the background and the foreground models
(FBi, FBs), such as instantiation or subsumption.


3.3     Vocabulary Meta-Modeling and Coherence Checking

We would like to be able to automatically check the background coherence of LD vo-
cabularies, once they have been annotated by the labels respective to the PURO OBM.
Welty [13] and Glimm et al. [2] rely on metamodeling and ontology reasoners for ver-
ification of OntoClean constraints. With use of the PURO ontology introduced in the
previous section we are able to take a similar approach.
     The inputs of coherence checking are an LD vocabulary and its annotation. This
input needs to be pre-processed before checking as follows:

 1. Obtain a deductive closure T of the checked vocabulary.
 2. Create a metamodel T 0 of the closure T within the foreground model module, as
    follows: Let T C , T P , T I be the sets of classes, properties, and individuals of T re-
    spectively. The metamodel contains individuals cC , pP , and i j (as ‘proxies’ of the re-
    spective vocabulary entities), together with assertions F-class(cC ), F-obj-prop(pP ),
    and F-individual(i j ), for each C ∈ T C , P ∈ T P , and j ∈ T I respectively.
    Each assertion C( j) in T is then metamodeled as F-instanceOf(i j , cC ) in T 0 , and
    each subsumption C v D as F-subclassOf(cC , cD ). Property domains, ranges, sub-
    property axioms, and property assertions can be metamodeled similarly.
 3. Assert annotation labels in T 0 explicitly using the labels module: If an entity E
    of T is annotated with a label L, then we have hasLabel(e, L) in T 0 where e is the
    individual metamodeling E.

    Coherence of the original vocabulary can now be checked by reasoning over the
ontology obtained as the union of the metamodel T 0 and the four modules of the PURO
ontology. The fragment of the PURO ontology shown in Fig. 3 enables detection of the
following incoherences:
 8
     In addition to labels corresponding to PURO concepts, there are also labels that indicate fore-
     ground data without ontological background relevant for a PURO model. We omit them here
     as they are not relevant for coherence checking; more information is in [10].
 – If a foreground entity E (typically an individual) metamodeled as e is directly or
   indirectly labeled as representing both an ontological particular and a universal,
   then we recognize this incoherence due to membership of e in the meta-class:

                          Incoherent-PU ≡ B-universal u B-particular .

 – If a foreground class C is labelled as having both B-objects (CO) and 1st -order B-
   types (CTO) as instances, it cannot represent a proper homogeneous B-type. This
   kind of incoherence can be detected due to axioms (Hi1) and (Hi2) implying

                           (∀B-instanceOf− .B-object u B-type1)(cC ) .

      In fact, this incoherence is a special case of a more general situation, in which
      a class (or another foreground entity) is labelled as having both particulars and
      universals as its instances. Such an entity (e.g., cC ) falls into the meta-class:

            Incoherent-T-PU ≡ B-type u ∀B-instanceOf− .B-particular u B-universal
                            ≡ B-type u ∀B-instanceOf− .Incoherent-PU .

      Note that axioms (Hs1, Hs− 1, Hs2, Hs− 2) propagate the level of B-types over
      subsumption. If a class has two subclasses with instances of different background
      kinds, it will be recognized as incoherent.
More kinds of incoherence (e.g., involving higher-order B-types, or ranges of prop-
erties) can be detected with a suitable combination of constraints and definitions of
meta-classes of incoherent entities.
    Our approach not only detects that a foreground vocabulary is incoherent, but also
allows to explain which kind of incoherence was detected, and which foreground model
entities caused it, since they are instances of the respective Incoherent- meta-class.

3.4    Coherence Checking Demonstration
Let us now demonstrate coherence checking on the GR vocabulary. The vocabulary
contains a class gr:ProductOrService with three subclasses: gr:Individual, gr:Produc-
tOrServiceModel, gr:SomeItems. The subclass gr:Individual is easily recognized as a
class of B-objects and thus annotated as CO. The subclass gr:ProductOrServiceModel
represents models, i.e., types of products, hence its annotation label is CTO.
    If we feed this fragment of the GR vocabulary and the annotation into the coher-
ence checking mechanism from Sect. 3.3, we obtain the following results: From the
annotation of gr:Individual and gr:ProductOrServiceModel, the labels module allows us
to derive B-type1(cgr:Individual ) and B-type2(cgr:ProductOrServiceModel ). Constraints module ax-
ioms (Hs1, Hs2) propagate the background meta-classification to the common super-
class, and so we obtain B-type1(cgr:ProductOrService ) and B-type2(cgr:ProductOrService ). Since
instances of a B-type1 are particulars (B-objects), and instances of a B-type2 are uni-
versals (1st -order B-types), the cgr:ProductOrService is an instance of Incoherent-T-PU.
    The version of PURO ontology and the GR fragment used in the demonstration
is available from http://patomat.vse.cz/puro_v1.owl and http://patomat.
vse.cz/gr_mm.owl, respectively.
4   Discussion

The whole approach has several critical assumptions: 1) given an OFM properly an-
notated with PURO labels, incoherence can be detected via reasoning; 2) PURO OBM
entities can be identified based on the structure and documentation of an OFM with
reasonable accuracy; 3) the structure of the OFM and the respective PURO OBM sig-
nificantly differs in a number of OWL vocabularies.
    In the previous section, we have demonstrated the validity of the first assumption,
though on a tiny example only. We however do not expect computational complexity
issues even when processing complete LD vocabularies since mostly their size is rather
small. As only the vocabulary needs to be annotated, the underlying datasets which are
often large do not impact the effectivity of coherence checking.
    To verify the remaining two assumptions we have surveyed a number of LD vo-
cabularies (in business, government, geography, and other domains) and conducted an
annotation experiment. We summarize the results as follows (for details see our tech-
nical report [10]): In 92 out of 94 syntactical classes in 3 popular vocabularies, two
annotators (both expert ontologists) found a clear consensus on the PURO OBM cat-
egory. Out of these 94 classes, only 59 were labeled as CO (i.e., matching the OFM
structure).
    We found that, in large majority, LD vocabularies anticipate facts about particu-
lars (e.g., real persons, organizations, items of goods, documents, etc.) which can be
reliably distinguished from the universals in the domain, whatever their syntactical
way of modeling is. However, this distinction may be blurred in biomedical and other
scientifically-biased ontologies, which often feature individuals acting as prototypes
(e.g., cells, organs, chemical processes, etc.). As PURO model is specifically designed
for LD vocabularies, the latter type of ontologies is not its primary application target.
    The boundary between relationships and objects may not be as sharp in some situa-
tions. Due to absence of n-ary relations, LD vocabularies often feature reifying individ-
uals, e.g., the instances of mo:Composition in MO (relation between musical work, its
composer, place, and time) or gr:Offering in GR (relation between a seller, offered items,
eligible regions, etc.). Such objects can be viewed as B-relationships but often also as
regular B-objects – the event of composition, an abstract information record. The dis-
tinction is often a modeling decision (even at the background level) and it should be
based on the criterion whether the (reifying) object would be meaningful even without
explicitly considering the other participants in the relationship. In these terms, a com-
position event is clearly incomplete without knowing what was composed, similarly a
business offering is incomplete without knowing the seller or the items offered. More
detailed analysis of ‘R vs. O’ nuances is ongoing work.


5   Related Research

A prominent example of a different OBML (and in fact the only one known to us) is cer-
tainly OntoClean [3], which differs from our approach in its focus on annotating classes
and repairing their taxonomic relationships. In contrast, PURO focuses on instantiations
and facts, and its purpose is not necessarily repair. OntoClean also relies on conceptual
notions (such as rigidity) that are fundamentally different from what LD practitioners
typically know from the foreground representation; this also holds for works that map
ontological roles into OWL [9]. PURO, on the other hand, introduces new primitives
which are more familiar (e.g., object, relationship, type, etc.).
    Checking OntoClean constraints and similarly PURO coherence involves reasoning
with higher orders. Though allowed to a certain extent in OWL Full, it is however not
feasible in practice [5]. Motik [5] and De Giacomo et al. [1] proposed alternate seman-
tics for higher-order description logics while maintaining decidability. Similarly Pan
and Horrocks [7] proposed a variant of RDFS semantics (dubbed RDFS(FA)), in which
types are divided into multiple ‘strata’, analogous to type orders in the PURO model.
    Welty [13] and Glimm et al. [2] use metamodeling inside OWL in order to automat-
ically verify OntoClean constraints. We have taken a similar approach in Sect. 3, where
we have metamodeled the foreground ontology and expressed constraints that are nec-
essary to check the PURO OBM model coherence, which can then be done using a
regular OWL reasoner. Our approach is closer to the one of Welty, as we require a pre-
processing step, in which the deductive closure of the foreground ontology is computed
before the meta-level reasoning is applied. We are currently investigating the possibility
of performing both levels of reasoning at the same time, similarly to Glimm et al.
    The ontological distinctions recognized by PURO, albeit being rather simplistic, are
also found in many foundational ontologies (e.g., DOLCE9 or BFO10 ). Foundational
ontologies are intended to be directly combined with ontologies at the same level of
reasoning. The PURO model (and the whole OBML paradigm) significantly differs in
that it also (largely) focuses on use cases in which we do not intend to restructure the
associated OFM models, but we simply want to make the ontological background in
them obvious so that their understanding and manipulation is facilitated. Therefore,
the coupling methods between OFMs and OBMs differ from the ones of foundational
ontologies (e.g., annotation and metamodeling, not direct ‘injection’ into the OFMs).


6     Conclusions and Future Work
PURO is a new OBM language aimed to aid in the process of development, debug-
ging, visualization, and matching of operational ontological models. PURO is specifi-
cally aligned with LD vocabularies, focusing especially on the particular–universal and
relationship–object distinctions, relying on constructs that should be rather familiar to
users acquainted with basic OWL terminology. A pairing of a vocabulary and a PURO
OBM is established using annotation with a predefined set of labels. The language also
features a set of constraints, allowing for coherence checking w.r.t. the PURO model
associated via annotation. With help of metamodeling an annotated LD vocabulary can
be formally paired with the PURO meta-ontology and the coherence check can then be
automated relying on available OWL reasoners in terms of a consistency check of the
resulting ontology.
    We see two main advantages of the outlined approach: It allows ontology developers
1) to verify the level of ontological relevance of their models, and 2) to annotate their
 9
     http://www.loa.istc.cnr.it/DOLCE.html
10
     http://www.ifomis.org/bfo
vocabularies with ontological distinctions inexpressible in OWL, thus communicating
intentions behind their modeling decisions. Although this paper has mainly focused on
the former of these advantages, also the latter is relevant. It is not always practical,
or even possible, to develop ontologically pure LD vocabularies, as there are justified
design decisions and language constraints leading to simplified models. It is thus use-
ful to keep track of the resulting ontologically ill-aligned model features (e.g., classes
modeled as individuals, instantiations modeled as object properties, etc.), since they
can gradually lead to incoherence, especially when a vocabulary is paired with various
different data sets, which may each follow a slightly different modeling approach. The
PURO models may turn into a useful tool to guide this process. In order to further aid
its applicability, we are currently developing an annotation tool integrated with Protegé.
This tool will allow for easy annotation of vocabularies, and also of data sets using a
vocabulary, and automated coherence checking (see [10]).

Acknowledgements. This work was supported from the EU ICT FP7 under no. 257943
(LOD2 project), from the Slovak VEGA project no. 1/1333/12, and from the Czecho-
Slovak bilateral project nos. 7AMB12SK020 (AIP) and SK-CZ-0208-11 (APVV).


References
 1. De Giacomo, G., Lenzerini, M., Rosati, R.: On higher-order description logics. In: Proc.
    DL 2009.
 2. Glimm, B., Rudolph, S., Völker, J.: Integrated metamodeling and diagnosis in OWL 2. In:
    Proc. ISWC 2010.
 3. Guarino, N., Welty, C.: An Overview of OntoClean. In: Staab, S., Studer, R., eds.: The Hand-
    book on Ontologies, pp. 151–172, Springer-Verlag, 2009.
 4. Hepp, M. GoodRelations Language Reference. Version 1.0, released Oct 1, 2011. Available
    onilne: http://purl.org/goodrelations/v1
 5. Motik, B.: On the properties of metamodeling in OWL. J. Log. Comput. 17(4):617–637,
    2007.
 6. Noy, N., Rector, A. (eds.): Defining N-ary Relations on the Semantic Web. W3C Working
    Group Note 12 April 2006, online at http://www.w3.org/TR/swbp-n-aryRelations/.
 7. Pan, J.Z., Horrocks, I.: RDFS(FA): Connecting RDF(S) and OWL DL. IEEE Trans. Knowl.
    Data Eng. 19(2):192–206, 2007.
 8. Raimond, Y., Giasson, F. (eds.): Music Ontology Specification. Released Nov 28, 2010.
    Available online: http://purl.org/ontology/mo/.
 9. Sunagawa E., Kozaki K., Kitamura Y., Mizoguchi R.: Role organization model in Hozo. In:
    Proc. EKAW 2006.
10. Svátek, V., Homola, M., Kl’uka, J., Vacura, M.: Ontological Distinctions for Linked Data
    Vocabularies. Technical Report TR-2013-039. Comenius University, Bratislava, 2013. Avail-
    able online: http://kedrigern.dcs.fmph.uniba.sk/reports/display.php?id=54
11. Svátek, V., Homola, M., Kl’uka, J., Vacura, M.: Mapping Structural Design Patterns in OWL
    to Ontological Background Models. In: Proc. K-CAP 2013.
12. W3C OWL Working Group (eds.): OWL 2 Web Ontology Language Document Overview.
    W3C Recommendation, 2009.
13. Welty, C.: OntOWLClean: Cleaning OWL ontologies with OWL. In: Proc. FOIS 2006.