<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Easy Keys for OWL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bijan Parsia</string-name>
          <email>bparsia@cs.man.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ulrike Sattler</string-name>
          <email>sattler@cs.man.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Schneider</string-name>
          <email>schneider@cs.man.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science, University of Manchester</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>One of the commonly requested features for OWL is some form of key support, generally phrased as allowing inverse-functional datatype properties. For a variety of technical reasons, these were not included in OWL DL (though they were available in OWL Full). OWL 2-a revision to OWL which is under development by the W3C-introduces a form of key representation, so-called “Easy Keys”, which avoids much of the implementational pain of general inverse-functional datatype properties, “dumbs down” nicely to weaker languages (like RDFS), and can be specified (or even implemented) in terms of DL-safe rules. The Easy Key approach also allows for a form of compound key. In this paper we explicate the design and rationale of Easy Keys, sketch a decision procedure for them, and compare them with alternative possibilities, including general inverse-functional datatype properties.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Identification constraints, keys or inverse-functional datatype properties, are all
ways of expressing, by means of a description, that two or more individuals
are necessarily identical. These constraints differ in their expressive power and
their “behavior”, e.g., whether their violation leads to an error, an
inconsistency, or the “quiet” identification of the individuals in question. Regardless of
these details, identification constraints are of vital importance to many
applications. Declaring, for example, the datatype property hasSSN as inverse-functional
would mean that any two individuals sharing the hasSSN same value are inferred
to be identical. For OWL, key reasoning in general can be unfeasibly difficult
[
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
        ] and, to the best of our knowledge, no existing OWL DL reasoner handles
inverse-functional datatype properties in a complete way. Consider the ontology
O from Figure 1 about records of persons in a certain country’s tax office.
1 ObjectProperty: hasParent
2 Class: Person hasKey hasSSN
      </p>
      <p>SubClassOf: hasParent some Person, hasSSN some int[&gt; 41, &lt; 43]
3 Individual: Mary</p>
      <p>Types: Person, inv(hasParent) only owl:Nothing
4 Individual: Sheevah
– Axiom 1 expresses a strict single-child policy of the country in question.
(This might seem a bit artificial, but for the explanations below we need a
property that naturally creates long chains and can be inverse-functional.)
– Axiom 2 states that the datatype property hasSSN (which stands for “social
security number”) is a key for the class Person, and that every person has a
parent and an SSN that is an integer between 41 and 43. In particular, each
instance of Person sits in an infinite hasParent chain, possibly looping.
– Axioms 3 and 4 say that Mary and Sheevah are instances of Person, and
Mary is childless—i.e., sits at the beginning of such an infinite chain.</p>
      <p>Suppose now that the key constraint in Axiom 2 is interpreted by stating
that hasSSN is inverse-functional if restricted to Person:
(R1) For all individuals x, y of type Person, if x and y agree on their
hasSSN value, then x = y. Under this reading, O has no models: Axiom 3
says that Mary is a Person. According to Axiom 2, she has a parent x who is
a Person. Axiom 2 then implies that both have SSN 42. Since SSN is a key for
Person, Mary and x coincide. Hence, Mary is her own parent, which contradicts
the last part of Axiom 3.</p>
      <p>An analogous contradiction would occur if we replaced the range of hasSSN
with “int[&gt; 0, &lt; 1, 000, 000]”. Then, in order to show unsatisfiability of O, we
would need to create 1, 000, 000 ancestors of Mary’s before we could reason that
two of them collapse and that this would induce a cycle in the hasParent chain
and therefore contradict Axiom 1.</p>
      <p>
        Hence, if interpreted as a general inverse-functional datatype property, a key
constraint can require very large, yet finite, interpretations and makes reasoning
computationally difficult. We believe that this expressivity is not required by
applications and is too difficult for users to make effective use of. Instead of
tackling the general problem of reasoning with keys in OWL, we propose a
restricted version of keys—EasyKeys—which meets important use cases and is
practical to implement. This is achieved by incorporating DL-safety [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] into (R1):
(R2) For all named individuals x, y of type Person, if x and y agree on
their named hasSSN-values, then x = y. Both occurrences of “named” in
this condition restrict the key constraint to explicit data: key reasoning is only
triggered for named individuals and values (i.e., those occurring in the ontology)
and not for generated ones (i.e., those generated by the reasoner). Under this
reading, O now has models, for instance the one consisting of an infinite chain
of instances of Person, headed by Mary, plus a separate individual Sheevah.
Each instance of Person has a hasSSN-successor of value 42, although this is not
explicit in O for any of the named individuals. Hence O is consistent, but does
not entail any equality between named individuals. However, there is another
model of O where Mary and Sheevah coincide.
      </p>
      <p>The missing entailment is already problematic, but—even worse—it can be
restored by adding a completely unrelated fact, see Section 4.1. Therefore, we need
to relax DL-safety and obtain:
(R3) For all named individuals x, y of type Person, if x and y agree
on their hasSSN-values, then x = y. The omission of the second occurrence
of “named” in this condition restricts the models of O to those where Mary and
Sheevah coincide. Hence, O is still consistent, but now entails Mary = Sheevah.
This reading seems the most natural for O, and we claim that the relaxation of
DL-safety does not lead to computational difficulties. We will base EasyKeys’
semantics on reading (R3) and discuss this choice in more detail in Section 4.
Since we are designing a feature of OWL 2, we want EasyKeys to fit in with
other Semantic Web languages such as the various OWL 2 Profiles1 and even
RDF(S) (perhaps extended with Datalog like rules). We started with the
following desiderata in mind when designing EasyKeys:
(D1) Be useful enough to be worth specifying.
(D2) Have a low impact on implementations, at least in common cases.
(D3) Be easy to specify clearly and standardize.</p>
      <p>
        In our experience (and in the experience of users we interviewed), applications
merely require that key constraints are checked on explicit data, see (R2) and
(R3). This alone is a big and effective reduction in expressivity since it allows us
to use a variant of DL-safe SWRL rules [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to specify (and even implement) key
semantics. The DL-safety condition (roughly, that variables in rule atoms range
only over individuals named in the ontology) is sufficient to render SWRL rules
decidable and hopefully feasible in practice. Furthermore, DL-safe rules degrade
gracefully as we step down the overall expressiveness of the language, making
EasyKeys intuitively feasible for RDF(S) systems with Datalog rule support.
These facts strongly suggest that key constraints with semantics based on (R2) or
(R3) squarely hit the application, implementation, and specification desiderata.
      </p>
      <p>In this paper, we will discuss the intuitions for EasyKeys and their desired
properties. We will define a suitable notion of key constraints and explain how to
extend standard tableau algorithms to incorporate them. Finally, we will discuss
design alternatives for OWL based keys.</p>
      <p>
        Related Work. In the literature, key constraints and, more general,
functional dependencies have been added to description logics of different
expressivity, largely being interpreted according to reading (R1). This was unproblematic
in the absence of datatypes, where complexity of reasoning was hardly affected
[
        <xref ref-type="bibr" rid="ref10 ref5 ref6 ref7 ref8 ref9">5–10</xref>
        ]. But it led to undecidability in the presence of datatypes even for the
description logic ALC [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],2 which is far less expressive than the designated OWL-2
DL SROIQ [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The latter result suggests to carefully balance the amount of
expressivity added to OWL by key constraints. We are confident that EasyKeys
meet this requirement.
      </p>
      <sec id="sec-1-1">
        <title>1 http://www.w3.org/TR/owl2-profiles/</title>
        <p>2 In fact, this result was proven for ALC extended by a very simple datatype D that
has an efficiently decidable satisfiability problem.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>EasyKey Intuitions</title>
      <p>Keys in general have (or may have) the following properties.
(P1) If two individuals x and y have the same key values, then x = y.
(P2) Integrity constraint: missing key values raise an error. Individuals which
may have a key must have a key. This is easily seen in the case of relational
database rows.
(P3) Functionality constraint: entities have only one key. This can be
interpreted weakly (in any model, a given entity has at most one key) or strictly
(each entity has a known key, the same one in every model).</p>
      <p>
        Property (P1) is essential—it states the definition of a key. Property (P2) is
not expressible directly in first-order logic (being non-monotonic), thus in
neither OWL nor OWL plus DL-safe rules. There are proposals for adding
nonmonotonic features to a language like OWL (e.g., [
        <xref ref-type="bibr" rid="ref12 ref13 ref14">12–14</xref>
        ]) and even one that
is an extension of DL-safe rules [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. However, none of these OWL-tailored
approaches have gotten significant traction with implementors or users. This alone
makes adopting a solution to (P2) as part of EasyKeys a violation of
Desideratum (D3). Not only would we have to build momentum for one of the alternative
mechanisms, but we would have to consider the effect of those mechanisms on a
wide range of problems beyond missing keys. So we forgo (P2) for the moment.
      </p>
      <p>Property (P3) in its weak sense (in any model, a given entity has at most one
key) can be expressed in OWL as a maxCardinality 1 statement, and this also
works in the known-individual/value case. However, the strict sense (each entity
has a known key, the same one in every model) is not expressible in first-order
logic. Of course, while we may impose functionality on keys (at least of a certain
type), we do not need to build that into EasyKeys. For example, it is easy to
imagine that certain classes of entities may have multiple keys, even of the same
sort: email addresses may be keys for people in many datasets and people can
have multiple email addresses. We can even more easily imagine that entities
may have keys of diverse sorts, e.g., a book may have an ISBN and a Library of
Congress catalog number.</p>
      <p>Property (P1) can be expressed as a DL-safe rule according to (R2) if we
assume the necessary DL-safety restrictions. For example, DL-safe Rule (1)
expresses that hasSSN is a key property.</p>
      <p>hasSSN(x, k), hasSSN(y, k) →
hasSSN(x, k), hasSSN(y, k), Person(x), Person(y) →
hasSSN(x, s), age(x, a), hasSSN(y, s), age(y, a) →
x = y
x = y
x = y
(1)
(2)
(3)
Key properties can also be restricted to certain classes, e.g., DL-safe Rule (2)
expresses that hasSSN is a key property for the class Person. We can also capture
compound keys, where several properties together act as a key. For example, we
can express that having the same social security number and being the same
age (we might reuse the SSN every year/generation) implies being the same
individual via DL-safe Rule (3).</p>
      <p>Thus, in principle, if OWL 2 had a suitable rule language, then users could
directly encode their key constraints as rules. Between the OWLED taskforce on
DL-safe SWRL rules3 and the Rule Interchange Format working group4 of the
W3C (not to mention the increasing number of DL-safe rule implementations),
it is pretty clear that some sort of more or less standard rule extension to OWL 2
will emerge in the next year. However, even if we had in hand a standard, widely
deployed rule language, requiring users to use rules to express key constraints is
a suboptimal choice for several reasons:
1. Such rules are not intention-revealing: they do not clearly signal that their
purpose is to express a key constraint. Having some sort of key-specific
constructor would make the intention of the expression clear to users. It would
make a similar difference if, say, the source code of a computer program
contained case distinctions expressed by switch statements as opposed to
nested if statements. It is equally common in natural, programming and
formal language to use defined constructs rather than their full definition. If
this were not the case, how could we search an ontology (or program) for
occurrences of transitivity statements or key constraints (or case distinctions)?
2. Using rules for key constraints requires a fair bit of sophistication and care in
their formulation. One must keep track of the variables, their relations, and
which sorts of atoms go where. Compare this with the transitivity operator:
one could express transitive properties as a (general) SWRL rule, but at the
price of having to coin three variables and understand how to set up the
rule. Though the latter may seem trivial, it is not: writing out transitivity
statements as rules is error-prone and obfuscating.
3. It may be possible, as in the transitivity case, to provide special support in
reasoners for keys. This would be, however, more difficult to implement if
first one needed to analyze the rule base for “key like” rules.</p>
      <p>These considerations led us to conclude that it was worth introducing special
syntax for keys with the restricted, DL-safe rule-esque semantics into OWL 2.
3</p>
    </sec>
    <sec id="sec-3">
      <title>EasyKey Details</title>
      <p>In this section, we will define key constraints that meet the above intuitions,
and explain how to obtain a tableau algorithm that incorporates all features of
OWL and key constraints. This will be achieved by combining existing tableau
algorithms for SROIQ—the designated OWL 2 description logic—with existing
tableau algorithms for DLs with datatypes and key constraints.</p>
      <p>
        We aim at making this theoretical part as accessible to users and
implementors as possible without being imprecise. Therefore we will only introduce those
details that are absolutely necessary, and refer the reader to the literature for
the missing parts. Readers who are primarily interested in modeling issues can
safely skip to Section 4. For a formal introduction into OWL, SROIQ, and DLs
with datatypes see [
        <xref ref-type="bibr" rid="ref1 ref16 ref17">16, 17, 1</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>3 http://wiki.webont.org/page/DL Safe SWRL Rules 4 http://www.w3.org/2005/rules/</title>
        <p>3.1</p>
        <sec id="sec-3-1-1">
          <title>Datatype Basics</title>
          <p>
            Datatypes in OWL 2 are restrictions of concrete domains [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]. A datatype D is
a pair (ΔD, ΦD), where ΔD is a set and ΦD is a set of predicate names. Each
predicate name P ∈ ΦD is associated with an arity n and an n-ary predicate
P D ⊆ Δn . For example, if we drop the range restriction from Axiom 2 of O in
Figure 1,Dwe can use D = Z (the integers) and ΦD = {6, &gt;, &lt;, &gt;, 60, . . . }.
          </p>
          <p>
            Decision procedures for description logics with datatypes usually invoke
datatype reasoners. These decide whether a conjunction c of predicates from D over
some set of variables is satisfiable. To ensure the existence of an algorithm for
this satisfiability check, D has to be admissible [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ]. Essentially, this property
requires the set ΦD of predicates of D to contain a unary predicate &gt;D, to be closed
under negation, and it requires satisfiability of D-conjunctions to be decidable.
          </p>
          <p>
            In the presence of key constraints, standard decision procedures use a
variation of this datatype reasoner: in case c is satisfiable, the reasoner chooses a
solution s of c and outputs the set of variables that take on the same values
in s. To ensure the existence of such an algorithm, D has to be key-admissible
[
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]. This means that D is admissible and there exists an algorithm that decides
satisfiability of D-conjunctions c and, in the positive case, nondeterministically
outputs an equivalence relation ∼ on the set of variables V used in c such that
there exists a solution s for c with the following property: for all v, v0 ∈ V ,
s(v) = s(v0) if and only if v ∼ v0. From now on, we will always assume
keyadmissibility of D, which is not too strong a restriction because it is already
satisfied if D is admissible and has an equality predicate.
3.2
          </p>
        </sec>
        <sec id="sec-3-1-2">
          <title>EasyKey Syntax and Semantics</title>
          <p>A key constraint is a statement of the form</p>
          <p>
            C hasKey (p1, . . . , pn, d1, . . . , dm),
(4)
^
i=1,...,n
where p1, . . . , pn are simple object properties, d1, . . . , dm are datatype properties,
and C is a class description. The requirement of pi being simple is necessary in
order to ensure decidability [
            <xref ref-type="bibr" rid="ref17 ref2">17, 2</xref>
            ]. The semantics of (4) in an ontology O can
be defined in two ways, using the following abbreviation.
α = C(x) ∧ C(y) ∧
pi(x, zi) ∧ pi(y, zi) ∧
di(x, vi) ∧ di(y, vi)
1. An interpretation I is a g-model of O and (4) if I is a model of O and
satisfies (5), which corresponds to reading (R1) from Section 1.
          </p>
          <p>∀xy ∃z1 . . . znv1 . . . vm . α → x = y
(5)
2. An interpretation I is an e-model of O and (4) if I is a model of O and
satisfies the following property, where HU is a predicate holding of all elements
of the Herbrand universe, i.e., HU holds of all individuals named in O.
∀xyh∃z1 . . . znv1 . . . vm α ∧ HU(x) ∧ HU(y) ∧
h</p>
          <p>HU(zi)i</p>
          <p>→ x = yi (6)
^
i=1,...,m</p>
          <p>^
Had we included conjuncts HU(vi) for each datatype variable vi, this
statement could have been captured by a DL-safe rule and would correspond to
reading (R2). Without these additional conjuncts, (6) corresponds to (R3).</p>
          <p>We will discuss in Section 4 why we forgo the “missing” portion of DL-safety.
We call key constraints with semantics (5) general key constraints and those with
semantics (6) EasyKey constraints. The former are a generalization of
inversefunctional datatype properties (IFDTPs), whose problematic behavior w.r.t.
reasoning they inherit, see Section 4.3. EasyKey constraints, on the contrary, are
orthogonal to IFDTPs, and we will see next why the can be viewed as being
computationally harmless.</p>
          <p>In the following, we distinguish between abstract individuals/variables
(instances of classes) and concrete ones (representing datatype values).
3.3</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>EasyKey Reasoning</title>
          <p>
            It suffices to restrict our attention to class satisfiability because all other standard
DL reasoning problems can be reduced to this problem [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ]. We will show how
to extend the tableau algorithm for SROIQ from [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ] to incorporate datatypes
and EasyKey constraints. This will be achieved by picking the necessary parts
from existing tableau algorithms for SHOQK(D) [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] and SHOQ(Dn) [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ]. Since
our extension of the SROIQ tableau algorithm will not affect its complexity,
reasoning with EasyKey constraints is not harder than without.
          </p>
          <p>
            We will not fully develop the algorithm here because the necessary technical
details would repeat large parts of the algorithms in [
            <xref ref-type="bibr" rid="ref1 ref17">1, 17</xref>
            ] and obstruct the view
on the essential parts. However, this section specifies all components needed to
implement an extension of an OWL 2 reasoner that treats EasyKey constrains.
          </p>
          <p>
            We expect the reader to be familiar with tableau algorithms as in [
            <xref ref-type="bibr" rid="ref17 ref19 ref21">21, 19,
17</xref>
            ] and start from the SROIQ tableau algorithm from [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ], including its rules,
its blocking condition, and its pruning and merging technique. Recall that the
NN -rule from this algorithm creates new individuals, but EasyKey constraints
only affect abstract nodes5 labelled with named individuals.
          </p>
          <p>
            Before we show how to reason in the presence of EasyKey constraints, let us
recall reasoning with non-inverse-functional datatype properties from [
            <xref ref-type="bibr" rid="ref20 ref22">22, 20</xref>
            ].
This is done with the following extension of standard tableau algorithms. To
each abstract node x, we associate a set DC(x) containing datatype constraints
of the form (x, P D(t1, . . . , tn)), where the ti are concrete successor nodes of x,
and P is an n-ary datatype predicate. The constraints to go into the DC(x) are
obtained via tableau expansion rules that have been given in [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ] as the datatype
counterparts of the usual ∀, ∃, &gt;, 6, and choose rules from standard tableau
algorithms [
            <xref ref-type="bibr" rid="ref22">22</xref>
            ]. If number restrictions with datatype properties are not required,
the ∀ and ∃ rules suffice. They are given in Figure 2. The instructions “update
∼” in these rules are only necessary below, in the presence of key constraints.
The &gt;, 6, and choose rules are way more technical, see [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ].
5 Abstract/concrete nodes in a tableau are nodes associated with abstract/concrete
individuals.
          </p>
          <p>∃-rule: if ∃g1, . . . , gn.P ∈ L(x), x is not blocked, and</p>
          <p>there are no gi-successors xi with P (x1, . . . , xn) ∈ DC(x)
then add a gi-successor of x for each 1 6 i 6 n,
for yi the gi-successor of x, add P (y1, . . . , yn) to DC(x), and
update ∼
∀-rule: if ∀g1, . . . , gn.P ∈ L(x), x is not blocked, and</p>
          <p>there are gi-successors yi of x with P (y1, . . . , yn) 6∈ DC(x)
then add P (y1, . . . , yn) to DC(x), and
update ∼</p>
          <p>
            For instance, if the abstract node x contains the label ∃height, weight.&gt;
expressing that the weight of the individual associated with node x is greater
than its height, then the datatype ∃-rule will create weight- and
height-successors y and z and add the constraint y &gt; z to DC(x). When detecting clashes,
DC(x) is tested for satisfiability by a datatype reasoner (DTR), whose negative
answer is a clash for x. (This requires D to be admissible.) It suffices to invoke
the DTR once after all completion rules have been exhaustively applied [
            <xref ref-type="bibr" rid="ref1 ref18">18, 1</xref>
            ].
          </p>
          <p>
            EasyKey constraints, as opposed to inverse-functional datatype properties,
do not impair tableau termination because they only affect named individuals.
The general extension of standard tableau algorithms to key constraints can be
taken from [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] and is as follows.
          </p>
          <p>As above, datatype constraints are collected and checked for satisfiability. But
since key constraints may cause abstract nodes labelled by named individuals
to merge, a separate construction and treatment of each DC(x) does not suffice.
Instead, for all abstract nodes x labelled with named individuals, all constraints
from labels in L(x) have to be put into a global set DC of constraints.</p>
          <p>The DTR that runs on DC has to answer more than “yes” if DC is satisfiable.
In addition, it has to nondeterministically output conditions under which DC
is satisfied, namely an equivalence relation ∼ on the set of variables occurring
in DC such that DC∼ ∪ {y 6= z | y 6∼ z} is satisfiable. Here DC∼ denotes DC
where each occurrence of a variable y is replaced by some fixed representative
of y’s equivalence class. The existence of such a DTR is ensured by the
keyadmissibility of D.</p>
          <p>
            After a positive answer of the DTR, abstract nodes are merged according
to ∼. As a consequence, concrete nodes might have to be merged as well—
but they might have incoming edges labelled with datatype properties that are
involved in key constraints. Therefore, DC needs to be updated, and the process
of running the DTR plus merging has to be iterated. Since, in each iterative step,
the “new” relation ∼ is a superset of the “old” one, this procedure terminates.
This is explained in more detail in [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]. In particular, since it suffices to consider
∼ restricted to those abstract nodes that are labelled by named individuals, the
number of iterations is at most quadratic in the size of the ontology.
          </p>
          <p>Furthermore, the size of DC is polynomial in the size of the ontology: for
each abstract node x labelled with a named individual, x will only contribute as
many constraints to DC as it has outgoing edges. The number of the latter is
bounded by the sum of the numbers that occur in any number restriction in x’s
labels. Hence x contributes at most m · n constraints, where m is the maximal
number that occurs in number restrictions in x’s labels, and n is the count of
these restrictions. Since m, n, and the number of named individuals are bounded
by the size of the ontology, we obtain a cubic upper bound for the size of DC.
4</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Key Design Alternatives</title>
      <p>Why are these keys easy? The reason is the usage of HU in the above definition
(6) of the semantics: key constraints only act on abstract individuals explicitly
present in our ontology, but not on those abstract individuals whose existence
is only implied. And key constraints “act” by inferring new equalities.</p>
      <p>In (6), the predicate HU is applied to all abstract variables, but not to
concrete ones. This is so because, for the former, it is (a) necessary for decidability
(DL-safety), and (b) it is easy to determine for which elements HU holds—but it
would be more problematic for datatype values and is not needed for decidability.</p>
      <p>
        For this subtlety in the usage of HU, EasyKeys cannot be captured by
DL-safe rules, according to the definition of the latter in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Furthermore, the
standard-resolution based proof in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] that OWL with rules is decidable would
not go through with our relaxed version of DL-safety. (But it might work for
a less standard way of resolution.) However, our relaxation of DL-safety does
not significantly impair standard tableau algorithms and hence “preserves” this
particular method of showing decidability. Section 3.3.
4.1
      </p>
      <sec id="sec-4-1">
        <title>Why Not Easiest Keys?</title>
        <p>The easiest keys would be according to the interpretation (R2) from Section
1. This special case of DL-safe rules would mean to handle the abstract and
concrete variables in precisely the same way. One could then preprocess the
ontology, looking for explicit concrete values, and treat them as the only
substitution candidates for the variables. This case would even simplify the operation
of the datatype reasoner DTR (see Section 3.3): for each generated or named
abstract individual, only the set DC(x) needs to be evaluated independently, while,
for EasyKeys, the datatype constraints for the named abstract individuals are
collected in the global set DC and evaluated.</p>
        <p>Unfortunately, easiest keys lead to some rather counterintuitive cases. Recall
the example ontology O from Figure 1 and the explanation why O does not
entail Mary = Sheevah under (R2). Let us now add the following assertion to O:
5 Individual: Demeter</p>
        <p>Types: hasAge value 42
This is a completely unrelated fact—it does not say anything about Demeter’s
type or hasSSN-value. But since now 42 is explicit in O, O entails Sheevah =</p>
        <p>hasKey hasSSN
2 Individual: Mary</p>
        <p>Types: Person, hasSSN integer[&gt; 41, &lt; 43]
3 Individual: Sheevah</p>
        <p>Types: Person
Mary. A similar anomaly can be observed for O0 from Figure 3. This ontology
does not entail Sheevah = Mary, but it will if we replace Axiom 2 by 2’.
2’ Individual: Mary</p>
        <p>Types: Person, hasSSN value 42
Such anomalies are the primary reason for adopting slightly less easy keys. They
cause quite some implementation challenge since large, but finite, enumerations
of key values, e.g., integer[&gt; 0, &lt; 1000], can yield nontrivial interactions.
4.2</p>
      </sec>
      <sec id="sec-4-2">
        <title>When Do Easiest Keys and EasyKeys Coincide?</title>
        <p>There is a sense in which EasyKeys’ extra expressivity covers currently
somewhat improbable corner cases. However, the consequences of not covering those
cases well seem quite negative: intuitively and wildly irrelevant assertions can
induce key-based identification. Furthermore, the underspecification of
userdefined datatypes in OWL 1 might have been a reason why this sort of case
has not emerged. OWL 2 has much more robustly specified datatype support,
so we can anticipate that ontology developers will be writing more and more
complex data constraints. It is exactly when users go beyond what is easy to
understand that helpful semantics with good reasoning support becomes essential.</p>
        <p>However, there are two cases where EasyKeys and the easiest keys coincide,
namely (1) when data values drawn from finite datatypes are not used as keys
(e.g., if the range restriction in Axiom 2 of O in Figure 1 is replaced with “[&gt; 0]”)
or (2) when all values of key properties are given explicitly. If the modeler takes
care to satisfy at least one of these properties, a much simpler “look and merge”
algorithm can be employed. Thus, EasyKeys have good pay-as-you-go behavior,
in principle, and we expect that implementations will take advantage of that.
The restriction is also fairly easy to understand and syntactically enforce.
4.3</p>
      </sec>
      <sec id="sec-4-3">
        <title>Why Not Full Inverse-Functional Datatype Properties?</title>
        <p>If you restrict yourself to single key properties (i.e., no composite keys), then our
EasyKeys yield a subset of the entailments you can get with inverse-functional
datatype properties (IFDTPs). There are two main problems with IFDTPs:
1. While decidable, they are very hard to reason with. In fact, we are not aware
of any implementations or any implementor who relishes the challenge. We
have discussed in Section 3.3 that, without IFDTPs, datatype reasoning can
be confined to a single individual (asserted or generated) at a time. That is,
you gather all the data constraints applicable to the individual x and pass
them to the datatype reasoner DTR. If the DTR finds them consistent and
x is generated, you mark x as checked and can (largely) forget about it.
With IFDTPs, you lose this locality—all (named and generated) individuals
with datatype properties may need to be checked at any time. Since, in OWL,
not all of the potential key values may be explicit, this involves maintaining
a large constraint system DC that is constantly updated. Because generated
individuals have to be checked as well, termination of the algorithm will be
difficult to guarantee.
2. It is not clear that modelers would know what to do with such a powerful
feature, or even if they would want such power in most cases. Clearly, they
want keys, but it is not at all clear that they want IFDTPs.</p>
        <p>So, in essence, we would have a lot of implementation work (and research!) for a
not obvious gain. It is much easier to deliver robust, reasonably scalable support
for EasyKeys which, for most users, is a dominating consideration.</p>
        <p>What do we miss with EasyKeys? Basically, we miss the ability to entail
subsumptions through key constraints (at least, directly through key constraints; if
we have nominals, for example, we can force subsumptions in specific cases). This
is essentially the difference between full SWRL rules and DL-safe SWRL rules.
The major use case we could think of for this is aligning database schemas. If you
could reduce database schemas to OWL classes (including the key constraints as
IFDTPs), then your reasoner could show whether the two schemas were
equivalent or disjoint. If one had keys that were integers over 1000, and the other had
keys that were integers from 100 to 9999, we would get an inconsistency.</p>
        <p>To our minds, this is a very specialized application which, in fact, has never
occurred. If it does, one can always appeal to IFDTPs in OWL Full as an
extension. Indeed, one nice feature of EasyKeys is that they do not interfere with
general IFDTP support: on the one hand, they can be seen as a special case, on
the other, they can live side by side in an ontology just as DL-safe rules may
live side by side with corresponding transitivity axioms.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>In this paper we have presented a design for helpful, yet principled, key support
for OWL like languages. Our basic proposal has been accepted by the OWL
WG as part of the forthcoming OWL 2, and we expect it shall make it into
the next public working drafts. We have also discussed, in some detail, various
considerations that went into the design of EasyKeys and the tradeoffs we made
from the implementational and modeling perspectives. We have also presented
a direct algorithm for EasyKeys plus some indications on how optimized
implementations may be built. Our investigation into EasyKeys has also led us to a
new account of data property atom support in DL-safe rules.</p>
      <p>For future work, we plan to finish our running implementation of EasyKeys
and to study, more precisely, the pay-as-you-go behavior, particularly as one
scales up the data. We also plan to study their use in applications to determine
whether we in fact made the right tradeoffs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Areces</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Keys, nominals, and concrete domains</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          <volume>23</volume>
          (
          <year>2005</year>
          )
          <fpage>667</fpage>
          -
          <lpage>726</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Nguyen</surname>
          </string-name>
          , T.D.T.,
          <string-name>
            <surname>Thanh</surname>
            ,
            <given-names>N.L.</given-names>
          </string-name>
          :
          <article-title>Integrating identification constraints in web ontology</article-title>
          . In Cardoso, J.,
          <string-name>
            <surname>Cordeiro</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Filipe</surname>
          </string-name>
          , J., eds.:
          <source>ICEIS (1)</source>
          . (
          <year>2007</year>
          )
          <fpage>338</fpage>
          -
          <lpage>343</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Penwill</surname>
          </string-name>
          , C.D.:
          <article-title>A tableaux decision procedure for SHIQ(Dn) and SHOIQ(Dn)</article-title>
          .
          <source>Master's thesis</source>
          , School of Computer Science, University of Manchester (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Studer</surname>
          </string-name>
          , R.:
          <article-title>Query answering for OWL-DL with rules</article-title>
          .
          <source>J. of Web Semantics</source>
          <volume>3</volume>
          (
          <issue>1</issue>
          ) (
          <year>2005</year>
          )
          <fpage>41</fpage>
          -
          <lpage>60</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Borgida</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weddell</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>Adding uniqueness constraints to description logics (preliminary report)</article-title>
          .
          <source>In: Proc. of DOOD-97. Volume 1341 of LNCS</source>
          . (
          <year>1997</year>
          )
          <fpage>85</fpage>
          -
          <lpage>102</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Keys for free in description logics</article-title>
          .
          <source>In: Proc. of DL</source>
          <year>2000</year>
          , CEUR (http://ceur-ws.
          <source>org/)</source>
          (
          <year>2000</year>
          )
          <fpage>79</fpage>
          -
          <lpage>88</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Khizder</surname>
            ,
            <given-names>V.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weddell</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>On decidability and complexity of description logics with uniqueness constraints</article-title>
          .
          <source>In: Proc. of ICDT-01</source>
          . Volume 1973 of LNCS. (
          <year>2001</year>
          )
          <fpage>54</fpage>
          -
          <lpage>67</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weddell</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>On attributes, roles, and dependencies in description logics and the Ackermann case of the decision problem</article-title>
          .
          <source>In: Proc. of DL</source>
          <year>2001</year>
          , CEUR (http://ceur-ws.
          <source>org/)</source>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weddell</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>On the interaction between inverse features and pathfunctional dependencies in description logics</article-title>
          .
          <source>In: Proc. of IJCAI-05</source>
          . (
          <year>2005</year>
          )
          <fpage>603</fpage>
          -
          <lpage>608</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weddell</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>On path-functional dependencies as first-class citizens in description logics</article-title>
          .
          <source>In: Proc. of DL</source>
          <year>2005</year>
          , CEUR (http://ceur-ws.
          <source>org/)</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milicic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Description logics with concrete domains and functional dependencies</article-title>
          .
          <source>In: Proc. of ECAI</source>
          <year>2004</year>
          .
          <article-title>(</article-title>
          <year>2004</year>
          )
          <fpage>378</fpage>
          -
          <lpage>382</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hollunder</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Embedding defaults into terminological knowledge representation formalisms</article-title>
          .
          <source>In: Proc. of KR-92</source>
          , Morgan Kaufmann (
          <year>1992</year>
          )
          <fpage>306</fpage>
          -
          <lpage>317</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Donini</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
          </string-name>
          , R.:
          <article-title>Description logics of minimal knowledge and negation as failure</article-title>
          .
          <source>ACM Trans. Comput. Log</source>
          .
          <volume>3</volume>
          (
          <issue>2</issue>
          ) (
          <year>2002</year>
          )
          <fpage>177</fpage>
          -
          <lpage>225</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Bridging the gap between OWL and relational databases</article-title>
          .
          <source>In: Proc. of WWW-07</source>
          . (
          <year>2007</year>
          )
          <fpage>807</fpage>
          -
          <lpage>816</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.:</given-names>
          </string-name>
          <article-title>A faithful integration of description logics with logic programming</article-title>
          .
          <source>In: Proc. of IJCAI-07</source>
          . (
          <year>2007</year>
          )
          <fpage>477</fpage>
          -
          <lpage>482</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <string-name>
            <surname>From</surname>
            <given-names>SHIQ</given-names>
          </string-name>
          and
          <article-title>RDF to OWL: The making of a web ontology language</article-title>
          .
          <source>J. of Web Semantics</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ) (
          <year>2003</year>
          )
          <fpage>7</fpage>
          -
          <lpage>26</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kutz</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>The even more irresistible SROIQ</article-title>
          .
          <source>In: Proc. of KR-06</source>
          . (
          <year>2006</year>
          )
          <fpage>57</fpage>
          -
          <lpage>67</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanschke</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A schema for integrating concrete domains into concept languages</article-title>
          .
          <source>In: Proc. of IJCAI-91</source>
          ,
          <string-name>
            <surname>Sydney</surname>
          </string-name>
          (
          <year>1991</year>
          )
          <fpage>452</fpage>
          -
          <lpage>457</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>A tableau decision procedure for SHOIQ</article-title>
          .
          <source>J. of Automated Reasoning</source>
          <volume>39</volume>
          (
          <year>2007</year>
          )
          <fpage>249</fpage>
          -
          <lpage>276</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>J.Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Semantic web ontology reasoning in the SHOQ(Dn) description logic</article-title>
          .
          <source>In: Proc. of DL</source>
          <year>2002</year>
          .
          <article-title>CEUR (http://ceur-ws</article-title>
          .
          <source>org/)</source>
          (
          <year>2002</year>
          )
          <fpage>53</fpage>
          -
          <lpage>62</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tobies</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Practical reasoning for very expressive description logics</article-title>
          .
          <source>Logic Journal of the IGPL</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ) (May
          <year>2000</year>
          )
          <fpage>239</fpage>
          -
          <lpage>264</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Ontology reasoning in the SHOQ(D) description logic</article-title>
          . In Nebel, B., ed.
          <source>: Proc. of IJCAI-01</source>
          , Morgan Kaufmann (
          <year>2001</year>
          )
          <fpage>199</fpage>
          -
          <lpage>204</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>