<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Preliminary Results on the Identity Problem in Description Logic Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Franz Baader</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Borchmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Theoretical Computer Science</institution>
          ,
          <addr-line>TU Dresden</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The work in this paper is motivated by a privacy scenario in which the identity of certain persons (represented as anonymous individuals) should be hidden. We assume that factual information about known individuals (i.e., individuals whose identity is known) and anonymous individuals is stored in an ABox and general background information is expressed in a TBox, where both the TBox and the ABox are publicly accessible. The identity problem then asks whether one can deduce from the TBox and the ABox that a given anonymous individual is equal to a known one. Since this would reveal the identity of the anonymous individual, such a situation needs to be avoided. We first observe that not all Description Logics (DLs) are able to derive any such equalities between individuals, and thus the identity problem is trivial in these DLs. We then consider DLs with nominals, number restrictions, or function dependencies, in which the identity problem is non-trivial. We show that in these DLs the identity problem has the same complexity as the instance problem. Finally, we consider an extended scenario in which users with different rôles can access different parts of the TBox and ABox, and we want to check whether, by a sequence of rôle changes and queries asked in each rôle, one can deduce the identity of an anonymous individual.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>In order to illustrate the privacy scenario sketched in the abstract, assume that
you are asked to perform a survey regarding the satisfaction of employees with
the management of a company. Since the boss of the company is known not
to respond well to criticism, the employees insist that you perform the survey
such that the identity of persons voicing criticism cannot be deduced by the
boss. Thus, you let the employees use a pseudonym when answering the survey.
However, the survey does ask some personal data from the participants, and
you are concerned that the boss can use the provided answers, in combination
with the employee database and general knowledge about how things work in the
company, to deduce that a certain pseudonym corresponds to a specific employee.
For example, assume that in the survey the anonymous individual x states that
she is female and has expertise in logic and privacy. The boss knows that all
? Funded by DFG within the Research Training Group 1907 “RoSI”.
employees with expertise logic belong to the formal verification task force and
all employees with expertise privacy belong to the security task force. In addition,
the employee database contains the information that the members of the first
task force are John, Linda, Paul, Pattie and of the second Jim, John, Linda,
Pamela. Since Linda is the only female employee belonging to both task forces,
the boss can deduce that Linda hides behind the pseudonym x. The question is
now whether you can use an automated system to check whether such a breach
of privacy can occur in your survey.</p>
      <p>The purpose of this paper is to show that ontology reasoners can in principle
be used for this purpose. We assume that both the information provided in the
survey and the employee database are represented in a DL ABox A, where the
employees from the database are represented as known individuals in A and
the pseudonyms used in the survey are represented as anonymous individuals
in A. Background information (such as disjointness of the concepts Male and
Female, or the connection between expertise and task forces) are represented in
a DL TBox T . In order to detect a breach of privacy, we then need to check
whether the ontology O consisting of T and A implies an identity between
some anonymous individual x and a known individual a. We call the underlying
reasoning task the identity problem for O, x, and a.</p>
      <p>
        In Section 2 we formally introduce the identity problem and show that, for
a large class of DLs, this problem is trivial in the sense that no identities
between distinct individuals can be deduced from consistent ontologies formulated
in these DLs. Not surprisingly, this class consists of the DLs that are fragments
of first-order logic without equality. In Section 3, we introduce three DLs for
which the identity problem is non-trivial, i.e., the DL ALCO [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], where
nominals allow us to derive identities; ALCQ [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], where number restrictions allow
us to derive identities; and CF Dnc [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], where functional dependencies allow
us to derive identities. In Section 4 we show that the identity problem can be
reduced in polynomial time to the instance problem, and that for the three
DLs mentioned above this actually yields an optimal procedure w.r.t. worst-case
complexity. Section 5 considers the identity problem in the context of rôle-based
access control [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to ontologies. Basically, we assume that a user rôle r^ is
associated with access to a subset Or^ of the ontology.1 While having rôle r^, the user
can access Or^ through queries, and can then store the result in a view Vr^. In a
setting where rôles can dynamically change, the user may have collected (and
stored) a sequence of views for different rôles. The question is then whether it
is possible to derive the identity of an anonymous individual with a known one
using these views. We will show that answering this question can be reduced to
the identity problem investigated in the previous sections.
      </p>
      <p>
        Similar privacy scenarios have been considered for databases [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], but also in
the context of ontology-based data access [
        <xref ref-type="bibr" rid="ref13 ref4 ref5">5,13,4</xref>
        ]. In particular, [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] introduces
a setting with sub-ontologies and views that is similar to what we consider in
Section 5. However, the main difference between these works and ours is that we
1 To distinguish user rôles from DL roles, we write them with “ô” and also denote
specific such rôles with letters with a hat.
concentrate on hiding the identity of an anonymous individual with a known one.
In contrast, the other works are trying to hide properties of known individuals,
i.e., the membership of an individual (or a tuple of individuals) in the answers
to certain queries.
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>The Identity Problem</title>
      <p>
        We assume that the reader is familiar with the basic notions of Description
Logics, as e.g. introduced in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We denote the set of concept names by NC ,
the set of role names by NR, and the set of individual names by NI . TBoxes
and ABoxes as well as their models are assumed to be defined in the standard
way. Note, however, that we do not make the unique name assumption (UNA)
for individual names. An ontology is of the form O = (T ; A) where T is a
TBox and A is an ABox. The identity problem asks whether two individuals
are equal w.r.t. a given ontology. Since anything (also identities) follows from
an inconsistent ontology, we consider this problem only for the case where the
ontology is consistent.
      </p>
      <p>Definition 1. Let a; b 2 NI be distinct individual names and O a consistent
:
ontology. Then a is equal to b w.r.t. O (denoted by O j= a = b) iff aI = bI for
:
all models I of O. The identity problem for O; a; b asks whether O j= a = b.</p>
      <p>Not all DLs are able to derive equality of individuals. We call those that can
DLs with equality power.</p>
      <p>Definition 2. L is a description logic without equality power if there is no
consistent ontology O formulated in L and two distinct individual names a; b 2
:
NI such that O j= a = b. Otherwise we say that L has equality power.</p>
      <p>
        It is well-known (see, e.g., Chapter 6 in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) that many DLs can be
translated into first-order predicate logic (FOL). Basically, concept names and role
names are translated into unary and binary predicates, respectively, and complex
concept descriptions are translated into FOL formulas with one free variable.
Individual names are translated into constant symbols and TBoxes and ABoxes
into closed formulas. For the translation of some DLs, FOL without equality is
sufficient whereas for others equality is needed.
      </p>
      <p>Theorem 1. If the DL L can be translated into FOL without equality, then it
is a DL without equality power.</p>
      <p>Proof. Let O = (T ; A) be a consistent ontology o=f La =a:ndb. aA; cbco2rdNinIg bteo towuor
distinct individual names. We must show that O 6j
assumption on L, there is an FOL formula not containing the equality symbol
that is equivalent to O. Consequently, it is sufficient to show that 6j= a = b
according to the semantics of FOL, where the equality symbol = is interpreted
as equality. Since O is consistent, the formula is satisfiable.</p>
      <p>Using well-known approaches and results regarding FOL, we can transform
into a formula 0 in Skolem form containing additional function symbols such
that (i) is satisfiable iff 0 is satisfiable, and (ii) any model of 0 is a model of .
Thus, 0 is satisfiable and since it is in Skolem form it has a Herbrand model IH .
Since 0 does not contain equality, distinct terms (and thus in particular distinct
constants) are interpreted by distinct elements in IH . Finally, we know that IH
is also a model of , which shows that there is a model of in which a and b are
not interpreted by the same domain element. This proves 6j= a = b. tu</p>
      <p>As a consequence of this theorem, we conclude that the basic DL ALC and
its fragments, but also more expressive DLs such as SRI, do not have equality
power, and thus the identity problem is trivial for these DLs.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Three DLs with Equality Power</title>
      <p>
        In this section, we introduce three DLs that are able to derive equalities between
individuals, and for which thus the identity problem is non-trivial. The first two
DLs are ALCO, which extends ALC by nominals, and ALCQ, which extends
ALC by qualified number restrictions. Since these DLs are standard, we refer the
reader to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for the definition of their syntax and semantics. The third DL, called
CF Dnc [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], derives its equality power from so-called functional dependencies.
We leave it to the reader to verify that expressing these logics in FOL really
requires the equality symbol.
      </p>
      <p>
        Since the DL CF Dnc is less standard and not introduced in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we describe it
in more detail. Instead of roles, this logic uses attributes, which are interpreted
as total functions. We use the symbol NA to denote the set of all attributes,
replacing the set NR. Concept descriptions C; D of CF Dnc are defined using the
following syntax rules:
      </p>
      <p>C; D ::= A j :A j C u D j 8Pf:C j A : Pf1; : : : ; Pfk ! Pf;
where A 2 NC , k 1, and the path functions Pf; Pfi are words in NA with the
convention that the empty word is denoted by id. A concept descriptions of the
form A : Pf1; : : : ; Pfk ! Pf is called a path functional dependency (PFD). In
CF Dnc there is an additional restriction on PFDs to ensure that reasoning in
this logic is polynomial: for any PFD of the form above there is an i; 1 i k
such that
1. Pf is a prefix of Pfi, or
2. Pf = Pf0f for f 2 NA and Pf0 is a prefix of Pfi.</p>
      <p>Note that PFDs whose right-hand side Pf has length 1 trivially satisfy this
restriction.</p>
      <p>The interpretation of attributes as total functions is extended to path
functions by using composition of functions and interpreting id as the identity
function. The semantics of atomic negation (:A) and conjunction (C u D) is defined
in the usual way. For the constructors involving path functions, it is defined as
follows:
(8Pf:C)I := fd 2</p>
      <p>I j PfI (d) 2 CI g;
(A : Pf1; : : : ; Pfk ! Pf)I :=
0
1 i k
1</p>
      <p>I j 8e 2 AI : @ ^ PfiI (d) = PfiI (e)A ) PfI (d) = PfI (e)g
A TBox T in CF Dnc consists of a finite set of inclusion dependencies A v C, and
an ABox A consists of a finite set of concept assertions A(a) and path function
assertions Pf1(a) = Pf2(b), where A 2 NC , C is a CF Dnc concept description,
a; b 2 NI , and Pfi 2 NA.</p>
      <p>Theorem 2. The DLs ALCO, ALCQ, and CF Dnc have equality power.
This theorem is an immediate consequence of the following three examples, which
each shows for the respective DL that it can derive equality between different
individuals.</p>
      <p>Example 1. Here we formulate the example from the introduction in the DL
ALCO. Let O = (T ; A) where</p>
      <p>T := f9expertise:fLOGICg v VerTF; 9expertise:fPRIVACYg v SecTF;
VerTF fJOHN; LINDA; PAUL; PATTIEg;</p>
      <p>SecTF fJIM; JOHN; LINDA; PAMELAg; Female v :Maleg;
A := fFemale(x); expertise(x; LOGIC); expertise(x; PRIVACY);</p>
      <p>Female(LINDA); Female(PATTIE); Female(PAMELA);
Male(JOHN); Male(JIM); Male(PAUL)g:</p>
      <p>:
It is easy to see that O j= x = LINDA since x’s expertise implies that she belongs
to both the verification and the security task force, but the only female employee
belonging to both is Linda.</p>
      <p>For the sake of brevity, we use abstract examples to show that ALCQ and
CF Dnc have equality power. It would, however, be easy to provide intuitive
examples also for these two DLs.</p>
      <p>Example 2. Consider the ALCQ ontology O = (T ; A) where</p>
      <p>T := fA v 6 1 r:Bg and A := fA(a); r(a; b); r(a; x); B(a); B(x)g:
:
Obviously, we have O j= x = b.</p>
      <p>Example 3. Consider the CF Dnc ontology O = (T ; A) where</p>
      <p>T := fA v A : f ! idg and A := fA(a); f (a) = b; A(x); f (x) = bg:
Since both x and a belong to A and have the same value b for the path function
f , the path functional dependency in T implies that they must be equal, i.e., we
:
have O j= x = a.</p>
    </sec>
    <sec id="sec-4">
      <title>The Complexity of the Identity Problem</title>
      <p>In this section, we first show that the identity problem can be polynomially
reduced to the instance problem for all DLs with equality power. Note that
the instance problem is one of the basic inference problems for DLs, and thus
instance checking facilities are available in most DL reasoners.</p>
      <p>Lemma 1. Let L be a DL with equality power, O = (T ; A) an L ontology and
a; b two distinct individual names. If B is a concept name not occurring in O,
then we have</p>
      <p>:</p>
      <p>
        O j= a = b iff (T ; A [ fB(a)g) j= B(b):
Proof. The direction from left to right is trivial. We show the other direction
by contraposition. Thus, assume that O 6j= a =: b. Let I be a model of O such
that aI 6= bI . Let I0 be the interpretation that coincides with I on all role
names, individual names, and concept names different from B. For B we define
BI0 := faI g. Since B does not occur in O, the interpretation I0 is still a model
of T and A, and it satisfies B(a) by our definition of BI0 . However, it does not
satisfy B(b) since bI0 = bI 6= aI does not belong to BI0 . tu
This lemma shows that the identity problem is at most as complex as the instance
problem for all DLs with equality power that allow instance assertions for concept
names in the ABox. Since the instance problem is polynomial for CF Dnc [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ],
this implies that also the identity problem is polynomial for this DL. In [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] it is
mentioned that P-hardness of the consistency problem for CF Dnc ontologies is
an easy consequence of P-hardness of satisfiability of propositional Horn formulas
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It is easy to see that the same is true also for the identity problem.
Theorem 3. The identity problem is P-complete for CF Dnc ontologies.
      </p>
      <p>
        For ALCO and ALCQ, the instance problem is ExpTime-complete [
        <xref ref-type="bibr" rid="ref10 ref16">10,16</xref>
        ].
Thus, we obtain exponential-time upper bounds for the identity problem in these
DLs. To show that these upper bounds are optimal, we prove that there are
polynomial-time reductions of the instance problem in ALC to the identity
problem in these logics. In fact, the instance problem is already ExpTime-hard for
the common sub-logic ALC of ALCO and ALCQ [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>Lemma 2. Let L 2 fALCO; ALCQg, O be an ALC ontology, C an ALC concept
description, and a an individual name. Then we can construct in polynomial time
an L ontology O0 and individuals a0; b0 such that</p>
      <p>O j= C(a) iff O0 j= a0 =: b0:
Proof. Let O = (T ; A). We consider the two DLs separately.
1.) L = ALCO:</p>
      <p>We define O0 := (T [ fC v 8r:fb0gg; A [ fr(a; a0); r(a; b0)g), where a0; b0 are
distinct individual names and r is a role name such that a0; b0; r do not occur
in O. The direction from left to right is again trivial. The other direction is
shown by contraposition. Let I be a model of O such that aI 62 CI . We can
assume without loss of generality that the domain of I contains at least two
distinct elements d1 6= d2.2 We construct an interpretation I0 that coincides
with I on all concept, role, and individual names occurring in O, and thus
is also a model of O. In addition, I0 interprets r as rI0 := f(aI ; d1); (aI ; d2)g
and the new individual names as a0I0 := d1 and b0I0 := d2. By construction,
I0 satisfies the assertional part of O0. To see that it also satisfies the GCI
C v 8r:fb0g, note that aI = aI0 is the only element of I0 that has successors
w.r.t. the role r. Since it does not belong to CI = CI0 , the elements of
CI0 trivially satisfy the value restriction 8r:fb0g. Thus, I0 is a model of O0
in which the individuals a0; b0 are interpreted by different elements, which
:
shows O0 6j= a0 = b0.
2.) L = ALCQ:</p>
      <p>We define O0 := (T [ fC v 6 1 r:&gt;g; A [ fr(a; a0); r(a; b0)g), where a0; b0
are distinct new individuals and r is a new role name not occurring in O.
The direction from left to right is again trivial. To show the other direction,
assume that I is a model of O such that aI 62 CI . Again, we assume without
loss of generality that the domain of I contains at least two distinct elements
d1 6= d2. We construct an interpretation I0 in the same way as in case
1. above. Also, the argument why I0 is a model of O0 in which a0; b0 are
interpreted by different elements is identical to the one above.
tu</p>
      <p>As an easy consequence of Lemma 1 and Lemma 2 we obtain the exact
complexity of the identity problem in ALCO and ALCQ. In fact, Lemma 1 yields
ExpTime upper bounds. To show that Lemma 2 indeed yields ExpTime lower
bounds, we need to take into account the fact that we have defined the identity
problem with only consistent ontologies as possible input. However, it is easy
to see that ExpTime-hardness of the instance problem in ALC also holds if we
consider the instance problem only for consistent ALC ontologies O. In addition,
if O is a consistent ALC ontology, then so are the ontologies O0 constructed from
it in the proof of Lemma 1.</p>
      <p>Theorem 4. The identity problem is ExpTime-complete for ALCO and ALCQ
ontologies.</p>
      <p>For the three DLs with equality power considered in this paper, the identity
problem has the same complexity as the instance problem. A natural question to
ask is whether this is always the case. A simple example shows that the answer
to this question is negative. In fact, let ALC= be the DL ALC, with the only
difference that ALC= ABoxes may contain equality assertions a =: b between
individual names. It is easy to see that the identity problem in this DL is
nontrivial, but it can be solved in polynomial time. In fact, to check whether a
2 This follows from the fact that models of ALC ontologies are closed under disjoint
union. Note that this assumption could not be made without loss of generality for
ALCO ontologies. For example, O = (f&gt; v fagg; ;) has only models of size 1.
:
consistent ALC= ontology implies an equality a = b, we only need to construct
the reflexive, transitive, and symmetric closure of the explicitly stated equalities.
However, since ALC is a sub-logic of ALC=, the instance problem in this DL is
ExpTime-hard (and it is easy to show that it is also in ExpTime).</p>
      <p>
        One may also wonder whether the complexity of the instance problem can be
transferred to the identity problem also for DLs where the instance problem has a
higher complexity than ExpTime. For example, the DL ALCOIQ, which extends
both ALCO and ALCQ and additionally allows the use of inverse roles, has a
coNExpTime-complete instance problem [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Since it contains ALCO, it has
equality power and can force models to have cardinality 1. Lemma 1 implies that
the identity problem in ALCOIQ is in coNExpTime. Regarding hardness, the
reductions employed in the proof of Lemma 2 can in principle both be used since
the constructors employed in them are available in ALCOIQ. However, Lemma 2
uses an ALC ontology O in the reduction, which yields only an ExpTime lower
bound. Simply using an ALCOIQ ontology instead does not work since the
proof depends on the fact that O has models refuting the instance relation of
cardinality at least 2. However, by looking at the coNExpTime-hardness proof
for the instance problem in ALCOIQ, it is easy to see that the following modified
instance problem is also coNExpTime-hard in ALCOIQ: is a an instance of C
in all models of O of cardinality 2. Thus, one can without loss of generality
restrict the attention to models of cardinality 2 when reducing the instance
problem for ALCOIQ to the identity problem for this logic.
      </p>
      <p>Theorem 5. The identity problem is coNExpTime-complete for ALCOIQ
ontologies.
5</p>
    </sec>
    <sec id="sec-5">
      <title>The View-based Identity Problem</title>
      <p>
        In this section, we will adapt the approach of [
        <xref ref-type="bibr" rid="ref12 ref13">13,12</xref>
        ] for view-based information
hiding such that it can formalize the rôle-based access control scenario sketched
in the introduction. We assume that ontologies are written using some DL L
with equality power.
      </p>
      <p>To define what kind of information is to be hidden, we divide the set of
individual names into the disjoint sets NAI and NKI consisting of anonymous
and known individuals, respectively. As before, we do not make the unique name
assumption for these individuals. Given an anonymous individual x 2 NAI and
an ontology O, we define the identity of x w.r.t. O as
:
idn(x; O) := fb 2 NKI j O j= x = bg:
Note that b; b0 2 idn(x; O) implies that O j= b0 =: b. Thus, if the cardinality
of idn(x; O) is greater 1, this does not mean that x is equal to one of these
individuals, but rather that it is equal to all of them (and thus that all of them
are equal). We say that x is a hidden if idn(x; O) = ;.</p>
      <p>In the rôle-based access control scenario we assume that there is a “large”
input ontology OI that is always consistent, but users can only see a part of
it depending on which rôle they currently have. More formally, we assume that
there is a finite set of user rôles R, and that playing the rôle r^ 2 R gives access
to a subset Or^ OI of the input ontology. Here “access” does not mean that a
user with rôle r^ can download the ontology Or^. Instead, the users can ask queries
to Or^, where a subsumption query is of the form C v D for concept descriptions
C; D and a retrieval query is of the form C for concept descriptions C or r for
role names r.</p>
      <p>Definition 3. Let OI be the input ontology, Or^ OI the ontology accessible by
users with rôle r^ 2 R, and q be a query. The answer to q w.r.t. r^, denoted by
ans(q; r^), is defined as follows:
– ans(q; r^) := ftrueg, if q = C v D and Or^ j= C v D,
– ans(q; r^) := ;, if q = C v D and Or^ 6j= C v D,
– ans(q; r^) := fa 2 NI j Or^ j= C(a)g, if q = C,
– ans(q; r^) := f(a; b) 2 NI NI j Or^ j= r(a; b)g, if q = r.</p>
      <p>Since Or^ OI , positive answers to queries, i.e., ans(C v D; r^) = ftrueg,
a 2 ans(C; r^), or (a; b) 2 ans(r; r^), imply that this subsumption, instance, or
role relationship also holds in OI . In contrast, negative answers do not tell us
anything about what holds in OI since the inclusion may be strict. Answers to
queries w.r.t. rôle r^ can be stored in a view.</p>
      <p>Definition 4. A view is a total function V : dom(V ) ! 2NI [2NI NI [fftruegg
where the view definition dom(V ) is a finite set of queries and V (q) is a finite
set for all q 2 dom(V ). This view is a view for r^ 2 R (written r^ j= V ) if
V (q) = ans(q; r^) holds for all q 2 dom(V ).</p>
      <p>In a setting where user rôles can dynamically change, a user may successively
play rôles r^1; r^2; : : : ; r^k, in each rôle r^i generating (and storing) a view Vr^i for r^i
by asking queries. The question is now whether these views can be used to find
out the identity of a given anonymous individual x 2 NAI . Assume that the user
wants to know whether there is a b 2 NKI such that b 2 idn(x; OI ). However, the
user cannot access OI as a whole, all she knows is that the positive answers to
the queries in the views Vr^i are justified by subsets of OI . Consequently, instead
of one (unknown) ontology OI , the user needs to consider all possible ontologies,
i.e., all ontologies that are compatible with the positive answers in the views.
Definition 5. The ontology P is a possible ontology for the sequence of views
Vr^1 ; : : : ; Vr^k if P is consistent and compatible with all positive answers in these
views, where P is compatible with
– Vr^i (C v D) = ftrueg if P j= C v D,
– a 2 Vr^i (C) if P j= C(a), and (a; b) 2 Vr^i (r) if P j= r(a; b).</p>
      <p>We denote the set of all possible ontologies for Vr^1 ; : : : ; Vr^k with Poss(Vr^1 ; : : : ; Vr^k ).
The certain identity of x w.r.t. Vr^1 ; : : : ; Vr^k is defined as
cert _idn(x; Vr^1 ; : : : ; Vr^k ) :=</p>
      <p>\
P2Poss(Vr^1 ;:::;Vr^k )
idn(x; P):
We say that x is hidden w.r.t. Vr^1 ; : : : ; Vr^k if cert _idn(x; Vr^1 ; : : : ; Vr^k ) = ;.</p>
      <p>Since OI 2 Poss(Vr^1 ; : : : ; Vr^k ), we know that b 2 cert _idn(x; Vr^1 ; : : : ; Vr^k )
implies that b 2 idn(x; OI ). Thus, if cert _idn(x; Vr^1 ; : : : ; Vr^k ) 6= ;, the
identity of x in OI is no longer hidden. Conversely, if cert _idn(x; Vr^1 ; : : : ; =Vr^kx) ==: ;b.
,
then for all b 2 NKI there is a P 2 Poss(Vr^1 ; : : : ; Vr^k ) such that P 6j
Since, according to the information available to the user, OI could be this P,
:
she cannot conclude for any b 2 NKI that OI j= x = b. This shows that
cert _idn(x; Vr^1 ; : : : ; Vr^k ) = ; indeed corresponds to the fact that the views
Vr^1 ; : : : ; Vr^k do not disclose the identity of x.</p>
      <p>Since the set Poss(Vr^1 ; : : : ; Vr^k ) consists of infinitely many ontologies, the
definition of cert _idn(x; Vr^1 ; : : : ; Vr^k ) does not directly yield an approach for
computing this set. We will now show that we can reduce this computation to
the identity problem for the canonical ontology of Vr^1 ; : : : ; Vr^k . Basically, this
ontology consists of the GCIs, concept assertions, and role assertions obtained
from the positive answers in the views.</p>
      <p>Definition 6. The canonical ontology C(Vr^1 ; : : : ; Vr^k ) of Vr^1 ; : : : ; Vr^k is defined
as C(Vr^1 ; : : : ; Vr^k ) := (T ; A) where</p>
      <p>T := fC v D j Vr^i (C v D) = ftrueg for some i; 1
A := fC(a) j a 2 Vr^i (C) for some i; 1 i kg [
fr(a; b) j (a; b) 2 Vr^i (r) for some i; 1 i
kg:
i
kg
Since C(Vr^1 ; : : : ; Vr^k ) consists of all positive answers in the views Vr^1 ; : : : ; Vr^k , it
clearly implies them, and thus C(Vr^1 ; : : : ; Vr^k ) 2 Poss(Vr^1 ; : : : ; Vr^k ). Conversely,
every ontology P 2 Poss(Vr^1 ; : : : ; Vr^k ) implies all these positive answers, and
thus all the GCIs, concept assertions, and role assertions in C(Vr^1 ; : : : ; Vr^k ). This
implies that every consequence of C(Vr^1 ; : : : ; Vr^k ) is also a consequence of P.
Theorem 6. Given views Vr^1 ; : : : ; Vr^k and an anonymous individual x 2 NAI ,
we have cert_idn(x; Vr^1 ; : : : ; Vr^k ) = idn(x; C(Vr^1 ; : : : ; Vr^k )):
:
Proof. First assume that b 2 cert_idn(x; Vr^1 ; : : : ; Vr^k ). Then we have P j= x = b
for all P 2 Poss(Vr^1 ; : : : ; V:r^k ). Since C(Vr^1 ; : : : ; Vr^k ) 2 Poss(Vr^1 ; : : : ; Vr^k ), this
yields C(Vr^1 ; : : : ; Vr^k ) j= x = b, and thus b 2 idn(x; C(Vr^1 ; : : : ; Vr^k )).</p>
      <p>Conversely, assume b 2 idn(x; C(Vr^1 ; : : : ; Vr^k )), and thus C(Vr^1 ; : : : ; Vr^k ) j=
x =: b. We must show that, for all P 2 Poss(Vr^1 ; : : : ; Vr^k ), we have P j= x =
:
b. This is an immediate consequence of the fact that all the consequences of
C(Vr^1 ; : : : ; Vr^k ) are also consequences of P. tu</p>
      <p>This theorem shows that, to check whether x is hidden w.r.t. Vr^1 ; : : : ; Vr^k ,
it is sufficient to compute idn(x; C(Vr^1 ; : : : ; Vr^k )). In case the employed ontology
language L allows for unrestricted GCIs, concept assertions, and role assertions,
the set idn(x; C(Vr^1 ; : : : ; Vr^k )) can clearly be computed using an algorithm that
solves the identity problem for L ontologies a polynomial number of times. Note
that this applies to the DLs ALCO, ALCQ, and ALCOIQ considered in the
previous sections, but not to CF Dnc since there GCIs and concept assertions
need to satisfy certain restrictions.</p>
      <p>Corollary 1. For L 2 fALCO; ALCQg we can check in exponential time whether
an anonymous individual x is hidden w.r.t. views Vr^1 ; : : : ; Vr^k . For L = ALCOIQ,
this problem can be solved in NExpTime.</p>
      <p>The ExpTime upper bound for ALCO and ALCQ is obvious. For ALCOIQ,
one considers all the (polynomially many) known individuals a1; : : : ; ap. Using
a NExpTime procedure for the complement of the identity problem, one then
checks whether x is not identical to a1. The non-successful paths of this
nondeterministic computation stop with failure whereas the successful ones continue
with the same test for a2, etc. It is easy to see that this yields the desired
NExpTime procedure. In fact, any path of this procedure has only exponential
length, and a successful path indicates that inequality with x holds for all known
individuals.
6</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>In this paper, we have provided some initial definitions and results regarding
the identity problem in DL ontologies, i.e., the question whether the ontology
implies that a given anonymous individual is equal to a known individual. We
have also considered a more involved rôle-based access control scenario where
users can access parts of the ontology depending on their rôle. In a setting where
users can change rôles dynamically, the question is then whether, by changing
rôles and asking queries in these rôles, the user can find out the identity of an
anonymous individual although this may not be possible for a single rôle. We
have shown how to use the identity problem to address this question.</p>
      <p>
        Until now, we have only investigated how to find out whether the identity of
an anonymous individual is disclosed in a certain situation. We have not
considered what to do when this is the case. One possibility would be to additionally
anonymize the available information, e.g., by replacing some of the known
individuals in assertions by new anonymous ones, similar to what is done in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Another direction for future research could be to look at k-anonymity [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
rather than identity. In principle, our identity problem is concerned with
1anonymity, i.e., we want to avoid that one can deduce from the given
information that an anonymous individual belongs to a singleton set consisting of
only one known individual. In many applications, one also wants to ensure that
the set of known individuals to which the anonymous one is known to belong
has a large enough cardinality, i.e., one &gt; k. Of course, in this setting additional
anonymization (as mentioned above) is also relevant in cases where k-anonymity
is not given.
      </p>
      <p>
        Finally, we intend to consider cases where the information about known and
anonymous individuals holds only with a certain probability, e.g., using
ontologies with subjective probability as introduced in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this setting, equality can
also only be derived with a certain probability.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-</surname>
          </string-name>
          Schneider, editors.
          <source>The Description Logic Handbook: Theory</source>
          , Implementation, and
          <string-name>
            <surname>Applications</surname>
          </string-name>
          . Cambridge University Press, New York, NY, USA,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J.</given-names>
            <surname>Biskup</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bonatti</surname>
          </string-name>
          .
          <article-title>Controlled query evaluation for enforcing confidentiality in complete information systems</article-title>
          .
          <source>Int. J. Inf. Sec.</source>
          ,
          <volume>3</volume>
          (
          <issue>1</issue>
          ):
          <fpage>14</fpage>
          -
          <lpage>27</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>S.</given-names>
            <surname>Cook</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          .
          <source>Logical Foundations of Proof Complexity</source>
          . Cambridge University Press, 1st edition,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          .
          <article-title>Privacy in ontology-based information systems: A pending matter</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>1</volume>
          :
          <fpage>137</fpage>
          -
          <lpage>141</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Privacy-preserving query answering in logic-based information systems</article-title>
          .
          <source>In In Proc. of the 18th European Conference on Artificial Intelligence</source>
          , pages
          <fpage>40</fpage>
          -
          <lpage>44</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          .
          <article-title>Logical foundations of privacy-preserving publishing of linked data</article-title>
          .
          <source>In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence</source>
          , pages
          <fpage>943</fpage>
          -
          <lpage>949</lpage>
          . AAAI Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>V.</given-names>
            <surname>Gutiérrez-Basulto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schröder</surname>
          </string-name>
          .
          <article-title>Probabilistic description logics for subjective uncertainty</article-title>
          .
          <source>J. Artif. Intell. Res. (JAIR)</source>
          ,
          <volume>58</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>66</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>B.</given-names>
            <surname>Hollunder</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          .
          <article-title>Qualifying number restrictions in concept languages</article-title>
          .
          <source>In Proc. of the 2nd Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR'91)</source>
          , pages
          <fpage>335</fpage>
          -
          <lpage>346</lpage>
          ,
          <year>1991</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Sandhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Coyne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. L.</given-names>
            <surname>Feinstein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Youman</surname>
          </string-name>
          .
          <article-title>Role-based access control models</article-title>
          .
          <source>Computer</source>
          ,
          <volume>29</volume>
          (
          <issue>2</issue>
          ):
          <fpage>38</fpage>
          -
          <lpage>47</lpage>
          , Feb.
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Schaerf</surname>
          </string-name>
          .
          <article-title>Reasoning with individuals in concept languages</article-title>
          .
          <source>Data Knowl. Eng.</source>
          ,
          <volume>13</volume>
          (
          <issue>2</issue>
          ):
          <fpage>141</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>K.</given-names>
            <surname>Schild</surname>
          </string-name>
          .
          <article-title>A correspondence theory for terminological logics: Preliminary report</article-title>
          .
          <source>In Proc. of the 12th Int. Joint Conf. on Artificial Intelligence (IJCAI'91)</source>
          , pages
          <fpage>466</fpage>
          -
          <lpage>471</lpage>
          ,
          <year>1991</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>P.</given-names>
            <surname>Stouppa</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Studer</surname>
          </string-name>
          .
          <article-title>A formal model of data privacy. In I. Virbitskaite and A</article-title>
          . Voronkov, editors,
          <source>Proceedings of Perspectives of System Informatics</source>
          , volume
          <volume>4378</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>401</fpage>
          -
          <lpage>411</lpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>P.</given-names>
            <surname>Stouppa</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Studer</surname>
          </string-name>
          .
          <article-title>Data privacy for ALC knowledge bases</article-title>
          .
          <source>In Proc. of the International Symposium on Logical Foundations of Computer Science</source>
          , volume
          <volume>5407</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>409</fpage>
          -
          <lpage>421</lpage>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>L.</given-names>
            <surname>Sweeney</surname>
          </string-name>
          .
          <article-title>K-anonymity: A model for protecting privacy</article-title>
          .
          <source>Int. J. Uncertain. Fuzziness Knowl.-Based Syst.</source>
          ,
          <volume>10</volume>
          (
          <issue>5</issue>
          ):
          <fpage>557</fpage>
          -
          <lpage>570</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>S.</given-names>
            <surname>Tobies</surname>
          </string-name>
          .
          <article-title>The complexity of reasoning with cardinality restrictions and nominals in expressive description logics</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          ,
          <volume>12</volume>
          :
          <fpage>199</fpage>
          -
          <lpage>217</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>S.</given-names>
            <surname>Tobies</surname>
          </string-name>
          .
          <article-title>Complexity results and practical algorithms for logics in knowledge representation. CoRR, cs</article-title>
          .
          <source>LO/0106031</source>
          ,
          <year>2001</year>
          .
          <source>PhD thesis</source>
          , RWTH Aachen.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>D.</given-names>
            <surname>Toman</surname>
          </string-name>
          and
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Weddell</surname>
          </string-name>
          .
          <article-title>Conjunctive query answering in CF Dnc : A PTIME description logic with functional constraints and disjointness</article-title>
          .
          <source>In Proc. of the 26th Australasian Joint Conference on Advances in Artificial Intelligence</source>
          , volume
          <volume>8272</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>350</fpage>
          -
          <lpage>361</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>