=Paper= {{Paper |id=Vol-2954/paper-6 |storemode=property |title=Tractable Compliance Checking with Negation |pdfUrl=https://ceur-ws.org/Vol-2954/paper-6.pdf |volume=Vol-2954 |authors=Piero A. Bonatti,Luigi Sauro |dblpUrl=https://dblp.org/rec/conf/dlog/BonattiS21 }} ==Tractable Compliance Checking with Negation== https://ceur-ws.org/Vol-2954/paper-6.pdf
 Tractable Compliance Checking with Negation?

    Piero A. Bonatti1[0000−0003−1436−5660] and Luigi Sauro1[0000−0001−6056−0868]

                     Università degli Studi di Napoli “Federico II”
                            {pab,luigi.sauro}@unina.it



        Abstract. PL is a low-complexity fragment of OWL2, expressly de-
        signed to automatically check the compliance of privacy policies with
        respect to the consent of data subjects and the regulations on personal
        data processing, such as the GDPR. Privacy policies, consent and regu-
        lations can all be expressed uniformly with PL classes, and compliance
        checking is reduced to subsumption checking. The latter is tractable and
        highly scalable to meet the performance requirements of some applica-
        tions. Several use cases need negation in the queries; however, subsump-
        tion checking becomes coNP-hard, even if negation is restricted to atomic
        concepts. In this paper we show that an alternative encoding based on
        histories of positive and negative policies (where negation is applied to
        compound concepts) can be processed in polynomial time by a correct
        and complete algorithm.

        Keywords: GDPR compliance checking · Tractable DL · Negation.


1     Introduction

PL is a recently introduced fragment of SROIQ (D) that is being successfully
used to encode privacy policies, the consent of data subjects, and objective parts
of the GDPR [4, 5]. It is also being used to encode licenses formulated with a
profile of ODRL1 and with JSON; well-defined translations from these formats
to PL provide the formal semantics of licenses. The main goal of PL is enabling
automated compliance checks for data economies. Many companies are interested
in such tools, that help in preventing sanctions and reputation loss.
    PL supports simple ontologies that define privacy-related and licensing-
related vocabularies.2 Policies, consent, regulations, and licenses are encoded
as classes; compliance checking is reduced to subsumption checking over such
classes.
    PL balances expressiveness with usability and scalability. In some use cases,
it should be possible to execute a few thousands compliance checks per second;
?
  This work is funded by the European Union with grant n. 883464. Copyright c 2021
  for this paper by its authors. Use permitted under Creative Commons License At-
  tribution 4.0 International (CC BY 4.0).
1
  https://w3c.github.io/market-data-odrl-profile/md-odrl-profile.html
2
  Currently, the reference vocabularies for privacy are those defined by the DPVCG
  group of the W3C, see https://www.w3.org/community/dpvcg/.
2         P. A. Bonatti and L. Sauro

experiments show that a specialized engine implemented in Java can check the
compliance of typical privacy policies with respect to the available consent in
about 400-500 µ-seconds [5]. Concerning usability, SPECIAL’s use case partners
have successfully verified that their employees can write correct privacy policies.
   The use cases of the H2020 project SPECIAL3 – that developed the first
version of PL – and its ongoing follow-up TRAPEZE4 call for extensions of PL.
Here we focus on negation, that is also needed in the licensing domain. Negation
has three main applications in these contexts:
    – expressing exceptions to a general policy/license; the resulting policies are
      more compact and readable;
    – supporting the right of the data subjects to withdraw or restrict their consent
      to data processing;
    – supporting dynamic consent [8].
In dynamic consent, users are asked for consent immediately before data are
collected. Users may reply within a specified timeout, after which consent is
implicitly assumed. Users may deny consent either within the timeout or later,
by revising such implicit consent statements. In all cases, it is useful to formalize
statements such as “location can be tracked in Naples, but not in Via Roma”.
    This paper investigates how to introduce SROIQ’s complement operator ‘¬’
in PL without affecting its tractability and scalability. The main contributions
are the following:
 1. we prove that the simplest way of introducing negation in PL makes rea-
    soning coNP-hard;
 2. then we introduce a notion of history that addresses the above use-case
    requirements and can be processed in polynomial time.
The paper is organized as follows: In the next section, we recall the basic defi-
nitions and properties of PL, illustrate it with some examples, and list related
complexity results. Section 3 proves that PL with atomic negation in the queries
is intractable. Section 4 introduces histories and proves that compliance checking
based on histories is tractable. Section 5 reports our conclusions.


2      Preliminaries and Related Work
We assume the reader to be familiar with the syntax and the model-theoretic
semantics of description logics, and refer to [3] for all definitions. As usual, if I
is an interpretation, then we denote its domain and its interpretation function
with ∆I and ·I , respectively.
    The axioms supported by PL knowledge bases are those illustrated in Table 1.
PL queries are inclusions C v D where C and D are full PL concepts, that is,
unions of the form:
                                  C1 t . . . t Cn
3
    https://specialprivacy.ercim.eu/
4
    https://trapeze-project.eu/
                                    Tractable Compliance Checking with Negation    3

                      PL axiom α I |= α iff:
                         AvB         AI ⊆ B I
                       disj(A, B)    AI ∩ B I = ∅
                      range(R, A) (x, y) ∈ RI → y ∈ AI
                         func(R)     (x, y) ∈ RI ∧ (x, z) ∈ RI → y = z

Table 1. Axioms for PL knowledge bases. Here A and B range over concept names
while R is a role name.



where each Ci is a simple PL concept. Such concepts are defined by the following
grammar:
                       C ::= A | ∃R.C | ∃F.[l, u] | C u C
where A is a concept name, R is a role name, F is an integer-valued concrete
feature (i.e. a data property, in OWL2 terminology), and l, u are integers. The
meaning of ∃F.[l, u] is:

              (∃F.[l, u])I = d ∈ ∆I | ∃i ∈ Z.(d, i) ∈ F I ∧ l ≤ i ≤ u
                            


(recall that for all integer-valued concrete features F and all interpretations I,
F I ⊆ ∆I × Z).


2.1   Examples
A company – call it BeFit – sells a wearable fitness appliance and wants to:
(i) process biometric data (stored in the EU) for sending health-related advice
to its customers, and (ii) share the customer’s location data with their friends.
Location data are kept for a minimum of one year but no longer than 5; biometric
data are kept for an unspecified amount of time. In order to carry out these
two data processing activities legally, BeFit needs consent from its European
customers. Formally, the encoding of BeFit’s privacy policy P in PL looks as
follows:
         (∃has purpose.FitnessRecommendation u ∃has data.BiometricData u
             ∃has processing.Analytics u ∃has recipient.BeFit u
             ∃has storage.∃has location.EU)
         t                                                                        (1)
         (∃has purpose.SocialNetworking u ∃has data.LocationData u
             ∃has processing.Transfer u ∃has recipient.DataSubjFriends u
             ∃has storage.∃has duration.[y1 , y5 ]) .

Note that P is the union of two sub-concepts that formalize operations (i) and
(ii). Here y1 and y5 stand for the integer representation of one year and five
years, respectively.
4        P. A. Bonatti and L. Sauro

   A consent policy C, that specifies the class of operations that a data subject
permits, is encoded in a similar fashion. Then, P complies with C iff K |= P v C,
where K is the knowledge base that defines the terms used in the policies. For
example, the consent policy of a user that opts out (ii), and opts in (i) only for
anonimized data, can be encoded as follows:

          (∃has purpose.FitnessRecommendation u ∃has data.BiometricData u
            ∃has processing.Analytics u ∃has recipient.BeFit u                   (2)
            ∃has storage.∃has location.EU) .

Clearly, BeFit’s privacy policy P complies with this consent policy. As a second
example, suppose that the data subject allows analytics on anonimized biometric
data. This policy can be formalized as follows:

                       (∃has processing.Analytics u
                                                                                 (3)
                       ∃has data.(BiometricData u Anonymized)) .

In this case, P does not comply with the consent policy.5
    The knowledge base K typically contains both domain-specific axioms and
domain-independent axioms, that represent general properties of the business
policies. The latter axioms specify, for instance, that the primary roles has purpose,
has data, has processing, and has storage are functional, and range over suitable,
pairwise-disjoint concepts:

                            range(has purpose, AnyPurpose)
                            range(has data, AnyData)
                            disj(AnyPurpose, AnyData)
                             ...

Domain-specific axioms can be used to instantiate the compliance framework in
a particular application domain. For example, in our scenario, domain-specific
data categories and processing can be introduced with inclusions like:

                               HeartRate v BiometricData ,
                             ComputeAvg v Analytics .

These two axioms, together with the above consent, allow BeFit to compute
the average heart rate of the data subject in order to send her fitness recom-
mendations. Other typical axioms specify which classes are mutually disjoint, as
in:
                           disj(BiometricData, LocationData) .
   Note that negation is not supported in this language, so there is no explicit
way of denying consent, nor is it possible to introduce exceptions to a general
consent. Two alternative ways of introducing negation in PL’s representation of
policies will be discussed in sections 3 and 4.
5
    We are assuming that K does not entail BiometricData v Anonymized.
                                Tractable Compliance Checking with Negation              5

2.2    Interval Safety and Convexity
In this section we report some definitions and results from [4] that will be useful
in Section 4.
Definition 1. A PL query C v D is interval safe if, for all constraints ∃f.[l, u]
occurring in C and all ∃f.[l0 , u0 ] occurring in D, either [l, u] ⊆ [l0 , u0 ] or [l, u] ∩
[l0 , u0 ] = ∅ .
Every PL query C v D is equivalent to an interval safe inclusion C ∗ v D, where
|= C ∗ ≡ C [4, Prop. 5]. Informally speaking, when subsumptions are interval
safe, then they behave as in a convex logic. Formally, interval safe queries enjoy
the following property, that follows from lemmas 1 and 2 of [4], combined with
Propositions 1 of the same paper:
Lemma 1. Let K be a PL knowledge base and let C be a simple PL concept
such that K 6|= C v ⊥. There exist an interpretation I and an individual d ∈ ∆I
such that:
    a) I |= K ;
    b) d ∈ C I ;
    c) for all interval safe PL queries C v D, d ∈ DI iff K |= C v D .
We will call such pairs (I, d) canonical models of K and C. The above lemma
implies the following convexity property:
Proposition 1. for all PL knowledge bases K, and all interval safe PL queries
q = (C v D1 t . . . t Dn ) where C is simple, K |= q iff there exists i such that
K |= C v Di (1 ≤ i ≤ n).

2.3    Complexity Results and Related Work
Deciding whether a full PL concept is inconsistent with respect to a PL knowl-
edge base is in P. Deciding whether a PL query is a logical consequence of a PL
knowledge base is in general coNP-complete. It is in P if, for all simple PL con-
cepts Ci occurring on the left-hand side of the subsumption query, the number
of concrete features occurring in Ci is bounded by a constant c. In SPECIAL
and TRAPEZE’s privacy policies, c = 1. With this bound, a sequential Java
implementation can reach over 2000 compliance checks per second, thereby ad-
dressing the scalability requirements posed by the use case partners of SPECIAL
and TRAPEZE [5].
    The above complexity results – that have been proved in [4] – have been
extended in [5] to knowledge bases K ∪ O where K is a PL knowledge base and
O is a Horn-SRIQ knowledge base, under the following assumptions:
 1. O belongs to a tractable fragment of Horn-SRIQ;
 2. none of the roles occurring in O occurs in either K or the given PL query.6
6
    With suitable syntactic sugar, the roles defined in O may be used in the queries,
    provided that the roles defined in K do not occur within the scope of those defined
    in O [5, Remark 5.1].
6        P. A. Bonatti and L. Sauro

    Note that PL knowledge bases are in OWL2-RL, a Horn fragment of OWL2;
they are also in the extensions of EL and DL-lite H        horn with functional roles.
Answering PL subsumption queries is equivalent to solving query containment
problems with respect to PL knowledge bases, where the queries are the trans-
lation of full PL concepts into formulae of first-order logic. Such formulae are
instances of the class of queries investigated in [9], called extended faceted queries.
More precisely, the queries derived from full PL concepts are unions of conjunc-
tive queries with comparison operators (in our case only ‘≤’, that is needed to
express interval membership). In the following sections, such queries will be ex-
tended with negation.7 The comparison operator makes the usual reduction of
query containment to query answering impossible, so the literature on answer-
ing unions of conjunctive faceted queries is not directly applicable to our case.
For more details – and further tractability and intractability results concerning
faceted queries – see [5, Sec. 5.3].
    In the extension of EL with functional roles, subsumption checking is com-
plete for EXPTIME, in general [2]. A tractability result for empty TBoxes is
reported in [6, Fig. 4]; however, in the same paper, it is proved that even with
acyclic TBoxes, subsumption is coNP-complete. The tractability of an extension
of EL with non-convex concrete domains (like intervals) has been proved in [6],
under the assumption that the TBox is a set of definitions of the form A ≡ C,
where each A is a concept name and appears in the left-hand side of at most
one definition.
    If DL-liteH
              horn is extended with functional roles, then the data complexity of
query answering raises at the first level of the polynomial hierarchy and knowl-
edge base satisfiability becomes EXPTIME-complete [1].
    Finally, a result relevant to the extension of queries with negation discussed in
the following sections. Rosati [7] proved that answering conjunctive queries with
safe negation in DL-lite is coNP-hard. The proof makes use of a knowledge base
that contains negation and inverse roles. In our setting, queries are intractable
even when the knowledge base is empty.


3     Intractability of PL with Atomic Negation

With reference to the example illustrated in Section 2.1, assume that a Be-
Fit customer wants to restrict her consent, by precluding that locations in her
homeplace (say Rome) are shared with friends. A simple way of introducing this
restriction in (1) consists in replacing the expression ∃has data.LocationData with

                 ∃has data.(LocationData u ¬RomeLocationData) ,

where the term RomeLocationData is axiomatized in the knowledge base K with
the inclusion:
                    RomeLocationData v LocationData .
7
    We will actually work on the subsumption problems that correspond to containment
    problems involving unions of conjunctive queries with ‘≤’ and negation.
                               Tractable Compliance Checking with Negation           7

This encoding extends PL queries with atomic negation (that is, negation ap-
plied to concept names). Our use cases do not need negation in PL knowledge
bases.
    Subsumption checking over full PL concepts with atomic negation can be
reformulated as query containment over unions of conjunctive queries with safe
negation. Containment of conjunctive queries with negation is Π2p -complete [10],
so the intractability of PL with atomic negation in the queries does not come
as a surprise.
    Here we show that subsumption checking remains intractable (coNP-hard)
even if atomic negation is used as in the above example verbatim, that is, in
existential restrictions of the form
                                    ∃R.(A u ¬B)                                    (4)
where A, B are concept names. The reduction is fromWn the validity of 3-DNF
formulae. Given such a propositional formula φ = i=1 (Li,1 ∧ Li,2 ∧ Li,3 ), its
reduction is:
                  n
                  G                                                   
          ∃R.A v     ∃R.(A u Li,1 ) u ∃R.(A u Li,2 ) u ∃R.(A u Li,3 )      (5)
                    i=1

where R is any role name and A is a concept name that does not occur in any
Li,j . It is not hard to see that φ is valid iff the above inclusion is valid, and that
the reduction can be computed in polynomial time (actually, logarithmic space),
so the following theorem holds:
Theorem 1. Subsumption checking in PL with atomic negation is coNP-hard
even if the knowledge base is empty and negation is restricted to concepts of the
form (4).

4    Histories
Our use cases can also be addressed by modeling the sequence of consent and
denial statements issued by the data subject (either through the dynamic consent
system or through the dashboards for personal data control). In formal terms,
each such statement is a simple policy with a sign, that specifies whether the
operations described by the policy are permitted or forbidden. In case of conflicts,
older statements are overridden by more recent statements.
Definition 2 (History). A history is a finite sequence +C1 ±C2 . . .±Cn where
each Ci is a simple PL concept.
Histories are just a handy way of representing compound concepts with negation,
as shown by the following inductive translation tr, where H is a history:
                                 history tr(history)
                                 +C        C
                                 H+C     tr(H) t C
                                 H−C     tr(H) u ¬C
8        P. A. Bonatti and L. Sauro

Hereafter, we will identify each history H with the corresponding concept tr(H).
Example 1. Consider the example in Section 3 and let H be the history +D1 +
D2 − D3 , where D1 and D2 are the two arguments of the union in (1) and D3 is
the concept:
                       ∃has data.(RomeLocationData) .
By the distributivity of u on t, H is equivalent to (D1 u ¬D3 ) t (D2 u ¬D3 ).
Note that D1 u ¬D3 is equivalent to D1 , because D1 and D3 are disjoint.
This follows from three facts: (i) LocationData and BiometricData are disjoint,
(ii) RomeLocationData is a subclass of LocationData, and (iii) has data is func-
tional. Concerning D2 u ¬D3 , by (iii) we have that the concept

            ∃has data.(LocationData) u ¬∃has data.(RomeLocationData)

is equivalent to

                   ∃has data.(LocationData u ¬RomeLocationData) .                     (6)

Therefore, D2 u¬D3 is equivalent to the concept D20 obtained from D2 by replac-
ing ∃has data.(LocationData) with (6). Summarizing, H is equivalent to D1 tD20 ,
therefore it represents the restricted consent of the customer correctly.    t
                                                                             u
     We are going to prove that the entailments of the form

                                       K |= C v H                                     (7)

where K is a PL knowledge base, C is a full PL concept, and H is a history,
can be decided in polynomial time. For this purpose, we need a few preliminary
definitions and lemmas. First, we normalize H with the following procedure8 :
Definition 3 (Normalization norm(H)). Let H = +C1 ± C2 . . . ± Cn be a
history, K be a knowledge base, and C be a concept. The normalization of H
w.r.t. K and C, denoted by norm K,C (H), is obtained with the following three
steps:
    a) replace each −Ci in H with −(Ci u C); the result of this transformation
       will be denoted by H ↓ C;
    b) remove from H ↓ C all elements ±D such that K |= D v ⊥ ;
    c) remove from the resulting history all the elements ±D followed by some
       element ±D0 such that K |= D v D0 .
The subscripts K and C in norm K,C (H) will be omitted when clear from context.
The above normalization preserves the entailments of the form (7), thus it can
be exploited to answer subsumption queries involving histories:
8
    Note that the normalization procedure in Definition 3 has been introduced only for
    the purpose of simplifying proofs. Hence, it will be not part of the actual entailment
    procedure described in Algorithm 1.
                             Tractable Compliance Checking with Negation        9

Lemma 2. For all PL knowledge bases K, full PL concepts C, and histories
H,

                      K |= C v H iff K |= C v norm(H) .

   The proof of the above lemma is based on the next lemma:

Lemma 3. Let K be a PL knowledge base, let C be a full PL concept, and let
H = +C1 ± . . . ± Cn be a history. Then the following propositions hold:

  a) |= H u C ≡ (H ↓ C) u C ;
  b) if K |= D v ⊥ then K |= H±D ≡ H ;
  c) if H = H1 ±Ci H2 ±Cn and K |= Ci v Cn , then K |= H ≡ H1 H2 ±Cn .

Proof. (Sketch) We prove Lemma 3.a) by induction on the length of H. In the
base case, H = +D, so H u C and (H ↓ C) u C are syntactically the same.
    Let H = H 0 ±D. If D occurs positively, we have by distributivity that H uC is
equivalent to (H 0 uC)+(DuC). Moreover, by construction H ↓ C = (H 0 ↓ C)+D
and hence (H ↓ C) u C is equivalent to ((H 0 ↓ C) u C) + (D u C). Then, the
assertion follows from the fact that, by (IH), H 0 u C and (H 0 ↓ C) u C are
equivalent. An analogous proof for the negative case is left to reader.
    Lemma 3.b) is trivial, so we focus on point c). If Ci and Cn occur posi-
tively in Lemma 3.c) then H1 H2 +Cn is always subsumed by H1 +Ci H2 +Cn .
The other way round directly follows from the fact that H1 +Ci H2 is subsumed
by H1 H2 +Ci and K |= Ci v Cn . Analogously, we can prove the negative case
by inverting the directions of subsumptions. Next, assume that Ci is negative
and Cn positive. Since K |= Ci v Cn , the difference between the extensions
of H1 − Ci H2 and H1 H2 is always contained in the extension of Cn . It follows
that H1 − Ci H2 + Cn ≡ H1 H2 + Cn , so point c) holds. The case in which Ci is
positive and Cn negative is similar. There are no further cases, so the proof is
complete.                                                                       t
                                                                                u

Proof of Lemma 2. (Sketch) It suffices to show that each of the three normaliza-
tion steps preserves (7). First observe that:
   K |= C v H iff K |= C v H u C
              iff K |= C v (H ↓ C) u C (by Lemma 3.a)
              iff K |= C v H ↓ C ,
therefore step a) of Definition 3 preserves (7). Next, let H 0 be the result of
applying normalization step b) to H ↓ C. By a simple induction on the length of
H ↓ C, based on Lemma 3.b, it can be shown that K |= H ↓ C ≡ H 0 , therefore

                         K |= C v H iff K |= C v H 0 .

Similarly, it can be proved that the last step preserves (7) by a simple induction
on the length of H 0 based on Lemma 3.c. The details are left to the reader. t   u
10        P. A. Bonatti and L. Sauro

      The next two lemmas need the following definitions:

                         pos(H) = {Ci | +Ci occurs in H} ,
                         neg(H) = {Ci | −Ci occurs in H} .

Lemma 4. Let C v H be any interval safe inclusion where C is a simple PL
concept. If neg(norm K,C (H)) 6= ∅ then K 6|= C v norm K,C (H) .

Proof. Assume that neg(norm(H)) 6= ∅. Then norm(H) = H 0 −(Ci u C)H 00 ,
where K 6|= Ci u C v ⊥ (otherwise the negative element would have been elim-
inated in phase b) of the normalization). By Lemma 1, there exists a canonical
model (I, d) of K and Ci u C. Points a) and b) of Lemma 1 imply that I |= K
and
                                d ∈ (Ci u C)I                              (8)
therefore d 6∈ (H 0 −(Ci u C))I . Moreover, for all elements +Cj in H 00 , we have

                                   K 6|= Ci u C v Cj

otherwise −(Ci uC) would have been eliminated in phase c) of the normalization.
Then Lemma 1.c implies that for all +Cj in H 00 ,

                                       d 6∈ CjI .

It follows that d 6∈ norm(H)I . Moreover, d ∈ C I , by (8), so K 6|= C v norm(H).
                                                                               t
                                                                               u
                                                                       F
Proposition 2. For all histories H, if K |= C v H then K |= C v pos(H) .

Proof. ItFcan be proved, by a straightforward induction on the length of H, that
|= H v pos(H). The proposition immediately follows.                           t
                                                                              u

      We are finally ready for the main result:

Theorem 2. Let K be a PL knowledge base, C be a simple PL concept, and H
be a history such that C v H is interval safe. Then K |= C v norm(H) iff the
following two conditions hold:
     a) neg(norm(H)) = ∅, and
     b) there exists D ∈ pos(H) = pos(norm(H)) such that K |= C v D.

Proof. (⇐)F Suppose that a) and b) hold. By b), K |= C v D, and clearly
|= D v pos(H), because      F D is one of the arguments of the union. Moreover,
by a), we have that H = pos(H). It follows immediately that K |= C v H.
    (⇒) We prove the contrapositive. Accordingly, suppose that either a) or b) is
false. In the former case, by Lemma 4, K 6|= C
                                             F v norm(H) and the theorem holds.
If b) is false, then by convexity K 6|= C v pos(H), and by the contrapositive
of Proposition 2, K 6|= C v norm(H).                                           t
                                                                               u

      As a corollary of the above theorem, we obtain the desired tractability result.
                              Tractable Compliance Checking with Negation          11

Corollary 1. Let c be an arbitrary but fixed non-negative integer. For all PL
knowledge bases K and all subsumption queries q = (C1 t . . . t Cn v H) such
that
1. H is a history ,
2. each Ci is a simple PL concept (i.e. their union is a full PL concept) ,
3. for all i = 1, . . . , n, the number of subconcepts of the form ∃f.[l, u] occurring
   in Ci is bounded by c ,
the entailment K |= q can be decided in polynomial time.
Proof. Clearly, K |= q holds iff for all i = 1, . . . , n, K |= Ci v H holds. Thus, it
suffices to prove that each of these n entailments can be decided in polynomial
time. Due to the bound c on the number of intervals, each of these subsumption
can be transformed in polynomial time into an equivalent, interval safe inclusion
Ci∗ v H, as shown in [4, Proposition 5]. By Lemma 2, the resulting subsumption
tests can equivalently be replaced by
                               K |= Ci∗ v norm(H) .
Now Theorem 2 can be applied to verify the above subsumptions.
   The test a) in Theorem 2 takes time O(|norm(H)|) = O(|H|). Test b) requires
answering at most |H| PL queries, each of which takes polynomial time [4,
Proposition 6]. So the overall cost of testing the conditions of Theorem 2 is
polynomial. We are only left to show that computing norm(H) takes polynomial
time. Phase a) takes time |C| · |H|. Phase b) requires at most |H ↓ C| concept
consistency tests, each of which takes polynomial time [4, Proposition 8]. Finally,
phase c) requires answering at most |H ↓ C| PL subsumption queries, each of
which takes polynomial time. Thus, the overall computation of norm(H) takes
polynomial time.                                                                 t
                                                                                 u
Remark 1. Note that the proof of tractability relies only on the following prop-
erties of PL:
 1. the convexity of interval safe PL (Lemma 1 and Proposition 1), that is
    needed in Theorem 2;
 2. the tractability of subsumption query answering in PL with bounded occur-
    rences of intervals.
Therefore, our results apply to all tasks of the form (7) such that K, C, and the
concepts occurring in the history belong to a convex, tractable logic (say, any
Horn description logic with a tractable subsumption problem).
    In practice, it is not necessary to compute norm(H) explicitly. Consider Al-
gorithm 1, for example. Lines 2–4 check condition b) of Theorem 2. In particular,
if the computation reaches line 5, then Ck is the rightmost concept D in H sat-
isfying condition b). The “for” loop at line 5 looks for the elements −(Cj u C) of
H ↓ C that survive phase b) of the normalization. The loop in lines 7–9 checks
whether −(Cj u C) survives phase c), too. Note that −(Cj u C) is looked for
after +Ck , because all the negative elements of H ↓ C on the left of +Ck are
eliminated during normalization phase c), due to +Ck itself. Thus, Algorithm 1
returns true iff conditions a) and b) of Theorem 2 hold.
12        P. A. Bonatti and L. Sauro


    Algorithm 1: PL Subsumption with histories
     Input: K, C, H
       where C is a simplePL concept, H = ±C1 . . . ±Cn , and C v H is interval safe
     Output: true if K |= C v H, false otherwise
 1 if K |= C v ⊥ then return true
   // Condition b)
 2 k := n
                                      0
 3 while k > 0 and (sign(±Ck ) 6= ‘+ or K 6|= C v Ck ) do k := k − 1
 4 if k = 0 then return false // else Ck ∈ pos(H) and K |= C v Ck
   // Condition a)
 5 for i := k + 1 to n do
 6     if sign(±Ci ) = ‘−0 and K 6|= Ci u C v ⊥ then
 7         j := i + 1
 8         while j ≤ n and (sign(±Cj ) 6= ‘+0 or K 6|= Ci u C v Cj ) do
 9             j := j + 1
10           if j > n then return false

11   return true



5      Conclusions
Summarizing, PL queries need to be extended with negation, in order to address
the use cases related to automated compliance with the GDPR. Atomic negation
raises the complexity of subsumption checking (on which compliance checking is
based) at least to the first level of the polynomial hierarchy. This holds even if
negation is further restricted to the specific concept expressions that arise in our
reference examples. We have tackled this issue by introducing histories of positive
and negated PL concepts. Histories represent in a natural way the consent of
data subjects and can be processed in polynomial time with Algorithm 1.
    The running time of this algorithm grows quadratically with the length of the
history, which is likely to be incompatible with the scalability requirements for
PL. Thus, in a forthcoming work, we are going to introduce suitable optimiza-
tions. Points b) and c) of Lemma 3 already show how to eliminate redundant
elements from a history. Further optimizations may partition histories into a
set of disjoint, shorter histories. We are also going to explore compact policy
representations, where the common parts of history elements are factorized.

References
 1. A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev. The DL-Lite
    family and relations. J. Artif. Intell. Res., 36:1–69, 2009.
 2. F. Baader, S. Brandt, and C. Lutz. Pushing the EL envelope. In IJCAI-05, pages
    364–369. Professional Book Center, 2005.
 3. F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider,
    editors. The Description Logic Handbook: Theory, Implementation, and Applica-
    tions. Cambridge University Press, 2003.
                                 Tractable Compliance Checking with Negation             13

 4. P. A. Bonatti. Fast compliance checking in an OWL2 fragment. In J. Lang, editor,
    Proceedings of the Twenty-Seventh International Joint Conference on Artificial
    Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, pages 1746–1752.
    ijcai.org, 2018.
 5. P. A. Bonatti, L. Ioffredo, I. M. Petrova, L. Sauro, and I. S. R. Siahaan. Real-time
    reasoning in OWL2 for GDPR compliance. Artif. Intell., 289:103389, 2020.
 6. C. Haase and C. Lutz. Complexity of subsumption in the EL family of description
    logics: Acyclic and cyclic tboxes. In ECAI 2008 - 18th European Conference on
    Artificial Intelligence, Proceedings, volume 178 of Frontiers in Artificial Intelligence
    and Applications, pages 25–29. IOS Press, 2008.
 7. R. Rosati. The limits of querying ontologies. In T. Schwentick and D. Suciu,
    editors, Database Theory - ICDT 2007, 11th International Conference, Barcelona,
    Spain, January 10-12, 2007, Proceedings, volume 4353 of Lecture Notes in Com-
    puter Science, pages 164–178. Springer, 2007.
 8. E. Schlehahn and R. Wenning. Legal requirements for a privacy-enhancing big
    data V2. SPECIAL deliverable D1.6, April 2018. Available through https://
    specialprivacy.ercim.eu/.
 9. E. Sherkhonov, B. Cuenca Grau, E. Kharlamov, and E. V. Kostylev. Semantic
    faceted search with aggregation and recursion. In The Semantic Web - ISWC
    2017 - 16th International Semantic Web Conference, Vienna, Austria, October
    21-25, 2017, Proceedings, Part I, pages 594–610, 2017.
10. J. D. Ullman. Information integration using logical views. In F. N. Afrati and
    P. G. Kolaitis, editors, Database Theory - ICDT ’97, 6th International Conference,
    Delphi, Greece, January 8-10, 1997, Proceedings, volume 1186 of Lecture Notes in
    Computer Science, pages 19–40. Springer, 1997.