Patent Valuation Using Difference in ALEN

                         Naouel Karam and Adrian Paschke

      Department of Mathematics and Computer Science, Freie Universität Berlin
                  Königin-Luise-Str. 24-26, 14195 Berlin, Germany
             naouel.karam@fu-berlin.de,paschke@inf.fu-berlin.de


        Abstract. In this paper we present an approach for patent claim com-
        parison based on the difference operator in description logics. The claims
        are represented using ALEN . An algorithm computing the difference be-
        tween two ALEN concept descriptions is proposed and its usefulness for
        patent application valuation is described by means of some simple ex-
        amples.


1     Introduction

More and more, patent material is made available in electronic format, and most
of the time in textual form. Examples of such services are the European patent
Office (EPO) [1] and the United States Patent and Trademark Office (USPTO)
[3]. The need for a formal specification of the patent content arose recently to
help retrieval, examination and classification of patent applications.
    In this context one needs to evaluate the innovative degree of a patent appli-
cation. A claim is the part of a patent application where the inventor specifies
the invention attributes and its features, defining what can be protected by the
patent law. The aim is to show the non-obviousness to a person with basic do-
main skills and the novelty of the application compared to the state of the art.
    Patent claims are described in natural language. Terms used in the claims
must find clear support in the preceding description part of the patent appli-
cation such that the meaning of a term is clearly specified. Previous work had
attempted to represent patent material using OWL ontologies [7, 15].
    Statistical tools performing patent application evaluation are based on a set
of parameters provided by the user. Based on those parameters, the overall score
of a patent application is calculated.
    In this paper we present a more formal approach for patent claim comparison
based on description logics1 . Patent application claims are formally represented
as concept descriptions. When comparing two applications, the differences need
to be pinpointed in order to prove the novelty. We compute this difference by
using the difference operator and return the result to the user. The result must
be intuitive and easy to understand in order to help the user construct his
argumentation.
1
    This work has been partially supported by the Fact Screening and Transformation
    Project (FSTP) funded by the Teles Pri AG: www.fstp-expert-system.com
    The difference operator computes the part of a concept that is not contained
into another one. The difference operator has been defined in [14] as being the
most general description that can be added to the second operand to obtain
equivalence with the first. A second definition of the difference involving a syn-
tactic criteria for minimality has been introduced in [4].
    In our context we need to describe existential, universal and numerical re-
strictions. The difference operator has been investigated for description logics al-
lowing either numerical restrictions [12] or existential restrictions [4]. We propose
an algorithm for ALEN based on the structural characterization of subsumption
for ALEN [10, 11].
    The paper is organized as follows: in section 2, we give the formal definitions
of the difference operator and discuss the motivation for our choice. Section 3
recalls useful results for ALEN and introduces some notations. In section 4 we
provide an algorithm for computing the difference in ALEN . Some examples
of patent claims comparison are given in section 5 and Section 6 concludes the
paper.


2   Operator Choice

The need for providing parts of a description that are not contained in another
one has been motivated by different applications and different definitions exist
with their advantages and drawbacks.
    Informally speaking, the difference between two concept descriptions is the
information contained in the first description and not in the second. The dif-
ference operator allows to remove from a given description all the information
contained in another description. The difference operation between two concept
descriptions was first introduced by Teege [14].

Definition 1. (semantic difference) The difference between two concept descrip-
tions C and D with C v D is given by

                         C − D := max{E | E u D ≡ C}

where max is defined with respect to subsumption.

    Later, the work in [9, 4] proposed a refinement of this definition by allow-
ing the difference between incomparable descriptions (i.e. D is not required to
subsume C) and taking the syntactic minimum (w.r.t. a subdescription ordering
d ) instead of a semantic maximum.

Definition 2. (syntactic difference) The difference between two incomparable
concept descriptions C and D is defined as

                       C − D := min{E | E u D ≡ C u D}
where min is defined with respect to a so-called subdescription ordering (de-
noted by d ), used to compare syntactical structures that we recall in the next
definition.

Definition 3. (Subdescriptions) Let C be an ALEN -concept description, C
                                                                       b is
                        b d C) iff,
a subdescription of C ( C

1. Cb = ⊥, or
2. Cb is obtained by removing from the top-level of C: >, concept names, num-
   ber restrictions, value restrictions, existential restrictions. For the remaining
   restrictions ∀r.E or ∃r.E, replace the concept description E with a subde-
   scription of E.

   As an example, consider the ALEN -concept description C:

                       P u P u ∀r.P u ∃r.(P u ∃r.Q)u ≤ 3r

   A possible subdescription:

                        b = P u ∀r.P u ∃r.(∃r.Q)u ≤ 3r
                        C

    is obtained from C by removing P from the top-level of C and P from the
subexpression ∃r.(P u ∃r.Q). Note that C  b is equivalent to C and there exists no
subdescription of C b owning this property. Cb is the minimal subdescription which
is equivalent to C.
    In [12], the authors defined a semantic difference for ALN . They also compare
the semantic and syntactic variants and state that the two approaches differ
mainly on the bottom handling. Indeed, the bottom decomposition leads to more
than one result for the semantic difference. On the other hand, the syntactic
difference always generates a unique result that is a subdescription of the input
description and hence more intuitive and comprehensive for the user.
    Another operator called “concept abduction“ has been introduced to provide
an explanation for matchmaking results in an e-marketplace. Abduction was first
defined for ALN [5] and has been recently extended to a more expressive DL
SH, using tableau calculus [13].
    This operator does not require subsumption between the concept descriptions
neither and thus is more general than the semantic difference. As stated in [6],
the result of the difference is included in the set of solutions of a CAP.
    In our context, we chose to use the syntactic difference for three reasons:

1. It does not require subsumption between the concept descriptions being
   compared, which is more realistic in real world applications,
2. it generates a unique result which is a syntactic variation of the input and
   hence more intuitive for the user,
3. when the inputs are free from insatisfiable subexpressions the result is equiv-
   alent to the semantic difference.
3   Preliminaries

In this section we recall some results for subsumption in ALEN and define some
notations that will be useful to derive the difference algorithm.
    The normal form of an ALEN concept description is obtained by applying
the following transformation rules:

 – (≥ mr) u (≥ nr) → (≥ nr) if n ≥ m,
 – (≤ mr) u (≤ nr) → (≤ nr) if n ≤ m, and
 – ∀r.C u ∀r.D → ∀r.(C u D).

    To access the different component of a concept description, we will use the
following notations:

 – prim(C) denotes the set of concept names and the bottom concept occurring
   on the top-level of C,
 – infr (C) = (≤ nr) if there exists a number restriction of the form (≤ nr) on
   the top-level of C, infr (C) = > otherwise,
 – supr (C) = (≥ nr) if there exists a number restriction of the form (≥ nr) on
   the top-level of C, supr (C) = > otherwise,
 – minr (C) = max{k | C v (≥ kr)},
 – maxr (C) = min{k | C v (≤ kr)},
 – ∀r (C) = E if there exists a value restriction ∀r.E on the top-level of C;
   ∀r (C) = > otherwise,
 – ∃r (C) = {C 0 | ∃r.C 0 occurs on the top-level of C}.

For the sake of clarity, we will assume that the set NR is reduced to the role r.
The algorithm can be easily generalized to arbitrary sets of role names.
    The difference algorithm is based on the structural characterization of sub-
sumption in ALEN given in [10]. In the sequel we present the intuitions used
to compute induced concept descriptions and recall the structural subsumption
characterization. We also introduce some notations that we will use in the dif-
ference algorithm.
    The main issue when dealing with ALEN is to compute the ”non-trivial”
concept descriptions subsuming a concept description C; such concepts do not
appear in the top-level of C and are called ”induced”. Let us recall this process
by means of examples.
Number restrictions:
They can be induced if two existential restrictions involve disjoint concepts. For
example, if C := ∃r.P u ∃r.¬P then ≥ 2r is induced by C.
Let us note the induced number restrictions as follows: min∗r (C). Note that if
min∗r (C) = k, we have C v≥ kr which means that min∗r (C) is taken into account
in minr (C) defined above.
Existential restrictions:
Due to ≤-number restrictions, if the number of existential restrictions in the top-
level of a concept C is greater than the ≤-number restriction, those existential
restrictions must be merged.
   For example, if we have:

      C := ∃r.(P u A) u ∃r.(P u B) u ∃r.(¬P u A) u ∃r.Q u ∃r.¬Qu ≤ 2r,
    the merging process results in new concepts descriptions where the only con-
sistent ones are:

                ∃r.(P u Q u A u B) u ∃r.(¬P u ¬Q u A) ≤ 2r, and
                ∃r.(P u ¬Q u A u B) u ∃r.(¬P u Q u A) ≤ 2r.
In [10], each merging is represented by a mapping α leading to a concept C α and
the set of mappings is denoted by Γr (C). The mappings in Γr (C) are required
to obey the following conditions:
    α(i) 6= ∅ for all 1 ≤ i ≤ n;
 1. S
 2. d1≤i≤n α(i) = {1, ..., m} and α(i) ∩ α(j) = ∅ for all 1 ≤ i < j ≤ n;
 3. j∈α(i) Cj u ∀r (C) 6≡ ⊥ for all 1 ≤ i ≤ n,
where n := min{maxr (C), | ∃r (C) |} and m :=| ∃r (C) |.
   In our example, this set consists of two mappings α1 and α2 leading to the
two concepts C α1 and C α2 above with C ≡ C α1 t C α2 .
The set of C 0 such that ∃r.C 0 in C α for all mappings is noted as follows:
                                          [
                            ∃∗r (C) :=          ∃r (C α )
                                      α∈Γr (C)
For our example we obtain:
    ∃∗r (C) := {P u Q u A u B, ¬P u ¬Q u A, P u ¬Q u A u B, ¬P u Q u A}
    In case ∃r (C) = ∅ (there is no existential restriction on the top level of C)
one needs to take into account the ≥-restrictions together with value restrictions
to deduce non-trivial existential restrictions. Let us illustrate the case on the
following concept:
                           D := (≥ 2r) u ∀r.(A u B).
The instances of D have at least two r-successors and due to the value restriction
we can deduce that D v ∃r.(A u B).
     This concludes the existential restrictions case.
Value restrictions:
It is clear from the former example that every instance of C has exactly two
r-successors, in that case one can deduce from the existential restrictions in C α1
and C α2 that all r-successors belong to A and thus that the value restriction
∀r.A is induced by C.
     This deduction can be performed only when all r-successors are known and
it is represented by the condition:
                              maxr (C) = min∗r (C).
    The second case where a value restriction can be induced is when maxr (C) =
0 then C v ∀r.⊥.
4      Syntactic Difference between ALEN -Concept
       Descriptions

The algorithm computing the difference between two ALEN -concept descrip-
tions C and D is depicted in figure 1. We extended the algorithm proposed in
[9] to compute the difference between two ALE-concept descriptions.
    Before going into detail about the algorithm, and as it is based on the struc-
tural characterization of the subsumption [10], let us recall the conditions for
each kind of conjunct occurring on the top level of the concept descriptions.

Theorem 1. Let C, D be two ALEN -concept descriptions with
∃r (C) = {C1 , ..., Cn }. Then C v D iff C ≡ ⊥, or D ≡ >, or the following
conditions hold:

1. prim(C) ⊆ prim(D),
2. maxr (C) ≤ maxr (D),
3. minr (C) ≥ minr (D),
4. for all D0 ∈ ∃r (D) it holds that
   (a) ∃r (C) = ∅, minr (C) ≥ 1, and ∀r (C) v D0 ; or
   (b) ∃r (C) 6= ∅, and for each α ∈ Γr (C), there exists C 0 ∈ ∃r (C α ) such that
       C 0 u ∀r (C) v D0 .
5. if ∀r (C) 6≡ >, then
   (a) maxr (C) = 0; or
   (b) minr (C) < maxr (C) and ∀r (C) v ∀r (D); or
   (c) 0 < minr (C) = maxr (C) and ∀r (C) u C 0 v ∀r (D) for all C 0 ∈ ∃∗r (C).


   According to the definition of the difference (c.f Definition 2), the output of
the algorithm must verify:

                         ComputeDiff(C, D) u D ≡ C u D

   Let start with concept names and numerical restrictions. One can easily verify
conditions 1.−3. of Theorem 1:

    • The set of primitive concepts prim(ComputeDiff(C, D) u D) is equal to
      prim(ComputeDiff(C, D)) ∪ prim(D), which by definition is equal to
      (prim(C)\prim(D)) ∪ prim(D). This is again equal to prim(C) ∪ prim(D), the
      set of primitive concepts of (C u D).

    • For maxr (ComputeDiff(C, D) u D), we distinguish two cases: it is equal to
      maxr (D), when maxr (C) ≥ maxr (D) and to maxr (infr (C) u D), when
      maxr (C) < maxr (D). Both cases are equal to maxr (C u D).

    • Analogously to the precedent case, for minr (ComputeDiff(C, D) u D) we have
      two cases: equal to minr (D) when minr (C) ≤ minr (D) and to minr (supr (C)u
      D) when minr (C) > minr (D). Which again are equal to minr (C u D).
 Require: Two ALEN -concept descriptions C and D in ALEN -normal form
 Ensure: ComputeDiff(C, D)
  1: if C u D ≡ ⊥ then
  2:    ComputeDiff(C, D) := ⊥
  3: else
  4:    ComputeDiff(C, D) := uA∈prim(C−D) A u inf     dr (C−D) u supr (C−D) u
        ∀r.ComputeDiff(∀r (C), ∀r (D) u ∀∗r (C u D)) u E∈E 0 ∃r.E,
                                                           r
        where the value restriction is omitted in case ComputeDiff(∀r (C), ∀r (D) u
        ∀∗r (C u D)) ≡ > and:

      •    prim(C-D) :=prim(C)\prim(D);
                           >,        maxr (C) ≥ maxr (D),
      •    infr (C-D) =
                           infr (C), maxr (C) < maxr (D);
                          
                            >,        minr (C) ≤ minr (D),
      •    supr (C-D) =
                            supr (C), minr (C) > minr (D);
                                                            min∗r (E) < maxr (E),
                     
                      >,
             ∗
      •    ∀r (E) = ⊥,                                      maxr (E) = 0,
                       lcs({∀r (E) u E 0 | E 0 ∈ ∃∗r (E)}), 0 < min∗r (E) = maxr (E);
                     
      •    Er0 is computed as follows:

       Let Er := ∃r (C) = {C1 , ..., Cn } and we define C \E as being C without the
       conjunct ∃r.E.
  5:   for i = 1 to n do
  6:      if (i) Γr (C \Ci u D) 6= ∅ and there exists D0 ∈ ∃r ((C \Ci u D)α ) forall
          α ∈ Γr (C \Ci u D) with ∀r (C) u ∀r (D) u D0 v Ci , or
          (ii) Γr (C \Ci u D) = ∅ and minr (C u D) ≥ 1 and ∀r (C) u ∀r (D) v Ci ,
          then
  7:         Er := Er \{Ci };
  8:      end if
  9:   end for
 10:   Er0 = {E ∗ | E ∈ Er } where E ∗ := ComputeDiff(E, ∀r (C) u ∀r (D)).
 11: end if

                         Fig. 1. The algorithm ComputeDiff


Let us now consider the value restrictions. In case 0 < min∗r (C u D) = maxr (C u
D), a value restriction ∀r (C u D) can be deduced from C u D, which is the least
common subsumer of all the existential restrictions appearing in ∃∗r (C u D).

Let us illustrate this case by an example.


            Cex1 := ∃r.(P u A) u ∃r.(P u B) u ∃r.¬Q u ∀r.(A u B),
            Dex1 := ∃r.(¬P u A) u ∃r.Q u (≤ 2r).
As demonstrated in section 3, the merging of existential restrictions in Cex1 u
Dex1 induces the value restriction ∀r.A. The value restriction of the difference
C−D is then
    ComputeDiff(∀r (C), ∀r (D) u ∀∗r (C u D)) = ComputeDiff(A u B, A) = B

One can verify condition 5(c) of Theorem 1, namely that B u C 0 v A u B for all
C 0 in the set of merged existential restrictions ∃∗r (Cex1 u Dex1 ) (c.f. the example
in Section 3).

Let us now illustrate the two cases for existential restrictions by means of exam-
ples.

 ◦ To illustrate case (i) let us take the following descriptions:


          Cex2 := ∃r.(P u A u B)u ≤ 2r,
          Dex2 := ∃r.(P u A) u ∃r.(P u B) u ∃r.(¬P u A) u ∃r.Q u ∃r.¬Q.
                                                    \C
    C1 = P uAuB subsumes P uQuAuB from (Cex21 uDex2 )α1 and P u¬QuAuB
           \C
    from (Cex21 u Dex2 )α2 . We then have

                           ComputeDiff(Cex2 , Dex2 ) =≤ 2r.

 ◦ Finally we illustrate case (ii) with the following example

                          Cex3 := P u ∃r.(A1 u A2 ) u (≥ 3r),
                          Dex3 := ∀r.(A1 u A2 u A3 ).


                        ComputeDiff(Cex3 , Dex3 ) = P u (≥ 3r)
    While ∃r.(A1 u A2 u A3 ) can be induced from ∀r.(A1 u A2 u A3 ) u (≥ 3r),
    one can verify that

          ComputeDiff(Cex3 , Dex3 ) u Dex3 ≡ P u (≥ 3r) u ∀r.(A1 u A2 u A3 )
          v P u (≥ 3r) u ∃r.(A1 u A2 u A3 ) u ∀r.(A1 u A2 u A3 )
          v P u (≥ 3r) u ∃r.(A1 u A2 ) u ∀r.(A1 u A2 u A3 )
          ≡ Cex3 u Dex3

    The case w is obvious.

   The next lemma proves that the algorithm returns a difference that respects
the condition of the difference operator (Section 2). The proof can be found in
our technical report [8].

Lemma 1. Let C, D be two ALEN -concept descriptions in ALEN normal
form. Then, ComputeDiff(C, D) u D ≡ C u D.
5   Application to Patent Claims Comparison
In this section we are going to illustrate the difference on some simple patent
examples.
Example 1. Suppose we have an application for a chair with only one leg having
a seat made only from a light material. The corresponding concept description
is the following:

                = 1hasLeg u ∃hasSeat.(∀hasM aterial.Light),

that we need to compare to a previous patent describing a chair with three legs
and having one seat made of light wood, given by the description

           = 3hasLeg u ∃hasSeat.(∀hasM aterial.(W ood u Light)).

   The algorithm returns ≤ 1hasLeg which corresponds to the specific part of
the application.

Example 2. Let us consider a patent application for a watch with two types of
displays or more that are all bright. The corresponding description is:

                   >= 2hasDisplay u ∀hasDisplay.Bright,

to compare to a watch with an analogue and a non analogue display described
as:
            ∃hasDisplay.Analogical u ∃hasDisplay.¬Analogical.
    The difference is ∀hasDisplay.Bright. Indeed, we can deduce the numeri-
cal restriction >= 2hasDisplay from the second concept description while the
existential restrictions involve disjoint concepts.


6   Conclusion
In this paper, we have investigated the problem of computing the syntactic dif-
ference in ALEN . This work was motivated by an application in the context of
patent applications valuation. In this context one needs to compare the claims
of the patent with previous patents solving a similar problem. The textual de-
scriptions of the claims can be described in a formal way using ALEN -concept
descriptions. The difference operator provides the user with the parts that are
contained in his application and not in the state of the art. Our aim is not to
provide a decision-making tool regarding the novelty of a patent application, but
rather a tool that can help in pinpointing the differences and assist the user in
constructing his argumentation.
    We are implementing a prototype of the difference algorithm based on the DL
reasoner HermiT [2] for the subsumption test. A direction for Future work would
be to investigate the decision making process based on the results returned by
the difference and empirical rules derived from experts decisions.
References
 1. The European Patent Office website: http://www.epo.org
 2. The Hermit OWL Reasoner website: http://hermit-reasoner.com
 3. The United States Patent and Trademark Office website: http://www.uspto.gov/
 4. Brandt, S., Kusters, R., Turhan, A.Y.: Approximation and difference in description
    logics. In: T.Dean (ed.) Proceedings of the Eighth International Conference on
    Principles of Knowledge Representation and Reasoning (KR2002). pp. 203–214.
    Morgan Kaufmann, San Francisco, CA (2002)
 5. Colucci, S., Noia, T.D., Sciascio, E.D., Donini, F.M., Mongiello, M.: Concept ab-
    duction and contraction in description logics. In: Description Logics (2003)
 6. Di Noia, T., Di Sciascio, E., Donini, F.M.: Semantic matchmaking as non-
    monotonic reasoning: a description logic approach. J. Artif. Int. Res. 29(1), 269–307
    (2007), http://dl.acm.org/citation.cfm?id=1622606.1622615
 7. Giereth, M., Koch, S., Kompatsiaris, Y., Papadopoulos, S., Pianta, E., Serafini,
    L., Wanner, L.: A modular framework for ontology-based representation of patent
    information. In: JURIX (2007)
 8. Karam, N., Paschke, A.: Patent valuation using difference in ALEN. Tech. rep.,
    Freie Universitat Berlin, Germany (2012),
    http://www.csw.inf.fu-berlin.de/publications/TR-DiffAlen.pdf
 9. Kusters, R.: Non-standard inferences in description logics, vol. 2100 of Lecture
    Notes in Artificial Intelligence. Springer-Verlag (2001)
10. Kusters, R., Molitor, R.: Computing least common subsumers in ALEN. In: Pro-
    ceedings of the 17th international joint conference on Artificial intelligence - Vol-
    ume 1. pp. 219–224. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA
    (2001), http://portal.acm.org/citation.cfm?id=1642090.1642120
11. Kusters, R., Molitor, R.: Structural subsumption and least common subsumers in
    a description logic with existential and number restrictions. Studia Logica 81(2),
    227–259 (2005)
12. Leger, A., Rey, C., Toumani, F.: Semantic difference in ALN. In: Description Logics
    (2007)
13. Noia, T.D., Sciascio, E.D., Donini, F.M.: A tableaux-based calculus for abduction
    in expressive description logics: preliminary results. In: Description Logics (2009)
14. Teege, G.: Making the difference: a subtraction operation for description logics.
    In: Doyle, J., Sandewall, E., Torasso, P. (eds.) KR’94: Principles of Knowledge
    Representation and Reasoning, pp. 540–550. Morgan Kaufmann, San Francisco,
    California (1994)
15. Wanner, L., Baeza-Yates, R., Brügmann, S., Codina, J., Diallo, B., Escorsa, E.,
    Giereth, M., Kompatsiaris, Y., Papadopoulos, S., Pianta, E., Piella, G., Puhlmann,
    I., Rao, G., Rotard, M., Schoester, P., Serafini, L., Zervaki, V.: Towards content-
    oriented patent document processing. World Patent Information 30(1), 21–33
    (2008)