Patent Valuation Using Difference in ALEN Naouel Karam and Adrian Paschke Department of Mathematics and Computer Science, Freie Universität Berlin Königin-Luise-Str. 24-26, 14195 Berlin, Germany naouel.karam@fu-berlin.de,paschke@inf.fu-berlin.de Abstract. In this paper we present an approach for patent claim com- parison based on the difference operator in description logics. The claims are represented using ALEN . An algorithm computing the difference be- tween two ALEN concept descriptions is proposed and its usefulness for patent application valuation is described by means of some simple ex- amples. 1 Introduction More and more, patent material is made available in electronic format, and most of the time in textual form. Examples of such services are the European patent Office (EPO) [1] and the United States Patent and Trademark Office (USPTO) [3]. The need for a formal specification of the patent content arose recently to help retrieval, examination and classification of patent applications. In this context one needs to evaluate the innovative degree of a patent appli- cation. A claim is the part of a patent application where the inventor specifies the invention attributes and its features, defining what can be protected by the patent law. The aim is to show the non-obviousness to a person with basic do- main skills and the novelty of the application compared to the state of the art. Patent claims are described in natural language. Terms used in the claims must find clear support in the preceding description part of the patent appli- cation such that the meaning of a term is clearly specified. Previous work had attempted to represent patent material using OWL ontologies [7, 15]. Statistical tools performing patent application evaluation are based on a set of parameters provided by the user. Based on those parameters, the overall score of a patent application is calculated. In this paper we present a more formal approach for patent claim comparison based on description logics1 . Patent application claims are formally represented as concept descriptions. When comparing two applications, the differences need to be pinpointed in order to prove the novelty. We compute this difference by using the difference operator and return the result to the user. The result must be intuitive and easy to understand in order to help the user construct his argumentation. 1 This work has been partially supported by the Fact Screening and Transformation Project (FSTP) funded by the Teles Pri AG: www.fstp-expert-system.com The difference operator computes the part of a concept that is not contained into another one. The difference operator has been defined in [14] as being the most general description that can be added to the second operand to obtain equivalence with the first. A second definition of the difference involving a syn- tactic criteria for minimality has been introduced in [4]. In our context we need to describe existential, universal and numerical re- strictions. The difference operator has been investigated for description logics al- lowing either numerical restrictions [12] or existential restrictions [4]. We propose an algorithm for ALEN based on the structural characterization of subsumption for ALEN [10, 11]. The paper is organized as follows: in section 2, we give the formal definitions of the difference operator and discuss the motivation for our choice. Section 3 recalls useful results for ALEN and introduces some notations. In section 4 we provide an algorithm for computing the difference in ALEN . Some examples of patent claims comparison are given in section 5 and Section 6 concludes the paper. 2 Operator Choice The need for providing parts of a description that are not contained in another one has been motivated by different applications and different definitions exist with their advantages and drawbacks. Informally speaking, the difference between two concept descriptions is the information contained in the first description and not in the second. The dif- ference operator allows to remove from a given description all the information contained in another description. The difference operation between two concept descriptions was first introduced by Teege [14]. Definition 1. (semantic difference) The difference between two concept descrip- tions C and D with C v D is given by C − D := max{E | E u D ≡ C} where max is defined with respect to subsumption. Later, the work in [9, 4] proposed a refinement of this definition by allow- ing the difference between incomparable descriptions (i.e. D is not required to subsume C) and taking the syntactic minimum (w.r.t. a subdescription ordering d ) instead of a semantic maximum. Definition 2. (syntactic difference) The difference between two incomparable concept descriptions C and D is defined as C − D := min{E | E u D ≡ C u D} where min is defined with respect to a so-called subdescription ordering (de- noted by d ), used to compare syntactical structures that we recall in the next definition. Definition 3. (Subdescriptions) Let C be an ALEN -concept description, C b is b d C) iff, a subdescription of C ( C 1. Cb = ⊥, or 2. Cb is obtained by removing from the top-level of C: >, concept names, num- ber restrictions, value restrictions, existential restrictions. For the remaining restrictions ∀r.E or ∃r.E, replace the concept description E with a subde- scription of E. As an example, consider the ALEN -concept description C: P u P u ∀r.P u ∃r.(P u ∃r.Q)u ≤ 3r A possible subdescription: b = P u ∀r.P u ∃r.(∃r.Q)u ≤ 3r C is obtained from C by removing P from the top-level of C and P from the subexpression ∃r.(P u ∃r.Q). Note that C b is equivalent to C and there exists no subdescription of C b owning this property. Cb is the minimal subdescription which is equivalent to C. In [12], the authors defined a semantic difference for ALN . They also compare the semantic and syntactic variants and state that the two approaches differ mainly on the bottom handling. Indeed, the bottom decomposition leads to more than one result for the semantic difference. On the other hand, the syntactic difference always generates a unique result that is a subdescription of the input description and hence more intuitive and comprehensive for the user. Another operator called “concept abduction“ has been introduced to provide an explanation for matchmaking results in an e-marketplace. Abduction was first defined for ALN [5] and has been recently extended to a more expressive DL SH, using tableau calculus [13]. This operator does not require subsumption between the concept descriptions neither and thus is more general than the semantic difference. As stated in [6], the result of the difference is included in the set of solutions of a CAP. In our context, we chose to use the syntactic difference for three reasons: 1. It does not require subsumption between the concept descriptions being compared, which is more realistic in real world applications, 2. it generates a unique result which is a syntactic variation of the input and hence more intuitive for the user, 3. when the inputs are free from insatisfiable subexpressions the result is equiv- alent to the semantic difference. 3 Preliminaries In this section we recall some results for subsumption in ALEN and define some notations that will be useful to derive the difference algorithm. The normal form of an ALEN concept description is obtained by applying the following transformation rules: – (≥ mr) u (≥ nr) → (≥ nr) if n ≥ m, – (≤ mr) u (≤ nr) → (≤ nr) if n ≤ m, and – ∀r.C u ∀r.D → ∀r.(C u D). To access the different component of a concept description, we will use the following notations: – prim(C) denotes the set of concept names and the bottom concept occurring on the top-level of C, – infr (C) = (≤ nr) if there exists a number restriction of the form (≤ nr) on the top-level of C, infr (C) = > otherwise, – supr (C) = (≥ nr) if there exists a number restriction of the form (≥ nr) on the top-level of C, supr (C) = > otherwise, – minr (C) = max{k | C v (≥ kr)}, – maxr (C) = min{k | C v (≤ kr)}, – ∀r (C) = E if there exists a value restriction ∀r.E on the top-level of C; ∀r (C) = > otherwise, – ∃r (C) = {C 0 | ∃r.C 0 occurs on the top-level of C}. For the sake of clarity, we will assume that the set NR is reduced to the role r. The algorithm can be easily generalized to arbitrary sets of role names. The difference algorithm is based on the structural characterization of sub- sumption in ALEN given in [10]. In the sequel we present the intuitions used to compute induced concept descriptions and recall the structural subsumption characterization. We also introduce some notations that we will use in the dif- ference algorithm. The main issue when dealing with ALEN is to compute the ”non-trivial” concept descriptions subsuming a concept description C; such concepts do not appear in the top-level of C and are called ”induced”. Let us recall this process by means of examples. Number restrictions: They can be induced if two existential restrictions involve disjoint concepts. For example, if C := ∃r.P u ∃r.¬P then ≥ 2r is induced by C. Let us note the induced number restrictions as follows: min∗r (C). Note that if min∗r (C) = k, we have C v≥ kr which means that min∗r (C) is taken into account in minr (C) defined above. Existential restrictions: Due to ≤-number restrictions, if the number of existential restrictions in the top- level of a concept C is greater than the ≤-number restriction, those existential restrictions must be merged. For example, if we have: C := ∃r.(P u A) u ∃r.(P u B) u ∃r.(¬P u A) u ∃r.Q u ∃r.¬Qu ≤ 2r, the merging process results in new concepts descriptions where the only con- sistent ones are: ∃r.(P u Q u A u B) u ∃r.(¬P u ¬Q u A) ≤ 2r, and ∃r.(P u ¬Q u A u B) u ∃r.(¬P u Q u A) ≤ 2r. In [10], each merging is represented by a mapping α leading to a concept C α and the set of mappings is denoted by Γr (C). The mappings in Γr (C) are required to obey the following conditions: α(i) 6= ∅ for all 1 ≤ i ≤ n; 1. S 2. d1≤i≤n α(i) = {1, ..., m} and α(i) ∩ α(j) = ∅ for all 1 ≤ i < j ≤ n; 3. j∈α(i) Cj u ∀r (C) 6≡ ⊥ for all 1 ≤ i ≤ n, where n := min{maxr (C), | ∃r (C) |} and m :=| ∃r (C) |. In our example, this set consists of two mappings α1 and α2 leading to the two concepts C α1 and C α2 above with C ≡ C α1 t C α2 . The set of C 0 such that ∃r.C 0 in C α for all mappings is noted as follows: [ ∃∗r (C) := ∃r (C α ) α∈Γr (C) For our example we obtain: ∃∗r (C) := {P u Q u A u B, ¬P u ¬Q u A, P u ¬Q u A u B, ¬P u Q u A} In case ∃r (C) = ∅ (there is no existential restriction on the top level of C) one needs to take into account the ≥-restrictions together with value restrictions to deduce non-trivial existential restrictions. Let us illustrate the case on the following concept: D := (≥ 2r) u ∀r.(A u B). The instances of D have at least two r-successors and due to the value restriction we can deduce that D v ∃r.(A u B). This concludes the existential restrictions case. Value restrictions: It is clear from the former example that every instance of C has exactly two r-successors, in that case one can deduce from the existential restrictions in C α1 and C α2 that all r-successors belong to A and thus that the value restriction ∀r.A is induced by C. This deduction can be performed only when all r-successors are known and it is represented by the condition: maxr (C) = min∗r (C). The second case where a value restriction can be induced is when maxr (C) = 0 then C v ∀r.⊥. 4 Syntactic Difference between ALEN -Concept Descriptions The algorithm computing the difference between two ALEN -concept descrip- tions C and D is depicted in figure 1. We extended the algorithm proposed in [9] to compute the difference between two ALE-concept descriptions. Before going into detail about the algorithm, and as it is based on the struc- tural characterization of the subsumption [10], let us recall the conditions for each kind of conjunct occurring on the top level of the concept descriptions. Theorem 1. Let C, D be two ALEN -concept descriptions with ∃r (C) = {C1 , ..., Cn }. Then C v D iff C ≡ ⊥, or D ≡ >, or the following conditions hold: 1. prim(C) ⊆ prim(D), 2. maxr (C) ≤ maxr (D), 3. minr (C) ≥ minr (D), 4. for all D0 ∈ ∃r (D) it holds that (a) ∃r (C) = ∅, minr (C) ≥ 1, and ∀r (C) v D0 ; or (b) ∃r (C) 6= ∅, and for each α ∈ Γr (C), there exists C 0 ∈ ∃r (C α ) such that C 0 u ∀r (C) v D0 . 5. if ∀r (C) 6≡ >, then (a) maxr (C) = 0; or (b) minr (C) < maxr (C) and ∀r (C) v ∀r (D); or (c) 0 < minr (C) = maxr (C) and ∀r (C) u C 0 v ∀r (D) for all C 0 ∈ ∃∗r (C). According to the definition of the difference (c.f Definition 2), the output of the algorithm must verify: ComputeDiff(C, D) u D ≡ C u D Let start with concept names and numerical restrictions. One can easily verify conditions 1.−3. of Theorem 1: • The set of primitive concepts prim(ComputeDiff(C, D) u D) is equal to prim(ComputeDiff(C, D)) ∪ prim(D), which by definition is equal to (prim(C)\prim(D)) ∪ prim(D). This is again equal to prim(C) ∪ prim(D), the set of primitive concepts of (C u D). • For maxr (ComputeDiff(C, D) u D), we distinguish two cases: it is equal to maxr (D), when maxr (C) ≥ maxr (D) and to maxr (infr (C) u D), when maxr (C) < maxr (D). Both cases are equal to maxr (C u D). • Analogously to the precedent case, for minr (ComputeDiff(C, D) u D) we have two cases: equal to minr (D) when minr (C) ≤ minr (D) and to minr (supr (C)u D) when minr (C) > minr (D). Which again are equal to minr (C u D). Require: Two ALEN -concept descriptions C and D in ALEN -normal form Ensure: ComputeDiff(C, D) 1: if C u D ≡ ⊥ then 2: ComputeDiff(C, D) := ⊥ 3: else 4: ComputeDiff(C, D) := uA∈prim(C−D) A u inf dr (C−D) u supr (C−D) u ∀r.ComputeDiff(∀r (C), ∀r (D) u ∀∗r (C u D)) u E∈E 0 ∃r.E, r where the value restriction is omitted in case ComputeDiff(∀r (C), ∀r (D) u ∀∗r (C u D)) ≡ > and: • prim(C-D) :=prim(C)\prim(D); >, maxr (C) ≥ maxr (D), • infr (C-D) = infr (C), maxr (C) < maxr (D);  >, minr (C) ≤ minr (D), • supr (C-D) = supr (C), minr (C) > minr (D); min∗r (E) < maxr (E),   >, ∗ • ∀r (E) = ⊥, maxr (E) = 0, lcs({∀r (E) u E 0 | E 0 ∈ ∃∗r (E)}), 0 < min∗r (E) = maxr (E);  • Er0 is computed as follows: Let Er := ∃r (C) = {C1 , ..., Cn } and we define C \E as being C without the conjunct ∃r.E. 5: for i = 1 to n do 6: if (i) Γr (C \Ci u D) 6= ∅ and there exists D0 ∈ ∃r ((C \Ci u D)α ) forall α ∈ Γr (C \Ci u D) with ∀r (C) u ∀r (D) u D0 v Ci , or (ii) Γr (C \Ci u D) = ∅ and minr (C u D) ≥ 1 and ∀r (C) u ∀r (D) v Ci , then 7: Er := Er \{Ci }; 8: end if 9: end for 10: Er0 = {E ∗ | E ∈ Er } where E ∗ := ComputeDiff(E, ∀r (C) u ∀r (D)). 11: end if Fig. 1. The algorithm ComputeDiff Let us now consider the value restrictions. In case 0 < min∗r (C u D) = maxr (C u D), a value restriction ∀r (C u D) can be deduced from C u D, which is the least common subsumer of all the existential restrictions appearing in ∃∗r (C u D). Let us illustrate this case by an example. Cex1 := ∃r.(P u A) u ∃r.(P u B) u ∃r.¬Q u ∀r.(A u B), Dex1 := ∃r.(¬P u A) u ∃r.Q u (≤ 2r). As demonstrated in section 3, the merging of existential restrictions in Cex1 u Dex1 induces the value restriction ∀r.A. The value restriction of the difference C−D is then ComputeDiff(∀r (C), ∀r (D) u ∀∗r (C u D)) = ComputeDiff(A u B, A) = B One can verify condition 5(c) of Theorem 1, namely that B u C 0 v A u B for all C 0 in the set of merged existential restrictions ∃∗r (Cex1 u Dex1 ) (c.f. the example in Section 3). Let us now illustrate the two cases for existential restrictions by means of exam- ples. ◦ To illustrate case (i) let us take the following descriptions: Cex2 := ∃r.(P u A u B)u ≤ 2r, Dex2 := ∃r.(P u A) u ∃r.(P u B) u ∃r.(¬P u A) u ∃r.Q u ∃r.¬Q. \C C1 = P uAuB subsumes P uQuAuB from (Cex21 uDex2 )α1 and P u¬QuAuB \C from (Cex21 u Dex2 )α2 . We then have ComputeDiff(Cex2 , Dex2 ) =≤ 2r. ◦ Finally we illustrate case (ii) with the following example Cex3 := P u ∃r.(A1 u A2 ) u (≥ 3r), Dex3 := ∀r.(A1 u A2 u A3 ). ComputeDiff(Cex3 , Dex3 ) = P u (≥ 3r) While ∃r.(A1 u A2 u A3 ) can be induced from ∀r.(A1 u A2 u A3 ) u (≥ 3r), one can verify that ComputeDiff(Cex3 , Dex3 ) u Dex3 ≡ P u (≥ 3r) u ∀r.(A1 u A2 u A3 ) v P u (≥ 3r) u ∃r.(A1 u A2 u A3 ) u ∀r.(A1 u A2 u A3 ) v P u (≥ 3r) u ∃r.(A1 u A2 ) u ∀r.(A1 u A2 u A3 ) ≡ Cex3 u Dex3 The case w is obvious. The next lemma proves that the algorithm returns a difference that respects the condition of the difference operator (Section 2). The proof can be found in our technical report [8]. Lemma 1. Let C, D be two ALEN -concept descriptions in ALEN normal form. Then, ComputeDiff(C, D) u D ≡ C u D. 5 Application to Patent Claims Comparison In this section we are going to illustrate the difference on some simple patent examples. Example 1. Suppose we have an application for a chair with only one leg having a seat made only from a light material. The corresponding concept description is the following: = 1hasLeg u ∃hasSeat.(∀hasM aterial.Light), that we need to compare to a previous patent describing a chair with three legs and having one seat made of light wood, given by the description = 3hasLeg u ∃hasSeat.(∀hasM aterial.(W ood u Light)). The algorithm returns ≤ 1hasLeg which corresponds to the specific part of the application. Example 2. Let us consider a patent application for a watch with two types of displays or more that are all bright. The corresponding description is: >= 2hasDisplay u ∀hasDisplay.Bright, to compare to a watch with an analogue and a non analogue display described as: ∃hasDisplay.Analogical u ∃hasDisplay.¬Analogical. The difference is ∀hasDisplay.Bright. Indeed, we can deduce the numeri- cal restriction >= 2hasDisplay from the second concept description while the existential restrictions involve disjoint concepts. 6 Conclusion In this paper, we have investigated the problem of computing the syntactic dif- ference in ALEN . This work was motivated by an application in the context of patent applications valuation. In this context one needs to compare the claims of the patent with previous patents solving a similar problem. The textual de- scriptions of the claims can be described in a formal way using ALEN -concept descriptions. The difference operator provides the user with the parts that are contained in his application and not in the state of the art. Our aim is not to provide a decision-making tool regarding the novelty of a patent application, but rather a tool that can help in pinpointing the differences and assist the user in constructing his argumentation. We are implementing a prototype of the difference algorithm based on the DL reasoner HermiT [2] for the subsumption test. A direction for Future work would be to investigate the decision making process based on the results returned by the difference and empirical rules derived from experts decisions. References 1. The European Patent Office website: http://www.epo.org 2. The Hermit OWL Reasoner website: http://hermit-reasoner.com 3. The United States Patent and Trademark Office website: http://www.uspto.gov/ 4. Brandt, S., Kusters, R., Turhan, A.Y.: Approximation and difference in description logics. In: T.Dean (ed.) Proceedings of the Eighth International Conference on Principles of Knowledge Representation and Reasoning (KR2002). pp. 203–214. Morgan Kaufmann, San Francisco, CA (2002) 5. Colucci, S., Noia, T.D., Sciascio, E.D., Donini, F.M., Mongiello, M.: Concept ab- duction and contraction in description logics. In: Description Logics (2003) 6. Di Noia, T., Di Sciascio, E., Donini, F.M.: Semantic matchmaking as non- monotonic reasoning: a description logic approach. J. Artif. Int. Res. 29(1), 269–307 (2007), http://dl.acm.org/citation.cfm?id=1622606.1622615 7. Giereth, M., Koch, S., Kompatsiaris, Y., Papadopoulos, S., Pianta, E., Serafini, L., Wanner, L.: A modular framework for ontology-based representation of patent information. In: JURIX (2007) 8. Karam, N., Paschke, A.: Patent valuation using difference in ALEN. Tech. rep., Freie Universitat Berlin, Germany (2012), http://www.csw.inf.fu-berlin.de/publications/TR-DiffAlen.pdf 9. Kusters, R.: Non-standard inferences in description logics, vol. 2100 of Lecture Notes in Artificial Intelligence. Springer-Verlag (2001) 10. Kusters, R., Molitor, R.: Computing least common subsumers in ALEN. In: Pro- ceedings of the 17th international joint conference on Artificial intelligence - Vol- ume 1. pp. 219–224. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2001), http://portal.acm.org/citation.cfm?id=1642090.1642120 11. Kusters, R., Molitor, R.: Structural subsumption and least common subsumers in a description logic with existential and number restrictions. Studia Logica 81(2), 227–259 (2005) 12. Leger, A., Rey, C., Toumani, F.: Semantic difference in ALN. In: Description Logics (2007) 13. Noia, T.D., Sciascio, E.D., Donini, F.M.: A tableaux-based calculus for abduction in expressive description logics: preliminary results. In: Description Logics (2009) 14. Teege, G.: Making the difference: a subtraction operation for description logics. In: Doyle, J., Sandewall, E., Torasso, P. (eds.) KR’94: Principles of Knowledge Representation and Reasoning, pp. 540–550. Morgan Kaufmann, San Francisco, California (1994) 15. Wanner, L., Baeza-Yates, R., Brügmann, S., Codina, J., Diallo, B., Escorsa, E., Giereth, M., Kompatsiaris, Y., Papadopoulos, S., Pianta, E., Piella, G., Puhlmann, I., Rao, G., Rotard, M., Schoester, P., Serafini, L., Zervaki, V.: Towards content- oriented patent document processing. World Patent Information 30(1), 21–33 (2008)