=Paper= {{Paper |id=Vol-1265/paper2 |storemode=property |title=A Multi-strategy Approach for Detecting and Correcting Conservativity Principle Violations in Ontology Alignments |pdfUrl=https://ceur-ws.org/Vol-1265/owled2014_submission_2.pdf |volume=Vol-1265 |dblpUrl=https://dblp.org/rec/conf/owled/SolimandoJG14 }} ==A Multi-strategy Approach for Detecting and Correcting Conservativity Principle Violations in Ontology Alignments== https://ceur-ws.org/Vol-1265/owled2014_submission_2.pdf
       A Multi-strategy Approach for Detecting and
      Correcting Conservativity Principle Violations in
                  Ontology Alignments?

      Alessandro Solimando1 , Ernesto Jiménez-Ruiz2 , and Giovanna Guerrini1
                            1
                             DIBRIS, Università di Genova, Italy
               2
                   Department of Computer Science, University of Oxford, UK



       Abstract. In order to enable interoperability between ontology-based systems,
       ontology matching techniques have been proposed. However, when the generated
       mappings suffer from logical flaws, their usefulness may be diminished. In this
       paper we present a multi-strategy approach to detect and correct violations of the
       so-called conservativity principle where novel subsumption entailments between
       named concepts in one of the input ontologies are considered as unwanted. The
       practical applicability of the proposed approach is experimentally demonstrated
       on the datasets from the Ontology Alignment Evaluation Initiative (OAEI).


1   Introduction

Ontologies play a key role in the development of the Semantic Web and are being used
in many diverse application domains, ranging from biomedicine to energy industry.
An application domain can be modeled with different points of view and purposes.
This situation usually leads to the development of different ontologies that intuitively
overlap, but they use different naming and modeling conventions.
    The problem of (semi-)automatically computing mappings between independently
developed ontologies is usually referred to as the ontology matching problem. A num-
ber of sophisticated ontology matching systems have been developed in the last years
[5, 23]. Ontology matching systems, however, rely on lexical and structural heuristics
and the integration of the input ontologies and the mappings may lead to many un-
desired logical consequences. In [12] three principles were proposed to minimize the
number of potentially unintended consequences, namely: (i) consistency principle, the
mappings should not lead to unsatisfiable classes in the integrated ontology, (ii) locality
principle, the mappings should link entities that have similar neighbourhoods, (iii) con-
servativity principle, the mappings should not introduce new semantic relationships
between concepts from one of the input ontologies.
    The occurrence of these violations is frequent, even in the reference mapping sets
of the Ontology Alignment Evaluation Initiative3 (OAEI). Also manually curated align-
ments, such as UMLS-Metathesaurus [1] (UMLS), a comprehensive effort for integrat-
?
   This work was supported by the EU FP7 IP project Optique (no. 318338) and by the Italian
   PRIN 2010LHT4KM CINA.
 3
   http://oaei.ontologymatching.org/
ing biomedical knowledge bases, suffer from these violations. Violations of these prin-
ciples may hinder the usefulness of ontology mappings [25]. These principles has been
actively investigated in the last years (e.g., [17, 9, 12, 10, 16, 21]).
    In this paper we combine our previous approach [25] for detecting and solving con-
servativity principle violations, based on the assumption of disjointness [22] and the
projection of the input ontologies to Horn propositional logic, with a new one based on
Answer Set Programming [24], addressing violations between entities already involved
in a subsumption relationship. Our evaluation supports the effectiveness of the com-
bined approach in the detection and correction of an extended notion of conservativity
violations, without harming efficiency, even on the largest test cases of OAEI.
    The remainder of the paper is organised as follows. Section 2 presents the prelim-
inaries. Section 3 introduces our motivating scenario. Section 4 describes our method.
Section 5 presents the conducted evaluation. Section 6 provides a comparison with rel-
evant related work. Finally, Section 7 gives some conclusions and future work lines.


2    Preliminaries

This section introduces the representation of ontology mappings, the notions of seman-
tic difference and conservativity principle, and the basics on Answer Set Programming.
Representation of Ontology Mappings. Mappings are 5-tuples of the form
hid, e1 , e2 , n, ρi, with id a unique identifier, e1 , e2 entities in the vocabulary or signature
of the relevant input ontologies (i.e., e1 ∈ Sig(O1 ) and e2 ∈ Sig(O2 )), n a confidence
measure between 0 and 1, and ρ a relation between e1 and e2 , typically subsumption,
equivalence, or disjointness. Mappings are usually represented as OWL 2 axioms for
reusing the currently available OWL 2 reasoning infrastructure, but alternative formal
semantics have been proposed (e.g., [2]).
Semantic Consequences of the Integration. The ontology resulting from the inte-
gration of two ontologies O1 and O2 via a set of mappings M may entail axioms that
do not follow from O1 , O2 , or M alone. These new semantic consequences can be
captured by the notion of deductive difference [15].
    Intuitively, the deductive difference between O and O0 w.r.t. a signature Σ is the set
of entailments constructed over Σ that do not hold in O, but do hold in O0 . However,
no algorithm is available for computing it for DLs more expressive than EL, for which
the existence of tractable algorithms is still open [15]
Definition 1 (Approximation of the Deductive Difference). Let A, B be atomic con-
cepts from Σ ∪ {>, ⊥}, O and O0 be two OWL 2 ontologies, with Σ is a signature.
We define the approximation of the Σ-deductive difference between O and O0 (denoted
diff ≈      0
     Σ (O, O ) as the set of axioms of the form A v B satisfying: (i) O 6|= A v B, and
(ii) O0 |= A v B.
    In order to avoid the drawbacks of the deductive difference, in this paper we rely on
the approximation given in Definition 1. This approximation only requires comparing
the classification hierarchies of O and O0 provided by an OWL 2 reasoner, and it has
successfully been used in the past in the context of ontology integration [11].
Conservativity Principle. The conservativity principle (as formulated in [12]) states
that the integrated ontology OU = O1 ∪ O2 ∪ M should not induce any change in
the concept hierarchies of the input ontologies. That is, the sets diff ≈Σ1 (O1 , OU ) and
diff ≈
     Σ2 (O 2 , OU ) must be empty  for signatures Σ 1 = Sig(O 1 ) and Σ 2  = Sig(O2 ), re-
spectively. In [25] we proposed a basic variant of the conservativity principle where OU
is required not to introduce new subsumption relationships between concepts from one
of the input ontologies, unless they were already involved in a subsumption relationship
or they shared a common descendant. In this paper, in addition to these basic violations,
we also aim at addressing violations between concepts already involved in a subsump-
tion relationship (i.e., resulting in an equivalence between them), denoted as equiva-
lence conservativity principle violations, or simply equivalence violations. As in [25],
we assume that the mappings M are coherent w.r.t. O1 and O2 (i.e., diff ≈ Σ (O1 ∪O2 , OU )
does not contain any axiom A v ⊥, for any A ∈ Σ = Sig(O1 ∪ O2 )).
Definition 2 (Conservativity Principle Violations). Let O be one of the input ontolo-
gies and Sig(O) be its signature, let OU be the integrated ontology, and let A, B be
atomic concepts in Sig(O) ∪ {>, ⊥}. We define two sets of violations of OU w.r.t. O:
    – basic violations, denoted basicViol(O, OU ), as the set of A v B axioms satisfying:
      (i) A v B ∈ diff ≈    Sig(O) (O, OU ), (ii) O 6|= B v A, and (iii) there is no C in
      Sig(O) ∪ {>, ⊥} s.t. O |= C v A, and O |= C v B.
    – equivalence violations, denoted eqViol(O, OU ), as the set of A ≡ B axioms sat-
      isfying: (i) OU |= A ≡ B, (ii) A v B ∈ diff ≈      Sig(O) (O, OU ) and/or B v A ∈
      diff ≈
           Sig(O) (O, OU ).
As discussed, for basic violations the assumption of disjointness can be applied, that
is, if two atomic concepts A, B from one of the input ontologies are not involved in a
subsumption relationship nor share a common subconcept (excluding ⊥) they can be
considered as disjoint. Hence, the repair of these violations can be reduced to a con-
sistency repair, if the input ontologies are extended with sufficient disjointness axioms.
The same does not necessarily hold for equivalence violations, for which a suitable
repair algorithm, based on ASP, is introduced (see Section 4.2).
Answer Set Programming. ASP is a declarative programming language able to ex-
press complex search problems up to Σ2P complexity class (and its complement Π2P [6]).
Each ASP program is a set of rules of the form Head ← Body, where Head can be a
disjunction of atoms. More formally, a disjunctive rule has the form a1 ∨ . . . ∨ an ←
b1 , . . . , bk , not bk+1 , . . . , not bm . Rules with an empty Body (i.e., k = m = 0) are
called facts. ASP rules can also include optimization-oriented constructs for minimiz-
ing numeric variables or the cardinality of a predicate. These rules are called soft con-
straints, in opposition to hard constraints, which are rules with an empty Head. In ASP,
solutions are called stable models [6].

3     Running Example
In this section, we show an example involving the different kinds of conservativity
principle violations arising in the integration of two ontologies modeling knowledge in
oil and gas industry [25]. Table 1 shows the relevant fragments of the two ontologies.
                  Table 1. Fragments of the ontologies used in Optique project [25].
                Ontology O 1                                           Ontology O 2
          α1 WellBore v ∃belongsTo.Well                          β1 Exploration well v Well
          α2 WellBore v ∃hasOperator.Operator                    β2 Explor borehole v Borehole
          α3 WellBore v ∃locatedIn.Field                         β3 Appraisal exp borehole v Explor borehole
          α4 AppraisalWellBore v WellBore                        β4 Appraisal well v Well
          α5 ExplorationWellBore v WellBore                      β5 Field v ∃hasFieldOperator.Field operator
          α6 Operator v Owner                                    β6 Field operator u Owner v Field owner
          α7 Operator v Company                                  β7 Company v Field operator
          α8 Field v ∃hasOperator.Company                        β8 Field owner v Owner
          α9 Field v ∃hasOwner.Owner                             β9 Borehole v Continuant t Occurrent

                   Table 2. Ontology mappings for the vocabulary in O1 and O2 .
                                                       Mappings M
                       id      e1                          e2                                       n     ρ
                       m1 O1 :Well                O2 :Well                 0.9 ≡
                       m2 O1 :WellBore            O2 :Borehole             0.7 ≡
                       m3 O1 :ExplorationWellBore O2 :Exploration well     0.6 v
                       m4 O1 :ExplorationWellBore O2 :Explor borehole      0.8 ≡
                       m5 O1 :AppraisalWellBore O2 :Appraisal exp borehole 0.7 ≡
                       m6 O1 :Field               O2 :Field                0.9 ≡
                       m7 O1 :Operator            O2 :Field operator       0.7 w
                       m8 O1 :Company             O2 :Company              0.9 ≡
                       m9 O1 :hasOperator         O2 :hasFieldOperator     0.6 ≡
                       m10 O1 :Owner              O2 :Owner                0.9 ≡

                       Table 3. Example of conservativity principle violations.
σ Entailment:                                                     follows from:        basicViol(O i , O U )?eqViol(O i , O U )?
σ1 O2 :Explor borehole v O2 :Exploration well          m3 , m4                                   YES                NO
σ2 O1 :AppraisalWellBore v O1 :ExplorationWellBore β3 , m4 , m5                                  YES                NO
σ3 O2 :Field operator v O2 :Field owner           α6 , β6 , m7 , m10                             YES                NO
σ4 O1 :Company ≡ O1 :Operator
                                                  α7 , β7 , m7 , m8                               NO                YES
σ5 O2 :Field operator ≡ O2 :Company
σ6 O1 :Company v O1 :Owner                              σ4 , α 6                                 YES                NO
σ7 O2 :Company v O2 :Field owner                        σ3 , σ 5                                 YES                NO



                                                     0.9
                                                                         1
                                        Company1           Company2          F ield operator2
                                                     0.9
                                    1
                                                           0.7
                            Operator1                                                     F ield owner2
                                        1
                                                                                           1
                                            Owner1                               Owner2
                                                                 0.9

                                                                 0.9

Fig. 1. Graph representation of the fragment of the aligned ontology of Tables 1 involved in
conservativity violations. Dashed arcs represent inferred axioms, while red arcs are those involved
in equivalence violations. Each non-inferred arc is labeled with its confidence value.



    Assume that the set of mappings M in Table 2, between O1 and O2 , is generated
by an off-the-shelf ontology alignment system. As described in Section 2, mappings are
represented as 5-tuples; for example the mapping m2 suggests an equivalence relation-
ship between the entities O1 :WellBore and O2 :Borehole, with confidence 0.7.
Algorithm 1 Algorithm to detect and solve basic violations
Input: O1 , O2 : input ontologies; M: (coherent) input mappings;
Output: M0 : output mappings; R≈ : approximate repair; disj: number of disjointness rules
 1: hO10 , O20 i := ModuleExtractor(O1 , O2 , M)
 2: hP1 , P2 i := PropositionalEncoding(O10 , O20 )
 3: SI1 := StructuralIndex(O10 ), SI2 := StructuralIndex(O20 )
 4: SIU := StructuralIndex(O10 ∪ O20 ∪ M)
 5: hP1d , disj1 i := DisjointAxiomsExtension(P1 , SI1 , SIU )                                     . See Algorithm 2
 6: hP2d , disj2 i := DisjointAxiomsExtension(P2 , SI2 , SIU )
 7: hM0 , R≈ i := MappingRepair(P1d , P2d , M)                                              . See Algorithm 2 in [13]
 8: return hM0 , R≈ , disj1 + disj2 i




    The integrated ontology OU = O1 ∪ O2 ∪ M, however, violates the conservativity
principle, according to Definition 2, and introduces unwanted subsumption relation-
ships (see Table 3). Note that the entailments σ4 and σ5 are equivalence violations not
belonging to the set of basic violations, since O1 :Company and O1 :Operator (resp.
O2 :Field operator and O2 :Company) are involved in a subsumption relationship in O1
(resp. O2 ). As already discussed, these entailments cannot be repaired by the approach
introduced in [25], and in addition they also lead to basic violations (σ6 and σ7 ), as
shown in Figure 1. Note that equivalence violations σ4 , σ5 can be repaired by the ASP-
based approach. As shown in [25], any of these conservativity violations may hinder
the usefulness of the generated mappings, in a query answering scenario.


4      Methods

This section describes the algorithms composing our multi-strategy approach, one ad-
dressig basic violations (Section 4.1), the other the equivalence ones (Section 4.2).


4.1     Approximate Repair of Basic Conservativity Principle Violations

We have reduced the problem of detecting and solving basic violations, to a mapping
(incoherence) repair problem. Currently, our method uses the indexing and reasoning
techniques of LogMap, an ontology matching and mapping repair system [10, 13].
    Algorithm 1 shows the pseudocode of the implemented method. The algorithm ac-
cepts as input two OWL 2 ontologies, O1 and O2 , and a set of mappings M which are
coherent4 w.r.t. O1 and O2 . The problem size is reduced by extracting two locality-
based modules [3] (O10 and O20 ) using the entities involved in the mappings M as
seed signatures (line 1, Algorithm 1). The output is the number of added disjointness
during the process disj, a set of mappings M0 , and an approximate repair R≈ s.t.
M0 = M \ R≈ . The repair R≈ aims at solving most of the basic violations of M w.r.t.
O1 and O2 . We next describe the techniques used in each step.
Propositional Horn Encoding. The modules O10 and O20 are encoded as the Horn
propositional theories, P1 and P2 (line 2 in Algorithm 1). The encoding includes rules
of the form A1 ∧ . . . ∧ An → B, with Ai , B atomic concepts. For example, the concept
 4
     Note that M may be the result of a prior mapping (incoherence) repair process.
hierarchy provided by an OWL 2 reasoner (e.g., [18, 14]) is encoded as A → B rules,
while explicit disjointness axioms between atomic concepts are encoded as Ai ∧ Aj →
false. Note that input mappings M can already be seen as propositional implications.
Example 1. Consider the ontologies and mappings in Tables 1 and 2. Axiom β6 is en-
coded as rule Field operator ∧ Owner → Field owner, while the mapping m2 is trans-
lated into rules O1 :WellBore → O2 :Borehole, and O2 :Borehole → O1 :WellBore.

Structural Index. The concept hierarchies provided by an OWL 2 reasoner (excluding
⊥) and the explicit disjointness axioms of the modules O10 and O20 are efficiently in-
dexed using an interval labelling schema (line 3 in Algorithm 1). This structural index
allows us to answer many entailment queries over the concept hierarchy as an index
lookup operation (i.e., without the need of an OWL 2 reasoner).
Disjointness Axioms Extension. In order to reduce the (basic) conservativity problem
to a mapping incoherence repair problem, following the notion of assumption of dis-
jointness, we need to automatically add sufficient disjointness axioms into each mod-
ule Oi0 . However, additional disjointness axioms δ may lead to unsatisfiable classes in
Oi0 ∪ δ.
Example 2. Consider the axiom β9 from Table 1. Following the assumption of dis-
jointness a very naı̈ve algorithm would add disjointness axioms between Borehole,
Continuant and Occurrent, which would make Borehole unsatisfiable.
     For avoiding an extensive use of a costly OWL 2 reasoner, our method exploits the
propositional encoding and structural indexing of the input ontologies. Thus, checking
for unsatisfiabilities introduced by candidate disjointness axioms in Oi0 ∪ δ is restricted
to the Horn propositional case. We have implemented an algorithm to extend the propo-
sitional theories P1 and P2 with disjointness rules of the form A ∧ B → ⊥ (see lines
5-6 in Algorithm 1). This algorithm guarantees that, for every propositional variable A
in the extended propositional theory Pid (with i ∈ {1, 2}), the theory Pid ∪ {true → A}
is satisfiable. Note that this does not necessarily hold when considering the OWL 2
ontology modules, O10 and O20 , as discussed above.
     LogMap provides a structural index SI to check if two propositional variables (i.e.,
ontological classes) are disjoint (areDisj(SI, A, B)), they keep a sub/super-class rela-
tionship (inSubSupRel(SI, A, B)), or they share a descendant (shareDesc(SI, A, B)).
     Nonetheless, the massive addition of disjointness rules is, in general, prohibitive for
large ontologies [25]. Our key ingredient to achieve scalability is to focus only on the
cases where a basic violation occurs in the integrated ontology OU = O10 ∪ O20 ∪ M,
w.r.t. one of the ontology modules Oi0 (with i ∈ {1, 2}); i.e., adding a disjointness
axiom between each pair of classes A, B ∈ Oi0 s.t. A v B ∈ basicViol(Oi0 , OU ),
as in Definition 2. Algorithm 2 implements this idea for the Horn propositional case
and extensively exploits the structural indexing to identify the basic violations (line 2,
Algorithm 2). Note that this algorithm also requires to compute the structural index
of the integrated ontology and thus its classification with an OWL 2 reasoner (line 4,
Algorithm 1). The classification cost of the integrated ontology is usually much higher
than the classification of the input ontologies individually. However, this was not a
bottleneck in our experiments (see Section 5).
Algorithm 2 Disjointness axioms extension
Input: P: propositional theory; SI: structural index SIU : structural index of the union ontology
Output: P d : extended propositional theory;disj: number of disjointness rules
 1: disj := 0, P d := P
 2: for A → B ∈ BasicConservativityViolations(SI, SIU ) do
 3:     if not (areDisj(SI, A, B)) then
 4:         P d := P d ∪ {A ∧ B → f alse}
 5:         SI := updateIndex(SI, A u B → ⊥)
 6:         disj := disj + 1
 7:     end if
 8: end for
 9: return hP d , disji




Mapping Repair. Line 7 of Algorithm 1 uses the mapping (incoherence) repair algo-
rithm presented in [13]for the extended Horn propositional theories P1d and P2d , and the
input mappings M. The mapping repair process exploits the Dowling-Gallier algorithm
for propositional Horn satisfiability [4] and checks, for every propositional variable A ∈
P1d ∪P2d , the satisfiability of the propositional theory PA = P1d ∪P2d ∪M∪{true → A}.
Satisfiability of PA is checked in worst-case linear time in the size of PA , and the num-
ber of Dowling-Gallier calls is also linear in the number of propositional variables in
P1d ∪ P2d . The algorithm records the conflicting mappings involved in the unsatisfia-
bility, which will be considered for the subsequent repair process. The unsatisfiability
will be fixed by removing some of the identified mappings. In case of multiple options,
mapping confidence will be used as a differentiating factor.5
Example 3. Consider the propositional encoding P1 and P2 of the axioms of Table 1
and the mappings M in Table 2, seen as propositional rules. P1d and P2d have been
created by adding disjointness rules to P1 and P2 , according to Algorithm 2. For
example, P2d includes the rule ψ = O2 :Well ∧ O2 :Borehole → f alse. The map-
ping repair algorithm identifies the propositional theory P1d ∪ P2d ∪ M ∪ {true →
O1 :ExplorationWellbore} as unsatisfiable. This is due to the combination of the map-
pings m3 and m4 , the propositional projection of axioms β1 and β2 , and the rule ψ. The
mapping repair algorithm also identifies m3 and m4 as the cause of the unsatisfiability,
and discards m3 since its confidence is smaller than that of m4 (see Table 2).
    Algorithm 1 gives as output the number of added disjointness rules disj, a set of
mappings M0 , and an (approximate) repair R≈ such that M0 = M \ R≈ . M0 is
coherent w.r.t. P1d and P2d . Furthermore, the propositional theory P1 ∪ P2 ∪ M0 does
not contain any basic violation w.r.t. P1 and P2 (according to the propositional case of
Definition 2). However, our incomplete encoding cannot guarantee that O10 ∪ O20 ∪ M0
does not contain basic violations w.r.t. O10 and O20 . Nonetheless, our evaluation suggests
that the number of remaining violations after repair is typically small (see Section 5).

4.2     Repair of Equivalence Conservativity Principle Violations
In this section we describe CycleBreaker algorithm that detects equivalence violations
at the taxonomical level, and computes a minimal repair by means of a logic program 6 .
 5
     Note that the locality principle can be used to compute fresh confidence values, if needed [12].
 6
     Formal definitions and proofs are provided in an accompanying technical report [24].
Algorithm 3 CycleBreaker Algorithm
Input: Ontology O1 , Ontology O2 , Alignment M
Output: Repair ∆
 1: O M = alignOnto(O1 , O2 , M)
 2: G = createDigraph(O M )
 3: SCCSi = tarjan(G, Oi ),V      globalSCCs = tarjan(G)
 4: for S ∈ globalSCCs s.t. ¬ 2i=1 ΠOi (S) ∈ SCCsi do
 5:     ∆ ← ∆ ∪ ASP(S)
 6: end for
 7: return ∆




Algorithm Description. Algorithm 3 takes as input two ontologies O1 ,O2 , and an
alignment M between them. The aligned ontology OM is computed (line 1), and
createDigraph function (line 2) builds its graph representation, having its named con-
cepts as the set of vertices, and the subsumption axioms between them as (weighted)
arcs. Given that the aligned ontology is equivalent, here, to the classification of the
input ontologies plus the mappings, only inferred arcs between concepts of the same
input ontology may exist. Given that at least one cycle containing all the elements of
a strongly connected components (SCC) exists, cycle detection for a graph G0 can be
reduced to computing SCC(G0 ), the set of its SCCs. Tarjan’s algorithm [26] computes
the SCCs of the graph representations of the input ontologies (named local SCCs) and
of the aligned one (line 3). Note that not all the cycles leads to an equivalence violation.
We therefore distinguish between unsafe and safe cycles as those producing or not a
violation. For detecting all the unsafe cycles, it is sufficient to detect the problematic
SCCs, identified by the fact that at least one of the two projections on the input on-
tologies is not a local SCC (line 4). The ASP program of Listing 1.1 then computes a
minimal repair for each problematic SCC (line 5).
            Listing 1.1. ASP program computing a minimal repair for a problematic SCC.
r 0 : # domain v t x (X, O ) . # domain v t x (Y, P ) . # domain v t x ( Z , Q ) .
r 1 : r e a c h e s S a f e (X, Y) :− e d g e (X, Y, C , 0 ) , O=P , X! =Y .
r 2 : r e a c h e s S a f e (X, Z ) :− r e a c h e s S a f e (X, Y) , e d g e (Y, Z , C , 0 ) , O=P , P=Q, X! =Y, Y! =Z , X! =Z .
r 3 : r e a c h e s (X, Y) :− e d g e (X, Y, C ,M) , unremoved ( e d g e (X, Y, C ,M) ) , X! =Y .
r 4 : r e a c h e s (X, Z ) :− r e a c h e s (X, Y) , e d g e (Y, Z , C ,M) , unremoved ( e d g e (Y, Z , C ,M) ) , X! =Y,
         Y! =Z , X! =Z .
r 5 : unremoved ( e d g e (X, Y, C ,M) ) | removed ( e d g e (X, Y, C ,M) ) :− e d g e (X, Y, C ,M) , X! =Y .
r 6 : unremoved ( e d g e (X, Y, C , 0 ) ) :− e d g e (X, Y, C , 0 ) , X! =Y, O=P .
r 7 : u n s a f e C y c l e (Y) :− n o t r e a c h e s S a f e (Y, X) , r e a c h e s (Y, X) , r e a c h e s (X, Y) , O=P , X! =Y .
r 8 : :− u n s a f e C y c l e (Y ) .
r 9 : # m i n i m i z e [ removed ( e d g e (X, Y, C , 1 ) ) = C ] .

(r0) states that X, Y, Z are always vertices, (r1,r2) define transitive predicate reachesSaf e
expressing vertex reachability in the input ontologies (in isolation), (r3,r4) define tran-
sitive predicate reaches expressing vertex reachability in the aligned ontology (i.e.,
considering edges and unremoved mappings only), (r5) generates models, (r6) allows
mapping removal only, (r7) computes unsafe cycles in the graph, (r8) is a hard con-
straint forbidding the existence of unsafe cycles (r9) is a soft constraint that selects one
of the valid models having minimal total weight of removed mappings.
Repair Computation Using ASP. The input facts for the ASP program are the fol-
lowing kinds: (i) predicate vtx(X, O) encodes vertices, where X is a unique vertex id
(e.g., vertex label) and O ∈ {1, 2} is the index of the input ontology of the concept, (ii)
Algorithm 4 Conducted evaluation over the Optique and OAEI data sets
Input: O1 , O2 : input ontologies M: reference mappings for O1 and O2
 1: OU := O1 ∪ O2 ∪ M
 2: Store size of Sig(O1 ) (I), Sig(O2 ) (II) and M (III)
    Compute number of conservativity principle violations:
 3: basicViol := |basicViol(O1 , OU )| + |basicViol(O2 , OU )| (IV)                   . basic violations, as in Definition 2
 4: diff ≈ := |diff ≈                            ≈
                    Sig(O1 ) (O1 , OU )| + |diff Sig(O2 ) (O2 , OU )| (V)                  . general notion as in Section 2
 5: eqViol := |eqViol(O1 , OU )| + |eqViol(O2 , OU )| (VI)                     . equivalence violations, as in Definition 2
 6: Compute a repair R≈ using Algorithm 1 for O1 , O2 , M
 7: Store number of added disjointness disj (VII), time to compute disjointness rules td (VIII)
 8: Compute a repair RSCC using Algorithm 3 for O1 , O2 , M \ R≈
 9: Store size global time to compute the combined mapping repair tr (IX) and size of repair |RSCC | (X)
10: OU := O1 ∪ O2 ∪ M \ RSCC
    Compute number of remaining conservativity principle violations:
11: basicViol := |basicViol(O1 , OU )| + |basicViol(O2 , OU )| (XI)                                      . basic violations
12: diff ≈ := |diff ≈                            ≈
                    Sig(O1 ) (O1 , OU )| + |diff Sig(O2 ) (O2 , OU )| (XII)                                . general notion
13: eqViol := |eqViol(O1 , OU )| + |eqViol(O2 , OU )| (XIII)                                      . equivalence violations




predicate edge(X, Y, C, M ) encodes arcs, where X and Y are vertices, C is an integer
encoding for arc weight in [0 . . . 100], M is a boolean flag for mappings.

Example 4. The model (i.e., the computed repair) for the logic program on the follow-
ing facts, encoding the problematic SCC of Figure 1, is equal to mapping m7 of Table 2:
v t x ( Company1 , 1 ) . v t x ( O p e r a t o r 1 , 1 ) . v t x ( Company2 , 2 ) . v t x ( F i e l d o p e r a t o r 2 , 2 ) .
e d g e ( Company1 , Company2 , 9 0 , 1 ) . e d g e ( Company2 , Company1 , 9 0 , 1 ) .
e d g e ( Company2 , F i e l d o p e r a t o r 2 , 1 0 0 , 0 ) . e d g e ( F i e l d o p e r a t o r 2 , O p e r a t o r 1 , 7 0 , 1 ) .




5      Evaluation

In order to evaluate the practical feasibility of our approach, we have conducted the
evaluation in Algorithm 4 (Roman numbers refer to stored measurements) over the
Optique’s use case [25] and the ontologies and reference mapping sets of OAEI 2013.7
    Table 4 shows the size of the evaluated ontologies and mappings (I, II and III). For
the Conference dataset we have selected only 5 pair of ontologies for which the refer-
ence mappings lead to more than five conservativity principle violations. Note that we
count equivalence mappings as two subsumption mappings, and hence M represents
subsumption mappings. Table 4 also shows the conservativity principle violations for
the reference mappings (IV, V and VI). For LargeBio and Library the number is ex-
pecially large using both the basic variant and the general notion of the conservativity
principle.8 We have run the experiments shown in Table 5 on a desktop computer with
an AMD Fusion A6-3670K CPU and allocating 12 GB of RAM. The prototype uses
Clingo 3.0.59 ASP solver. The obtained results10 can be summarised as follows:
 7
    More information can be found in http://oaei.ontologymatching.org/2013/
 8
   No OWL 2 reasoner could classify the integrated SNOMED-NCI ontology. The OWL 2 EL
   reasoner ELK [14] computed a lower bound on the number of conservativity violations.
 9
   http://potassco.sourceforge.net/
10
   The computation times of lines 1-3 in Algorithm 1 were negligible with respect to the repair
   and disjointness addition times (tr and td ) and thus they were not included in the result tables.
               Table 4. Test cases and violations with original reference mappings.
                                               Problem size                             Original Violations
  Dataset        O1 ∼ O2                   I          II          III                IV           V         VI
                                      |Sig(O1 )| |Sig(O2 )|      |M|              basicViol    diff ≈     eqViol
  Optique        NPD∼BootsOnto           757       40,671          102              214         220        25
                 SNOMED∼NCI             51,179     24,040       36,405            >525,515    >546,181   >6,090
  LargeBio       FMA∼SNOMED             10,181     13,430       17,212            125,232      127,668   1,091
                 FMA∼NCI                3,720      6,551         5,821             19,740      19,799     427
  Anatomy        MO∼NCIAnat             2,747       3,306        3,032              1,321      1,335       26
  Library        STW∼TheSoz             6,575       8,376        6,322             42,045      42,872      895
                 cmt∼confof               89            75           32              11          11            0
                 conference∼edas         124           154           34              8            8            0
  Conference     conference∼iasted       124           182           28              9            9            0
                 confof∼ekaw              75           107           40              6            6            1
                 edas∼iasted             154           182           38              7            7            0


    Table 5. Results of our method to detect and solve conservativity principle violations.
                                          Solution size          Times                Remaining Violations
    Dataset       O1 ∼ O2                VII         X       VIII    IX                XI      XII      XIII
                                        #disj |RSCC |        td (s) tr (s)          basicViol diff ≈ eqViol
    Optique       NPD∼BootsOnto          214       44         2.22        1.94            0       0        0
                  SNOMED∼NCI           525,515   16,180       80          3,721        >51      >402      >2
    LargeBio      FMA∼SNOMED           125,232    8,369       30           269          0        33        0
                  FMA∼NCI              19,740     2,179       36           36          103       104       0
    Anatomy       MO∼NCIAnat            1,321     493         1.46        1.34            0       2        0
    Library       STW∼TheSoz            42,045   3,068        4.96         44             0       15       0
                  cmt∼confof              11       6          0.04        0.04            0       0        0
                  conference∼edas          8       6          0.07        0.07            0       0        0
    Conference    conference∼iasted        9       3          0.21        0.14            0       0        0
                  confof∼ekaw              6       5          0.04        0.03            0       0        0
                  edas∼iasted              7       4          0.21        0.16            1       1        0




  i The number of added disjointness rules disj (VII), as expected, coincides with the
    number of basic violations, that is computed in a reasonable time also for large
    testcases (it requires only 80 seconds to compute them in the SNOMED-NCI case).
 ii The computed repair RSCC (X) is rather aggressive and it can remove from 16%
    (Anatomy) up to 48% (Optique) of the mappings. However, we follow a better safe
    than sorry approach, and we prefer to remove as many violations as possible, rather
    than preserving potentially conflicting mapping sets.
iii Global repair time tr (IX) is small and it does not represent a bottleneck in spite of
    the large number of added disjointness rules and multiple applied techniques.
iv The basic violations (XI) are fully removed in the Optique, Anatomy and Library
    cases, and almost fully removed in the Conference and LargeBio ones.
 v The number of missed violations is only slightly higher when considering the gen-
    eral notion of the conservativity principle (XII), which suggests that basic variant is
    suitable in practice. Furthermore, these violations are also almost always removed.
vi Restricting to equivalence violations, they are almost completely removed by the
    combined approach (XIII). In order to evaluate the effectiveness of CycleBreaker,
    we also measured the number of unsolved equivalence violations after applying this
    repair algorithm, in isolation, on the input reference alignment. With the exception
    of SNOMED-NCI (failure rate of 0.9%), the violations are totally repaired.
6   Related Work
The conservativity principle, although indirectly, has been actively studied in the litera-
ture. For example, the assumption of disjointness was originally introduced by Schlobach
[22] to enhance the repair of ontologies that were underspecified in terms of disjoint-
ness axioms. As in our case, also [27, 17, 7], have focused on the addition of a small set
of disjointness axioms, for supporting large ontologies.
     Ontology matching systems have also dealt with the conservativity principle in or-
der to improve the precision (w.r.t. a reference alignment) of the computed mappings.
For example, ASMOV [9], Lily [28] and YAM++ [19] implements different heuristics
and patterns to minimise the violations of the conservativity principle. Lily, in particu-
lar, supports an incomplete detection of the equivalence violations, but lacks of a repair
technique. Our basic method solves conservativity violations by reducing the problem
to a consistency principle violation problem. Concretely, we have extended LogMap
matcher [10] but other mapping repair systems could be considered (e.g., Alcomo [16]
or AML [21]). Note that these systems only focused on the consistency principle.
     The work presented in [8] follows an opposite approach with respect to ours, and
considers conservativity violations as false positives, caused by the potential incom-
pleteness of the input ontologies. Hence, the correction strategy, instead of removing
mappings, aims at inserting subsumption axioms to the input ontologies, to enrich their
concept hierarchies. Authors in [20] also suggest that fixing the input ontologies, in a
mapping repair process, could be an alternative to mapping removal.


7   Conclusions and Future Work
In this paper we have presented a fully-automatic multi-strategy method to detect and
correct conservativity principle violations in practice. We have extended the detect and
repair algorithm for basic violations [25] with a repair algorithm based on logic pro-
gramming, tailored for equivalence violations. The conducted evaluation supports the
practical effectivity of our incomplete method. We plan to extend the approach while
keeping the current scalability properties. The proposed algorithms follow a “better safe
than sorry” approach, suitable for scenarios where the input ontologies are not modifi-
able (e.g., fully automatic ontology matching systems). Nevertheless we do not discard
to explore alternative methods to address the conservativity principle violations. We
also plan to involve domain experts in the assessment of the additional disjointness [7],
and to suggest extensions to the input ontologies [8] for violations recognised as false
positives. We will also evaluate the mappings computed by systems participating at
OAEI, and the integration of our techniques into LogMap matcher.


References
 1. Bodenreider, O.: The Unified Medical Language System (UMLS): integrating biomedical
    terminology. Nucleic Acids Research 32, 267–270 (2004)
 2. Borgida, A., Serafini, L.: Distributed Description Logics: Assimilating Information from
    Peer Sources. J. Data Sem. 1, 153–184 (2003)
 3. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular Reuse of Ontologies: The-
    ory and Practice. J. Artif. Intell. Res. 31, 273–318 (2008)
 4. Dowling, W.F., Gallier, J.H.: Linear-Time Algorithms for Testing the Satisfiability of Propo-
    sitional Horn Formulae. J. Log. Prog. 1(3), 267–284 (1984)
 5. Euzenat, J., Meilicke, C., Stuckenschmidt, H., Shvaiko, P., Trojahn, C.: Ontology Alignment
    Evaluation Initiative: Six Years of Experience. J. Data Sem. 15, 158–192 (2011)
 6. Faber, W.: Answer Set Programming. In: Reasoning Web. pp. 162–193 (2013)
 7. Ferré, S., Rudolph, S.: Advocatus Diaboli - Exploratory Enrichment of Ontologies with Neg-
    ative Constraints. In: Int’l Conf. on Knowl. Eng. (EKAW). pp. 42–56 (2012)
 8. Ivanova, V., Lambrix, P.: A Unified Approach for Aligning Taxonomies and Debugging Tax-
    onomies and their Alignments. In: Eur. Sem. Web Conf. (ESWC), pp. 1–15. Springer (2013)
 9. Jean-Mary, Y.R., Shironoshita, E.P., Kabuka, M.R.: Ontology Matching With Semantic Ver-
    ification. J. Web Sem. 7(3), 235–251 (2009)
10. Jiménez-Ruiz, E., Cuenca Grau, B.: LogMap: Logic-based and Scalable Ontology Matching.
    In: Int’l Sem. Web Conf. (ISWC). pp. 273–288 (2011)
11. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Ontology integration using
    mappings: Towards getting the right logical consequences. In: Eur. Sem. Web Conf. (2009)
12. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Logic-based Assessment of
    the Compatibility of UMLS Ontology Sources. J. Biomed. Semant. 2(Suppl 1), S2 (2011)
13. Jiménez-Ruiz, E., Meilicke, C., Cuenca Grau, B., Horrocks, I.: Evaluating Mapping Repair
    Systems with Large Biomedical Ontologies. In: Description Logics. pp. 246–257 (2013)
14. Kazakov, Y., Krötzsch, M., Simancik, F.: Concurrent Classification of EL Ontologies. In:
    Int’l Sem. Web Conf. (ISWC). pp. 305–320 (2011)
15. Konev, B., Walther, D., Wolter, F.: The Logical Difference Problem for Description Logic
    Terminologies. In: Int’l Joint Conf. on Automated Reasoning (IJCAR). pp. 259–274 (2008)
16. Meilicke, C.: Alignments Incoherency in Ontology Matching. Ph.D. thesis (2011)
17. Meilicke, C., Völker, J., Stuckenschmidt, H.: Learning Disjointness for Debugging Mappings
    between Lightweight Ontologies. In: Int’l Conf. on Knowl. Eng. (EKAW). pp. 93–108 (2008)
18. Motik, B., Shearer, R., Horrocks, I.: Hypertableau Reasoning for Description Logics. J. Artif.
    Intell. Res. (JAIR) 36, 165–228 (2009)
19. Ngo, D., Bellahsene, Z.: YAM++ : A Multi-strategy Based Approach for Ontology Matching
    Task. In: Int’l Conf. on Knowl. Eng. (EKAW). pp. 421–425 (2012)
20. Pesquita, C., Faria, D., Santos, E., Couto, F.M.: To repair or not to repair: reconciling correct-
    ness and coherence in ontology reference alignments. In: Ontology Matching (OM) (2013)
21. Santos, E., Faria, D., Pesquita, C., Couto, F.: Ontology Alignment Repair Through Modular-
    ization and Confidence-based Heuristics. arXiv:1307.5322 preprint (2013)
22. Schlobach, S.: Debugging and Semantic Clarification by Pinpointing. In: Eur. Sem. Web
    Conf. (ESWC), pp. 226–240. Springer (2005)
23. Shvaiko, P., Euzenat, J.: Ontology Matching: State of the Art and Future Challenges. IEEE
    Transactions on Knowl. and Data Eng. (TKDE) (2012)
24. Solimando, A., Guerrini, G.: Coping with Conservativity Principle Violations in Ontology
    Mappings. Tech. Rep. DIBRIS-TR-14-01, University of Genova (Jan 2014), ftp://ftp.
    disi.unige.it/person/SolimandoA/cycleMappingDbgExt.pdf
25. Solimando, A., Jiménez-Ruiz, E., Guerrini, G.: Detecting and Correcting Conservativity
    Principle Violations in Ontology-to-Ontology Mappings. In: Int’l Sem. Web Conf. (2014),
    ftp://ftp.disi.unige.it/person/SolimandoA/conservLogMap.pdf
26. Tarjan, R.: Depth-first Search and Linear Graph Algorithms. SIAM J. Comp. 1(2) (1972)
27. Völker, J., Vrandecic, D., Sure, Y., Hotho, A.: Learning Disjointness. In: Eur. Sem. Web
    Conf. (ESWC). pp. 175–189 (2007)
28. Wang, P., Xu, B.: Debugging Ontology Mappings: A Static Approach. Computing and In-
    formatics 27(1), 21–36 (2012)