=Paper= {{Paper |id=None |storemode=property |title=OntoMerge: A System for Merging DL-Lite Ontologies |pdfUrl=https://ceur-ws.org/Vol-969/paper2.pdf |volume=Vol-969 }} ==OntoMerge: A System for Merging DL-Lite Ontologies== https://ceur-ws.org/Vol-969/paper2.pdf
    OntoMerge: A System for Merging DL-Lite
                  Ontologies

            Zhe Wang1 , Kewen Wang2 , Yifan Jin2 , and Guilin Qi3,4
                      1
                       University of Oxford, United Kingdom
                         2
                            Griffith University, Australia
                          3
                            Southeast University, China
              4
                State Key Laboratory for Novel Software Technology
                       Nanjing University, Nanjing, China



      Abstract. Merging multi-sourced ontologies in a consistent manner is
      an important and challenging research topic. In this paper, we propose
      a novel approach for merging DL-LiteN    bool ontologies by adapting the
      classical model-based belief merging approach, where the minimality of
      changes is realised via a semantic notion, model distance. Instead of using
      classical DL models, which may be infinite structures in general, we
      define our merging operator based on a new semantic characterisation
      for DL-Lite. We show that subclass relation w.r.t. the result of merging
      can be checked efficiently via a QBF reduction. We present our system
      OntoMerge, which effectively answers subclass queries on the resulting
      ontology of merging, without first computing the merging results. Our
      system can be used for answering subclass queries on multiple ontologies.


1   Introduction
Ontologies are widely used for sharing and reasoning over domain knowledge,
and their underlying formalisms are often description logics (DLs). To effec-
tively answer queries, ontologies from heterogeneous sources and contributed
by various authors are often needed. However, ontologies developed by multiple
authors under different settings may contain overlapping, conflicting and inco-
herent domain knowledge. The ultimate goal of ontology merging is to obtain
a single consistent ontology that preserves as much knowledge as possible from
two or more heterogeneous ontologies. This is in contrast to ontology matching
[5], whose goal is to align entities (with different name) between ontologies, and
which is often a pre-stage of ontology merging.
     Existing merging systems often adopt formula-based approaches to deal with
logical inconsistencies [10; 9; 14]. Most of such approaches can be described as
follows: the system first combine the ontologies by taking their union; then, if any
inconsistency is detected (through a standard reasoning), it pinpoints the axioms
which (may) cause inconsistency; and finally, remove certain axioms to retain
consistency. However, such an approach is sometimes unsatisfactory because it
is not fine-grained either in the way it measures the minimality of changes, and
thus it is often unclear how close the result of merging is to the source ontologies
semantically; or in the way it resolve inconsistency. In [12], an attempt is made
to provide some semantic justification for the minimality of changes, however,
the result of merging is still syntax-dependant and is often a set of ontologies.
    On the other hand, model-based merging operators have been intensively
studied in propositional logic, which are syntax-independent and usually satisfy
more rationality postulates than formula-based ones. However, a major chal-
lenge in adapting model-based merging techniques to DLs is that DL models
are generally infinite structures and the number of models of a DL ontology is
infinite. Several notions of model distance are defined on classical DL models
for ontology revision [13]. Mathematically, it is possible to define a distance o
classical DL models. Such a distance is computationally limited as it is unclear
how to develop an algorithm for the resulting merging operator. A desirable so-
lution is to define ontology merging operators based on a suitable finite semantic
characterisation instead of classical DL models.
    In this paper, we focus on merging ontologies expressed as DL-Lite TBoxes,
which can be also accompanied with the ontology-based data access (OBDA)
framework for data integration [2]. We propose a novel approach for merging
ontologies by adapting a classical model-based belief merging approach, where
the minimality of changes is realised via a semantic notion, model distance.
Instead of using classical DL models, which may be infinite structures in general,
we define our merging operator based on the notion of types. We show that
subclass relation w.r.t. the result of merging can be checked efficiently via a
QBF reduction, which allows us to make use of the off-the-shelf QBF solvers [8].
We present our system OntoMerge, which effectively answers subclass queries
on merging results, without first computing the merging results. Our system can
be used to answer subclass queries on multiple ontologies.

2   A New Semantic Characterisation
In our approach, it is sufficient to consider a finite yet large enough signature.
A signature S is a union of four disjoint finite sets SC , SR , SI and SN , where
SC is the set of atomic concepts, SR is the set of atomic roles, SI is the set of
individual names and SN is the set of natural numbers in S. We assume 1 is
always in SN .
     Formally, given a signature S, a DL-LiteN
                                             bool language has the following syntax
[1]:
     R ← P | P−               S ← P | ¬P
     B←>|A| >nR               C ← B | ¬C | C1 u C2
where n ∈ SN , A ∈ SC and P ∈ SR . B is called a basic concept and C is called
a general concept. BS denotes the set of basic concepts on S. We write ⊥ for
¬>, ∃R for ≥ 1 R, and C1 t C2 for ¬(¬C1 u ¬C2 ). Let R+ = P , where P ∈ SR ,
whenever R = P or R = P − . A TBox T is a finite set of concept axioms of the
form C1 v C2 , where C1 and C2 are general concepts. An ABox A is a finite set
of membership assertions of the form C(a) or S(a, b), where a, b are individual
names. In this paper, an ontology is represented as a DL TBox.
    The classical DL semantics are given by models. A TBox T is consistent
with an ABox A if T ∪ A has at least one model. A concept or role is satisfiable
in T if it has a non-empty interpretation in some model of T . A TBox T is
coherent if all atomic concepts and atomic roles in T are satisfiable. Note that
a coherent TBox must be consistent. TBox T entails an axiom C v D, written
T |= C v D, if all models of T satisfy C v D. Two TBoxes T1 , T2 are equivalent,
written T1 ≡ T2 , if they have the same models.
    Now, we introduce a semantic characterisation for DL-Lite TBoxes in terms
of types. A type τ ⊆ BS is a set of basic concepts over S, such that > ∈ τ , and
> n R ∈ τ implies > m R ∈ τ for each pair m, n ∈ SN with m < n and each
(inverse) role R ∈ SR ∪ { P − | P ∈ SR }. Type τ satisfies basic concept B if
B ∈ τ , ¬C if τ does not satisfy C, and C1 u C2 if τ satisfies both C1 and C2 .
Given a TBox T , type τ satisfies T if τ satisfies concept ¬C1 t C2 for each axiom
C1 v C2 in T .
    For a TBox T , define TM(T ) to be the maximal set of types satisfying
the following conditions: (1) all the types in TM(T ) satisfy T ; (2) for each
type τ ∈ TM(T ) and each ∃R in τ , there exists a type τ 0 ∈ TM(T ) (possibly
τ 0 = τ ) containing ∃R− . A type τ is called a type model (T-model) of T if
τ ∈ TM(T ). Note that TM(T ) is uniquely defined for each TBox T . Note that
for a coherent TBox T , TM(T ) is exactly the set of all types satisfying T . Let
TM(Π) = TM(T1 ) × · · · × TM(Tn ) for Π = hT1 , . . . , Tn i.

Proposition 1. Given a TBox T , we have the following results:
 – T is consistent iff TM(T ) 6= ∅.
 – For a general concept C, C is satisfiable wrt T iff there exists a T-model in
   TM(T ) satisfying C.
 – For two general concepts C, D, T |= C v D iff either TM(T ) = ∅ or all
   T-models in TM(T ) satisfy C v D.
 – T ≡ T 0 iff TM(T ) = TM(T 0 ), for any TBox T 0 .

    Given a type τ , an individual a and an ABox A, we say τ is a type of a
w.r.t. A if there is a model I of A such that τ = {B | aI ∈ B I , B ∈ BS }. For
example, given A = {A(a), ¬B(b), C(c)}, type τ = {A, B} is a type of a, but
not a type of either b or c in A. For convenience, we will say a type of a when
the ABox A is clear from the context. Let TMa (A) be the set of all the types
of a in A if a occurs in A; and otherwise, TMa (A) be the set of all the types.
A set M of T-models satisfies an ABox A if there is a type of a in M , i.e.,
M ∩ TMa (A) 6= ∅, for each individual a in A.
Proposition 2. Given a TBox T and an ABox A, T ∪ A is consistent iff
TM(T ) ∩ TMa (A) 6= ∅ for each a in A.


3   Merging Operator
In this section, we introduce an approach to merging DL-Lite ontologies to obtain
a coherent unified ontology.
    An ontology profile is of the form Π = hT1 , . . . , Tn i, where Ti is the ontology
from the source n.o. i (1 ≤ i ≤ n). There are two standard definitions of integrity
constraints (ICs) in the classical belief change literature [3], the consistency-
and entailment-based definitions. We also allow two types of ICs for merging,
namely the consistency constraint (CC), expressed as a set Ac of data, and the
entailment constraint (EC), expressed as a TBox Te . We assume the IC is self-
consistent, that is, Te ∪ Ac is always consistent. For an ontology profile Π, a
CC Ac and a EC Te , an ontology merging operator is a mapping (Π, Te , Ac ) 7→
∇(Π, Te , Ac ), where ∇(Π, Te , Ac ) is a TBox, s.t. ∇(Π, Te , Ac )∪Ac is consistent,
and ∇(Π, Te , Ac ) |= Te .
    In classical model-based merging approaches, merging operators are often
defined by certain notions of model distances [11; 6]. We use S4S 0 to denote the
symmetric difference between two sets S and S 0 , i.e., S4S 0 = (S \ S 0 ) ∪ (S 0 \ S).
Given a set S and a tuple S = hS1 , . . . , Sn i of sets, the distance between S and
S is defined to be a tuple d(S, S) = hS4S1 , . . . , S4Sn i. For two n-element dis-
tances d and d0 , d  d0 if di ⊆ d0i for each 1 ≤ i ≤ n, where di is the i-th element
in d. Given two sets S and S 0 , define σ(S, S 0 ) = S if S 0 is empty, and otherwise,
σ(S, S 0 ) = { e0 ∈ S | ∃e00 ∈ S 0 s.t. ∀e ∈ S, ∀e0 ∈ S 0 , d(e, e0 ) 6≺ d(e0 , e00 ) }. In [6],
given a collection Ψ = {ϕ1 , . . . , ϕn } of propositional formulas, and some ECs
expressed as a propositional theory µ, the result of merging ϕ1 , . . . , ϕn w.r.t. µ
is the theory whose models are exactly σ(mod(µ), mod(Ψ )), i.e., those models
satisfying µ and having minimal distance to Ψ .
    Inspired by classical model-based merging, we introduce a merging opera-
tor in terms of T-models. For an ontology profile Π and an EC Te , we could
define the T-models of the merging to be a subset of TM(Te ) (so that Te is
entailed) consisting of those T-models which have minimal distance to Π, i.e.,
σ(TM(Te ), TM(Π)). However, this straightforward adoption does not take the
CC into consideration, and the merging result obtained in this way may not
be coherent. For example, let T1 = {A v ¬B}, T2 = {> v B}, Te = ∅, and
Ac = {A(a), B(a)}. Then, σ(TM(Te ), TM(hT1 , T2 i)) consists of only one type
{B}. Clearly, the corresponding TBox {A v ⊥, > v B} does not satisfy the CC,
and it is not coherent.
    Note that in the above example, once the merging result satisfies the CC, then
it is also coherent, because both concepts A and B are satisfiable. In general,
it is also the case that coherency can be achieved by applying certain CC to
merging. We introduce an auxiliary ABox A† in addition to the initial CC Ac ,
in which each concept and each role is explicitly asserted with a member. That
is, A† = {A(a) | A ∈ SC , a ∈ SI is a fresh individual for A} ∪ {P (b, c) | P ∈
SR , b, c ∈ SI are fresh individuals for P }. As assumed, SI is large enough for
us to take these auxiliary individuals. From the definition of CCs, the merged
TBox T must be consistent with all the assertions in A† , which assures all the
concepts and roles in T to be satisfiable. Based on this observation, we have the
following lemma.
Lemma 1. T is coherent iff T ∪ A† is consistent for any TBox T .
To ensure the coherence of merging, we only need to include A† into the CC.
    For the merging to be consistent with the CC Ac , from Proposition 2, the
T-model set M of the merging needs to satisfy Ac . That is, M needs to contain a
type of a for each individual a in Ac . However, σ(TM(Te ), TM(Π)) does not nec-
essarily satisfy this condition, as can be seen from the above example: TMa (Ac )
consists of a single type {A, B} and σ(TM(Te ), TM(Π)) ∩ TMa (Ac ) = ∅. Intu-
itively, for the merging to satisfy the CC, type {A, B} need to be added to the
T-models of merging. In general, the T-models of merging can be obtained by
extending (if necessary) the set σ(TM(Te ), TM(Π)) with at least one type of a
w.r.t. Ac for each individual a in Ac , and if there are multiple such types, choose
those with minimal distances.
    Based on the above intuitions, the definition of TBox merging is presented
as follows.

Definition 1. Let Π be an ontology profile, Te be a TBox, and Ac be an ABox.
Denote A∗ = Ac ∪ A† . The result of merging Π w.r.t. the EC Te and the CC
Ac , denoted ∇(Π, Te , Ac ), is defined as follows

               TM(∇(Π, Te , Ac )) = σ(TM(Te ), TM(Π))∪
                       [
                                 σ(TM(Te ) ∩ TMa (A∗ ), TM(Π)).
                      a occurs in A∗


From the definition, the T-models of the merging are constituted with two parts.
The first part consists of those T-models of Te (for the satisfaction of the EC)
with minimal distances to Π. The second part consists of types of a, for each
individual a in A∗ , which are added to the first part for the satisfaction of the CC.
These types are also required to be T-models of Te and have minimal distances
to Π. It is clear from Proposition 1 that the result of merging is unique up to
TBox equivalence.


4    QBF Reduction

In this section, we consider a standard reasoning problem for ontology merg-
ing, namely the subclass queries: whether or not the result of merging entails a
subclass relation C v D. We present a QBF reduction for this problem, which
allows us to make use of the off-the-shelf QBF solvers [8]. We assume that every
TBox in the ontology profile is coherent, and in this case, the T-models of a
TBox T are exactly those satisfying T .
    We achieve the reduction in three steps. Firstly, we introduce a novel proposi-
tional transformation for DL-Lite TBoxes. The transformation is inspired by [1],
which contains a transformation from a DL-Lite TBox into a theory in the one
variable fragment of first order logic. Considering T-models instead of classical
DL models allows us to obtain a simpler transformation to propositional logic
than theirs to first order logic.
   Function φ(·) maps a basic concept to a propositional variable, and a general
concept (resp., a TBox axiom) to a propositional formula.

           φ(⊥) = ⊥,      φ(A) = pA ,     φ(> n R) = pnR ,
           φ(¬C) = ¬φ(C),                φ(C1 u C2 ) = φ(C1 ) ∧ φ(C2 ),
           φ(C1 v C2 ) = φ(C1 ) → φ(C2 ).

Here, pA and pnR are propositional variables. We use VS to denote the set
of propositional variables corresponding to the basic concepts over S, and we
omit the subscript S in what follows for simplicity. We can see that φ(·) is a
bijective mapping between the set of DL-LiteN bool general concepts and the set of
propositional formulas only referring to symbols in VS and boolean operators ¬,
∧ and →.
    Naturally, given the mapping φ(·), an arbitrary propositional model may not
correspond to a type. We define a formula η whose models are exactly the set of
types. Let              ^              ^
                  η=                                pnR → pmR .
                      R+ ∈SR     m,n∈SN with m