<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Incoherence as a Basis for Measuring the Quality of Ontology Mappings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christian Meilicke</string-name>
          <email>christian@informatik.uni-mannheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heiner Stuckenschmidt</string-name>
          <email>heiner@informatik.uni-mannheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Institute University of Mannheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Traditionally, the quality of ontology matching is measured using precision and recall with respect to a reference mapping. These measures have at least two major drawbacks. First, a mapping with acceptable precision and recall might nevertheless suffer from internal logical problems that hinder a sensible use of the mapping. Second, in practical situations reference mappings are not available. To avoid these drawbacks we introduce quality measures that are based on the notion of mapping incoherence that can be used without a reference mapping. We argue that these measures are a reasonable complement to the well-known measures already used for mapping evaluation. In particular, we show that one of these measures provides a strict upper bound for the precision of a mapping.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Assessing the quality of alignments is an important aspect of ontology matching. A
number of different measures have been proposed for this purpose. According to [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] it
can be distinguished between compliance measures, measures concerned with system
usability, and performance measures that focus on runtime or memory requirements.
A compliance measure compares a set of correspondences with a gold standard which
should be the complete set of all correct correspondences. The most prominent
compliance measures are precision and recall which have been adapted from information
retrieval to the field of schema and ontology matching [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. As complement to these
measures we propose a family of measures based on the definition of mapping
incoherence. These measures do not fall in one of the above mentioned categories, but should
be categorized as formal or logic-based measures.
      </p>
      <p>Compared to the widely used measures of precision and recall, the measures that
will be proposed in this paper do not rely on the existence of a gold standard (also
referred to as reference mapping). Contrary to this, they measure internal properties of a
mapping based on the semantics of the ontologies aligned via the mapping. This makes
our approach applicable in matching scenarios where we do not have a gold standard.
Measuring the incoherence of a mapping is motivated by the idea that the incoherence of
a mapping will hinder its sensible use even though it might contain a significant amount
of correct correspondences. Although we introduce incoherence as a new dimension for
quantifying mapping quality, there is a non-obvious relation to traditional measures. In
particular, we show that we can use one of the suggested measures to compute a strict
upper bound for the precision of a mapping. This result is surprising at first sight and
shows the significance of the overall approach.</p>
      <p>In the following section we discuss related work on quality measures for mappings
and explain how our approach extends existing work. In section 3 we recall and refine
the theory of mapping incoherence as basis for the following sections. Before
introducing incoherence measures, we discuss some problems caused by mapping incoherence
which justify the importance of measuring incoherence (section 4). In particular, we
explain the effects of incoherence with respect to the application scenario of data
transformation and query processing. In section 5 we introduce four incoherence measures
divided in two groups. Measures of the first group are concerned with the impact of
incoherence, while measures of the second group are used to measure the effort of
repairing an incoherent mapping. Finally, we show that one of these measures can be
used to compute a strict upper bound for mapping precision (section 6) followed by
some concluding remarks (section 7).
2</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Several suggestions have been made to extend and introduce new evaluation measures
to the field of ontology matching. In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] Ehrig and Euzenat introduce relaxed precision
and recall. Their work is motivated by the idea that a correspondence of a mapping M
might not be totally incorrect even though it is not contained in reference mapping R.
Thus, it can be measured how close a correspondence is to a similar one in R. Amongst
others they suggest to measure the correction effort to transform such a correspondence
into a correct one. We pick up this idea and suggest measuring incoherence based on
the effort necessary to remove all causes of incoherence from M. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] Euzenat
introduces semantic precision and recall. These measures are based, roughly speaking, on
comparing the (bounded) deductive closure of R and M instead of a direct
comparison. Such an approach requires the use of logical reasoning where both correspondences
and ontologies are considered. While Euzenat focuses on the entailment of
correspondences, our approach accounts that certain combinations of correspondences result in
incoherence.
      </p>
      <p>
        Measuring mapping incoherence is closely related to measuring and repairing
ontology incoherence. Thus, we adapted some of the measures defined by Qi and Hunter
in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Later we will show how to reduce the incoherence of a mapping M to
concept unsatisfiability in an ontology that results from merging the ontologies matched
via M. An obvious way of measuring mapping incoherence is thus based on counting
the number of unsatisfiable concepts in the merged ontology. As proposed in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], it can
be distinguished between root and derived unsatisfiable concepts. Accordant to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
we pick up this distinction and distinguish between two types of measuring the impact
of incoherence based on concept unsatisfiability.
      </p>
      <p>
        In previous work [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] we have developed and tested strategies to repair incoherent
ontology mappings.1 These strategies rely on discarding individual correspondences
from an incoherent mapping M to finally arrive at a coherent submapping M∗ ⊆ M.
1 Notice that in previous work we misleadingly used the notion of inconsistency instead of
incoherence. Precise definitions of these notions are given in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Clearly, such a strategy should remove a minimal set of correspondences. This approach
leads to an incoherence measure that indicates the effort of repairing incoherent
mappings in numbers of correspondences to be removed. Moreover, it has been emphasized
that the confidence value of a correspondence plays an important role in mapping
repairing. Thus, we distinguish between a cardinality based measure and a confidence
based measure. Both measures quantify the effort of revising an incoherent mapping.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Foundations</title>
      <p>
        According to Euzenat and Shvaiko [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] a correspondence can be defined as a semantic
relation between ontological entities annotated with a confidence value.
      </p>
      <sec id="sec-3-1">
        <title>Definition 1 (Correspondence and Mapping). Given ontologies O1 and O2, let Q be</title>
        <p>a function that defines sets of matchable elements Q(O1) and Q(O2). A correspondence
between O1 and O2 is a 4-tuple he, e0, r, ni such that e ∈ Q(O1) and e0 ∈ Q(O2), r
is a semantic relation, and n is a confidence value from a suitable structure hD, 6i. A
mapping between O1 and O2 is a set of correspondences between O1 and O2.</p>
        <p>Definition 1 allows to capture a wide class of correspondences by varying what is
admissible as matchable element, semantic relation, and confidence value. In this work
we consider correspondences between named concepts and properties. We also restrict
correspondences to match entities of the same type, i.e. both e and e0 have to be concepts
or both have to be properties. We also restrict r to be ≡, v or w, i.e. we only focus on
equivalence and subsumption correspondences. Finally, we assume that the confidence
value, which can be seen as a measure of trust in the fact that the correspondence holds,
is represented numerically on D = [0.0, 1.0].</p>
        <p>
          As argued in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] the semantics of a mapping M between ontologies O1
and O2 can be defined in the context of merging O1 and O2 via M. In this section we
focus on technical aspects and postpone the discussion on adequacy and implications to
the next section. A merged ontology contains the axioms of O1 and O2 as well as the
correspondences of M translated into axioms of the merged ontology.
Definition 2 (Merged ontology). Let O1 and O2 be ontologies (finite sets of axioms).
The merged ontology O1 ∪Mt O2 of O1 and O2 connected by M is defined as O1 ∪Mt
O2 = O1 ∪ O2 ∪ {t(x) | x ∈ M} with t being a translation function that maps
correspondences to axioms.
        </p>
        <p>Notice that in O1 and O2 a concept or property might have the same local name. To
refer without ambiguity in the context of a merged ontology to an entity e which origins
from Oi we use prefix notation i#e in the following. There is a straightforward way to
translate concept correspondences and property correspondences into DL axioms. We
refer to the corresponding translation function as natural translation tn.
Definition 3 (Natural Translation). Given correspondence c = h1#e, 2#e0, r, ni
between ontologies O1 and O2, the natural translation tn of c is defined as
tn(c) 7→
 1#e ≡ 2#e0 if r =≡
 1#e v 2#e0 if r =v
 1#e w 2#e0 if r =w</p>
        <p>Now let us briefly recall the notion of ontology incoherence. An ontology O is
incoherent, iff there exist an unsatisfiable concept in O. Analogous, a mapping M
between O1 and O2 is called incoherent due to t, if there exists an unsatisfiable concept
i#Ci∈{1,2} in O1 ∪Mt O2 that is satisfiable in Oi. If there exists such a concept, its
unsatisfiability must have (at least partially) been caused by M.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 4 (Incoherence of a Mapping). Given a mapping M between ontologies</title>
        <p>O1 and O2 and a translation function t. If there exists a concept i#C with i ∈ {1, 2}
such that O1 ∪Mt O2 |= ⊥ w i#C and Oi 6|= ⊥ w i#C then M is incoherent with
respect to O1 and O2 due to t. Otherwise M is coherent with respect to O1 and O2 due
to t.</p>
        <p>Obviously, the incoherence of a mapping is strongly affected by our choice of t. In the
following we use t = tn as translation function. Notice that it is possible to define an
alternative translation function. In particular, it will turn out that the measures introduced
in section 5 are independent of this choice with respect to their applicability, although
their results are affected by the choice of the translation function.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>The Importance of Mapping Coherence</title>
      <p>
        One might argue, that mapping coherence is only important in a very specific
application scenario like reasoning in a merged ontology. In the following we show that
incoherence has a negative effect on a wide range of relevant applications. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] four
different purposes of using ontology mappings have been distinguished. A more
finegrained distinction has been proposed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], but most of these scenarios can be
subsumed under one of these use cases.
      </p>
      <p>– Frameworks. Mappings are described in frameworks on an abstract level
independent of an intended use.
– Terminological Reasoning. Mappings are used to perform reasoning across aligned
ontologies.
– Data Transformation. Data from one ontology is transferred into the terminology
of another ontology based on the knowledge encoded in a mapping.
– Query Processing. Queries formulated with respect to a certain ontology are
translated into the terminology of a different ontology.</p>
      <p>The Frameworks use case is about describing mappings on an abstract level. Since
we try to argue for the applicability of our approach in a practical context, it is of minor
interest and will not be discussed. It is obvious that incoherence is undesirable in the
Terminological Reasoning case as incoherence will lead to inconsistency of the whole
ontology when instances are added to unsatisfiable concepts. Inconsistency, however,
disables meaningful reasoning as everything can be derived from an inconsistent
ontology. It is less obvious that coherence is important for the Data Transformation and
Query Processing use cases. In the following, we show that an incoherent mapping will
lead to serious errors in the context of data translation and query processing.
4.1</p>
      <sec id="sec-4-1">
        <title>Data Transformation</title>
        <p>To better understand the effects of incoherence in the context of Data Transformation
let us consider the following example. Suppose there are two companies C1 and C2.
Both use different ontologies, say O1 and O2, to describe human resources and related
topics. Now it happens that C2 takes over C1. C2 decides to migrate all instance data of
O1 into O2. O1 will no longer be maintained. A terminological mapping M between
O1 and O2 has to be created to migrate the instances of O1 to O2 in a fully automated
way.</p>
        <p>Fragments of ontologies O1 and O2 are depicted in figure 1. We refer to these
fragments throughout the whole section without explicitly mentioning it in the following.
Data Transformation can be roughly described as the following procedure.
1. For all instances a of O1 create a copy a0 of this instance in O2.
2. For all concept correspondences h1#e, 2#e0, r, ni ∈ M with r ∈ {≡, v} and for
all instances a with O1 |= 1#e(a) add axiom 2#e0(a0) to O2.
3. For all property correspondences h1#e, 2#e0, r, ni ∈ M with r ∈ {≡, v} and for
all instances a, b with O1 |= 1#e(a, b) add axiom 2#e0(a0, b0) to O2.
In the following we refer to the ontology resulting from migrating instances from Oi
to Oj based on mapping M as Oj ∪ M(Oi). At first sight, mapping coherence seems
to be irrelevant with respect to this use case, because we do not copy any of the
terminological axioms into Oj . Consider the following correspondences to understand why
this impression is deceptive.
(1)
(2)
(3)
(4)
h1#Person, 2#Person, ≡, 1.0i
h1#ProjectLeader , 2#Project , v, 0.6i
Let now mapping M contain correspondence (1) and (2). M is incoherent, due to the
fact that in O1 ∪Mt O2 concept 1#ProjectLeader becomes unsatisfiable. Concepts
2#Project and 2#Person are disjoint and due to M concept 1#ProjectLeader is
subsumed by both of them, resulting in its unsatisfiability. Suppose now, there exists an
instance a with 1#ProjectLeader (a). Applying the migration rules results in a new
instance a0 with 2#Project (a0) and 2#Person(a0). Due to the disjointness of 2#Project
and 2#Person there exists no model for O2 ∪ M(O1) and thus O2 ∪ M(O1) is an
inconsistent ontology. Opposed to our first impression there seems to be a tight link
between the incoherence of M and the inconsistency of O2 ∪ M(O1).</p>
        <p>Contrary to this, mappings can be constructed, where such a direct link cannot be
detected. Let M, for example, contain correspondences (3) and (4). M is incoherent
due to the unsatisfiability of 2#ProductLine in the merged ontology.</p>
        <p>h1#Deadline, 2#TimedEvent , v, 0.9i
h1#ProjectDeadline, 2#ProductLine, w, 0.7i
Now we have to acknowledge that both O2 ∪ M(O1) and O1 ∪ M(O2) do not become
inconsistent. But what happens if we first transfer all instances x of O2 to O1 ∪ M(O2)
and then again transfer the x0 instances of O1 ∪ M(O2) to O2 ∪ M(O1 ∪ M(O2))?</p>
        <p>Given some instance a with O2 |= 2#ProductLine(a) after the first step we have
the counterpart of a, namely a0, with O1 ∪ M(O2) |= 1#ProjectDeadline(a0) by
applying correspondence (4) and can derive O1 ∪ M(O2) |= 1#Deadline(a0). After the
second step we have O2 ∪ M(O1 ∪ M(O2)) |= 2#TimedEvent (a00) by applying
correspondence (3). We expect that adding axiom a = a00 does not affect the consistency of
O2 ∪ M(O1 ∪ M(O2)). But O2 ∪ M(O1 ∪ M(O2)) now implies 2#ProductLine(a)
as well as 2#Event (a). Since 2#ProductLine and 2#Event are defined to be disjoint,
there exists no model for O2 ∪ M(O1 ∪ M(O2)). Again, we find a strong link between
mapping incoherence and inconsistency after instance migration.
In the following we revisit a variant of the example given above to better understand the
use case of Query Processing. Again, company C2 takes over C1. But this time both O1
and O2 are maintained. Instead of migrating all instances from O1 to O2 queries are
rewritten at runtime to enable information integration between O1 and O2. A
terminological mapping is the key for information integration. It is used for processing queries
and generating result sets which contain data from both ontologies. As we are
concerned with theoretical issues, we argue on an abstract level instead of discussing e.g.
characteristics of a SPARQL implementation. A query language for DL based
knowledge bases should at least support instance retrieval for complex concept descriptions.
Depending on the concrete query language there might be a complex set of rewriting
rules. At least, it must contain variants of the following two rules.</p>
        <p>R1: Let i#C and i#D be concept descriptions in the language of Oi. If Oi |= i#C ≡
i#D , then query q can be transformed into an equivalent query by replacing all
occurrences of i#C by i#D .</p>
        <p>R2: Let i#C and j#D be concept descriptions in the language of Oi, respectively
Oj . If there exists a correspondence hi#C , j#D , ≡, ni ∈ M, then query q can be
transformed into an equivalent query by replacing all occurrences of i#C by j#D .
Suppose we query for the name of all project leaders, formally speaking we are
interested in the instances of ∃1#hasName−1.1#ProjectLeader . To receive instances of
both O1 and O2 we have to rewrite the query for O2. Now let M contain
correspondences (5), (6), and (7).</p>
        <p>h1#hasName, 2#name, ≡, 0.9i
h1#Project , 2#Project , ≡, 1.0i
h1#manages, 2#managerOf , ≡, 0.7i</p>
        <p>Suppose that O1 contains axiom 1#ProjectLeader ≡ ∃1#manages.1#Project .
We exploit this axiom by applying R1. Now for every concept and property name that
occurs in ∃1#hasName−1.∃1#manages.1#Project , there exists a direct counterpart
in O2 specified in M. By applying R2 we thus finally end with a concept description
in the language of O2.</p>
        <p>∃1#hasName−1.1#ProjectLeader
R1
⇐⇒ ∃1#hasName−1.∃1#manages.1#Project
R2
⇐⇒ ∃2#name−1.∃2#managerOf .2#Project
(5)
(6)
(7)
(8)
(9)
(10)
What happens if we process the query based on this concept description to O2? As result
we receive the empty set. The range of 2#managerOf is concept 2#ProductLine,
and 2#ProductLine is defined to be disjoint with 2#Project . Thus, for logical reasons
there exists no instance of concept description (10) in O2.</p>
        <p>This problem is obviously caused by the incorrectness of correspondence (7). But
the incorrectness of (7) does not only affect the query under discussion. It also causes
mapping M to become incoherent, because in the merged ontology O1 ∪Mt O2
concept 1#ProjectLeader becomes unsatisfiable due to its equivalence with concept
description ∃1#manages.1#Project . This time we find a strong link between mapping
incoherency and the incorrectness of a query result due to processing the mapping.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Measuring Incoherence</title>
      <p>The definition of mapping incoherence given above is a boolean criterion that only
distinguishes between coherent and incoherent mappings. Contrary to this, an incoherence
measure should satisfy m(O1, O2, M) &gt; m(O1, O2, M0) if M has a higher degree
of incoherence than M0. At the moment we might have an intuitive understanding of
different degrees of incoherence, but a precise definition has to be given in the
following subsections. Up to now, we define an incoherence measure to satisfy the following
constraint.</p>
      <sec id="sec-5-1">
        <title>Definition 5 (Incoherence Measure). Let M be a mapping between ontologies O1</title>
        <p>
          and O2 and let t be an translation function. An incoherence measure mt maps O1, O2,
and M to a value in [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ] such that mt(O1, O2, M) = 0 iff M is coherent with respect
to O1 and O2 due to t.
        </p>
        <p>In the following we distinguish between effect-based and revision-based measures. The
former are concerned with the negative impact of mapping incoherence. The latter
measure the effort necessary to revise a mapping by removing incoherences. Both
approaches make it possible to extend the boolean property of incoherence to a continuous
measure of its degree.
5.1</p>
        <sec id="sec-5-1-1">
          <title>Measuring the Impact of Incoherence</title>
          <p>
            The first measure to be introduced is based on the idea of counting unsatisfiable
concepts. It is an adaption of an ontology incoherence measure introduced in [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ]. Before
we proceed, we need to agree on some abbreviations and naming conventions.
Definition 6. Let O be an ontology. Then CO(O) refers to the set of named concepts
in O and US (O) = {C ∈ CO(O) | O |= C v ⊥} refers to the set of unsatisfiable
concepts in O.
          </p>
          <p>Contrary to measuring incoherences in ontologies, we have to distinguish between
two types of concept unsatisfiability in the merged ontology: There are unsatisfiable
concepts in O1 ∪Mt O2 which have already been unsatisfiable in O1, respectively O2,
while there are unsatisfiable concepts which have been satisfiable in O1, respectively
O2. We are interested in the latter concepts. In particular, we compare the number of
these concepts with the number of all named concepts satisfiable in O1 or O2.</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>Definition 7 (Unsatisfiability Measure). Let M be a mapping between ontologies O1</title>
        <p>and O2, and let t be a translation function. Unsatisfiability measure mstat is defined by
mstat(O1, O2, M) = |US (O1 ∪Mt O2) \ (US (O1) ∪ US (O2))|</p>
        <p>|CO(O1 ∪Mt O2) \ (US (O1) ∪ US (O2))|</p>
        <p>
          This measure can be criticised for the following reason. Suppose again, we have an
incoherent mapping M for ontologies O1 and O2 depicted in figure 1. Suppose that due
to M concept 1#Person becomes unsatisfiable in the merged ontology. As a
consequence concept 1#ProjectLeader becomes unsatisfiable, too. By applying definition 7
we thus measure both direct impact (unsatisfiability of 1#Person) and indirect impact
(unsatisfiability of 1#ProjectLeader ) of M, even though we might only be interested
in the direct impact. The distinction between root and derived unsatisfiability, as
introduced in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], solves this problem. A precise definition requires us to recall the notion of
a MUPS, defined in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], which is minimal unsatisfiability preserving sub-TBox.
Definition 8 (MUPS). Let O be an ontology, let T ⊆ O be the TBox of O, and let
C ∈ US (O). A set T 0 ⊆ T is a minimal unsatisfiability preserving sub-TBox (MUPS)
in T for C if C is unsatisfiable in T 0 and C is satisfiable in every T 00 ⊂ T 0. The set of
all MUPS with respect to C is referred to as mups(O, C ).
        </p>
        <p>A MUPS for a concept C can be seen as a minimal explanation of its
unsatisfiability. Whenever there exists another unsatisfiable concept D such that the minimal
explanation of C ’s unsatisfiability also explains the unsatisfiability of D then C is
referred to as derived unsatisfiable concept, because one reason for C ’s unsatisfiability is
the unsatisfiability of D .</p>
      </sec>
      <sec id="sec-5-3">
        <title>Definition 9 (Derived and Root Unsatisfiability). Let O be an ontology and let C ∈</title>
        <p>US (O). C is a derived unsatisfiable concept if there exists D 6= C ∈ US (O) such that
there exist M ∈ mups(O, C ) and M 0 ∈ mups(O, D ) with M ⊇ M 0. Otherwise C is a
root unsatisfiable concept. The set of derived unsatisfiable concepts of O is referred to
as US D (O) and the set of root unsatisfiable concepts is referred to as US R(O).</p>
        <p>Similar to the unsatisfiability measure we can now introduce the root
unsatisfiability measure by considering only root unsatisfiable concepts instead of all unsatisfiable
concepts in the merged ontology.</p>
      </sec>
      <sec id="sec-5-4">
        <title>Definition 10 (Root Unsatisfiability Measure). Let M be a mapping between ontolo</title>
        <p>gies O1 and O2, and let t be a translation function. Root unsatisfiability measure mrtsat
is defined by
mrtsat(O1, O2, M) = |US R(O1 ∪Mt O2) \ (US (O1) ∪ US (O2))|</p>
        <p>|CO(O1 ∪Mt O2) \ (US (O1) ∪ US (O2))|</p>
        <p>
          Obviously, we have mstat(O1, O2, M) ≥ mrtsat(O1, O2, M) for each mapping M
between two ontologies O1 and O2. As argued above, the mrtsat measure has to be
preferred. Nevertheless, a non trivial algorithm is required to compute the set of root
unsatisfiable concepts as described in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] which makes the application of this measure
more expensive from a computational point of view.
5.2
        </p>
        <sec id="sec-5-4-1">
          <title>Measuring the Effort of Mapping Revision</title>
          <p>The second type of measure is concerned with the effort of revising incoherent
mappings. We use the term revision to describe the process of removing correspondences
from an incoherent mapping until a coherent submapping has been found. If a revision
is conducted by a domain expert we would, for example, be able to measure the time
necessary. Since we want our measure to be computable without any human
intervention, we thus have to think of an automated strategy to revise a mapping. Such a strategy
should obviously remove a minimum number of correspondences, because we would
like to keep as much information in the mapping as possible. The following measure is
based on this idea and compares the number of correspondences that would be removed
by such a strategy with the number of all correspondences in the mapping.
Definition 11 (Maximum Cardinality Measure). Let M be a mapping between
ontologies O1 and O2, and let t be a translation function. Maximum cardinality measure
mtcard is defined by
mcard(O1, O2, M) = |M \ M0|
t
|M|
where M0 ⊆ M is coherent with respect to O1 and O2 due to t and there exists no
M00 ⊆ M with |M00| &gt; |M0| such that M00 is coherent with respect to O1 and O2
due to t.</p>
          <p>Suppose there are incoherent mappings M1 and M2 with |M1| = |M2| = 10.
Further suppose, according to the naming convention in definition 11, we have |M01| = 8
and |M02| = 7. Thus, we have mtcard(O1, O2, M1) = 0.2 and mtcard(O1, O2, M2) =
0.3. But now suppose that all of the correspondences in M1 have the same confidence
value, say 1. Contrary to this, M2 differs with respect to the confidence value of its
correspondences. In particular, it turns out that all of the three correspondences that
have been removed have a very low confidence value compared to the remaining
correspondences. The following definition introduces the maximum trust measure which is
similar to the maximum cardinality measure but accounts for differences in the
confidence distribution.</p>
        </sec>
      </sec>
      <sec id="sec-5-5">
        <title>Definition 12 (Maximum Trust Measure). Let M be a mapping between ontologies</title>
        <p>
          O1 and O2, and let t be a translation function. Further, let conf : M → [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ] be a
function that maps a correspondence on its confidence value. Maximum trust measure
mttrust is defined by
P
        </p>
        <p>conf (c)
mtrust(O1, O2, M) = c∈M\M0
t</p>
        <p>P conf (c)
c∈M
where M0 ⊆ M is coherent with respect to O1 and O2 due to t and there exists no
M00 ⊆ M with Pc∈M00 conf (c) &gt; Pc∈M0 conf (c) such that M00 is coherent with
respect to O1 and O2 due to t.</p>
        <p>
          This measure is derived from the algorithm already described in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] which can be
used to compute M0 for a specific type of mappings. Namely, one-to-one mappings
that contain only correspondences expressing equivalences between concepts. We also
used it in the context of mapping extraction [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Its application is motivated by the idea
that M \ M0 mainly contains incorrect correspondences given an appropriate
confidence distribution. We adapted this idea to introduce the mttrust measure as confidence
weighted complement to the mtcard measure.
        </p>
        <p>
          Notice that computing both of these measures requires to solve computational hard
problems. On the one hand only full-fledged reasoning guarantees completeness in
detecting unsatisfiability. On the other hand the underlying problem is the optimization
problem of finding a hitting set H ⊆ M of minimal cardinality (respectively minimal
confidence total) over the set all of minimal incoherent subsets of M. This problem is
known to be NP-complete [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Nevertheless, first experiments indicate that both
measures can be computed in acceptable time for small and medium sized ontologies.
6
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Truth and Coherence</title>
      <p>
        In the following we are concerned with an important interrelation between the classical
compliance measure of precision and the maximum cardinality measure of incoherence.
Philosophically speaking, we are interested in how far an incoherent mapping can truly
express semantic relations between ontological entities. Accordant to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the precision
of a mapping can be defined as follows.
      </p>
      <p>Definition 13 (Precision). Given a mapping M between ontologies O1 and O2, let R
be a reference mapping between O1 and O2. The precision of M with respect to R is
defined as precision(M, R) = |M ∩ R| / |M|.</p>
      <p>In section 4 we argued that an incoherent mapping M causes different kinds of
problems when applied in a realistic scenario. More precisely, it is the incorrectness
of a correspondence that causes both the problem as well as the incoherence. Even
though incorrect correspondences not necessarily result in incoherence, we can be sure
that an incoherent mapping contains at least one incorrect correspondence. Thus, a
well-modeled reference mapping R will be coherent due to each translation function
t compatible with the mapping semantics accepted by the person who created R. The
following proposition corresponds to this consideration.</p>
      <sec id="sec-6-1">
        <title>Proposition 1 (Incoherence and Precision). Let R be a reference mapping between</title>
        <p>O1 and O2. Further let mapping M be incoherent and let R be coherent with respect
to O1 and O2 due to translation function t. Then we have precision(M, R) &lt; 1.
Proof. Given the coherence of R, it can be concluded that every subset of R is coherent,
too. Since M is incoherent, it is thus no subset of R, i.e M \ R 6= ∅. We conclude that
M ∩ R ⊂ M. It follows directly precision(M, R) &lt; 1.</p>
        <p>Notice that automatically generated mappings normally do not have a precision of 1.
Thus, the application of proposition 1 is only of limited benefit. Nevertheless, it can be
generalized in a non trivial way by exploiting the definition of the maximum cardinality
measure (definition 11). This generalization allows us to compute a non trivial upper
bound for mapping precision without any knowledge of R.</p>
      </sec>
      <sec id="sec-6-2">
        <title>Proposition 2 (Upper Bound for Precision). Let M be a mapping and R be a refer</title>
        <p>ence mapping between O1 and O2. Further let R be coherent with respect to O1 and O2
due to translation function t. Then we have precision(M, R) ≤ 1−mtcard(O1, O2, M).
Proof. Accordant to definition 11 let M0 ⊆ M be the coherent subset of M with
maximum cardinality. Further let be M∗ = M ∩ R, i.e. M∗ consist of all correct
correspondences in M. Since M∗ is a subset of R and R is coherent with respect to
O1 and O2 due to t, we conclude that M∗ is also coherent. It follows that |M∗| ≤ |M0|,
because otherwise M0 would not be the coherent submapping of maximum cardinality
contrary to definition 11. In summary, the following inequation holds.
precision(M, R) = |M ∩ R| = |M∗| &lt; |M0| = 1 − |M0 \ M| = mctard(O1, O2, M)
|M| |M| |M| |M|</p>
        <p>Proposition 2 reveals an important interrelation between coherence and precision.
The counterpart of mapping precision is the measure of recall. At first glimpse it seems
that recall and coherence describe independent properties of a mapping. Nevertheless,
there exists a non trivial relation between recall and coherence that allows to derive
comparative statements about the relative recall of two overlapping mappings in some
cases. Although this interrelation is not as significant as the the one expressed in
proposition 2, further theoretical considerations are required.</p>
        <p>The utility of proposition 2 in the evaluation process essentially depends on the
distance between the upper bound and the actual value of mapping precision. In particular,
measuring low values for the mtcard measure will lead to a poor differentiation with
respect to the precision that has to be expected. In initial experiments, not included in
this paper due to lack of space, first results indicate that the upper bound for mapping
precision strongly varies and can be used to filter out highly imprecise mappings.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>In this paper, we have discussed the notion of incoherence of ontology mappings and
its role in the assessment of automatically created mappings. There are two main
conclusions of this work. First, we conclude that incoherence is an important aspect of
mapping quality as incoherent mappings have undesirable effects on most relevant
application scenarios as we have demonstrated in section 4. Second, appropriate measures
of incoherence can help to assess the quality of a mapping even if no reference
mapping is available and thus precision and recall cannot be determined. In particular, the
measure of incoherence provided in definition 11 provides a strict upper bound for the
precision of a mapping and can therefore be used as a guideline for estimating the
performance of matching systems. In future work experimental studies will show in how
far the proposed measures can be effectively applied in the evaluation process.2
Acknowledgement The work has been partially supported by the German Science
Foundation (DFG) in the Emmy Noether Programme under contract STU 266/3-1.
2 A first beta-version of our system supporting the measures mstat, mctard, and mttrust based on a
slightly modified natural translation can be obtained on request.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Hong-Hai</surname>
            <given-names>Do</given-names>
          </string-name>
          , Sergey Melnik, and
          <string-name>
            <given-names>Erhard</given-names>
            <surname>Rahm</surname>
          </string-name>
          .
          <article-title>Comparison of schema matching evaluations</article-title>
          .
          <source>In Proc. of the GI-Workshop Web and Databases</source>
          , Erfurt, Germany,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Marc</given-names>
            <surname>Ehrig</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jerome</given-names>
            <surname>Euzenat</surname>
          </string-name>
          .
          <article-title>Relaxed precision and recall for ontology matching</article-title>
          .
          <source>In Proc. of the K-Cap 2005 Workshop on Integrating Ontology</source>
          , Banff, Canada,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Jerome</given-names>
            <surname>Euzenat</surname>
          </string-name>
          .
          <article-title>Semantic precision and recall for ontology alignment evaluation</article-title>
          .
          <source>In Proc. of th 20th International Joint Conference on Artificial Intelligence</source>
          , Hyderabad, India,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Jerome</given-names>
            <surname>Euzenat</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          .
          <source>Ontology Matching</source>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Peter</given-names>
            <surname>Haase</surname>
          </string-name>
          and
          <string-name>
            <given-names>Guilin</given-names>
            <surname>Qi</surname>
          </string-name>
          .
          <article-title>An analysis of approaches to resolving inconsistencies in DLbased ontologies</article-title>
          .
          <source>In Proc. of the International Workshop on Ontology Dynamics</source>
          , Innsbruck, Austria,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Aditya</given-names>
            <surname>Kalyanpur</surname>
          </string-name>
          , Bijan Parsia, Evren Sirin, and
          <string-name>
            <given-names>James</given-names>
            <surname>Hendler</surname>
          </string-name>
          .
          <article-title>Debugging unsatisfiable classes in OWL ontologies</article-title>
          .
          <source>Journal of Web Semantics</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Richard</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Karp</surname>
          </string-name>
          .
          <article-title>Reducibility among combinatorial problems</article-title>
          . In Complexity of Computer Computations. Plenum,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Christian</given-names>
            <surname>Meilicke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiner</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          .
          <article-title>Analyzing mapping extraction approaches</article-title>
          .
          <source>In Proc. of the ISWC 2007 Workshop on Ontology Matching</source>
          , Busan, Korea,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Christian</given-names>
            <surname>Meilicke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiner</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          .
          <article-title>Applying logical constraints to ontology matching</article-title>
          .
          <source>In Proc. of the 30th German Conference on Artificial Intelligence</source>
          , Osnabru¨ck, Germany,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Natasha</given-names>
            <surname>Noy</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiner</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          .
          <article-title>Ontology alignment: An annotated bibliography</article-title>
          .
          <source>In Semantic Interoperability and Integration</source>
          , Dagstuhl, Germany,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Guilin</given-names>
            <surname>Qi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anthony</given-names>
            <surname>Hunter</surname>
          </string-name>
          .
          <article-title>Measuring incoherence in description logic-based ontologies</article-title>
          .
          <source>In Proc. of the 6th International Semantic Web Conference</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Schlobach</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ronald</given-names>
            <surname>Cornet</surname>
          </string-name>
          .
          <article-title>Non-standard reasoning services for the debugging of description logic terminologies</article-title>
          .
          <source>In Proc. of 18th International Joint Conference on Artificial Intelligence</source>
          , Acapulco, Mexico,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>