=Paper=
{{Paper
|id=None
|storemode=property
|title=Chinese whispers and connected alignments
|pdfUrl=https://ceur-ws.org/Vol-689/om2010_Tpaper3.pdf
|volume=Vol-689
|dblpUrl=https://dblp.org/rec/conf/semweb/KutzNMW10
}}
==Chinese whispers and connected alignments==
<pdf width="1500px">https://ceur-ws.org/Vol-689/om2010_Tpaper3.pdf</pdf>
<pre>
              Chinese Whispers and Connected Alignments

        Oliver Kutz1 , Immanuel Normann2 , Till Mossakowski3 , and Dirk Walther4
        1
            Research Center on Spatial Cognition (SFB/TR 8), University of Bremen, Germany
                                   okutz@informatik.uni-bremen.de
               2
                 Department of Linguistics and Literature, University of Bremen, Germany
                                        normann@uni-bremen.de
            3
              DFKI GmbH and SFB/TR 8 Spatial Cognition, University of Bremen, Germany
                                      Till.Mossakowski@dfki.de
                 4
                   Faculty of Informatics, Technical University of Madrid (UPM), Spain
                                           dwalther@fi.upm.es


       Abstract. This paper investigates the idea to treat repositories of ontologies as inter-
       linked networks of ontologies, formally captured by the notion of a hyperontology. We
       apply standard matching algorithms to automatically create the linkage structure of the
       repository by performing pairwise matching. Subsequently, we define a modular work-
       flow to construct combinations of alignments for any finite number of ontologies. This
       workflow employs and makes interoperable several tools from the ontology engineering
       world, comprising matching, reasoning, and structuring tools, and supports in particular
       modular ontology extraction based on alignment, and a study and empirical analysis of
       (in)consistency propagation in connected alignments (the Chinese Whispers problem).


Keywords: Hyperontologies; Connected Alignments; Modularity; Consistency

1    Introduction and Problem Description

Ontology matching and alignment based on statistical methods is a relatively developed
field, with yearly competitions since 2004 comparing the various strengths and weaknesses
of existing algorithms.5 In this paper, we aim at exploring the degrees to which statistical
alignment may lead to inconsistency in the merged ontologies. More precisely, we aim at
investigating the effects, both theoretically and practically, of connected alignment,
i.e. aligning several ontologies that match (non-trivially) pairwise. Our general approach
is to treat large repositories of ontologies (in the order of hundreds of ontologies) as our
starting point to perform pairwise matching in order to obtain an interlinked network of
ontologies. Formally, such networks are captured by the notion of a hyperontology [6].
Our work in progress is intended to answer questions such as the following:
         Assuming pairwise alignments are consistent, how, and when, can we align fur-
     ther ontologies (in various orders) before we drift into inconsistency? In particular,
     how, and when, can we reduce the question of consistency of aligned ontologies and
     satisfiability of matched concepts to the consistency of aligned sub-ontologies (i.e.
     modules generated by the matched sub-signatures)?
5
    See http://oaei.ontologymatching.org/2009/
    We here set up the theoretical background and necessary engineering environment
to give meaningful answers to such questions. In our related paper [9], we have studied
techniques of information hiding to support the visualisation of the linkage structure and
to allow a user to explore the complex networks resulting from pairwise matching on
large sets of ontologies. We here focus on the last question mentioned above, namely how
to reduce the consistency problem in an aligned network of ontologies to the consistency
of merged modules generated by respective alignments, and the corresponding interoper-
ability problem between matching, modularity, and structuring tools. We also study the
effects of matching ontologies in different orders by looking at some specific examples.
Synonymy and Alignment as Colimit Computation. An essential part of the
matching and alignment process is to relate and identify signature elements from different
ontologies (possibly formulated in different ontology languages). Formally, this is captured
by the notion of a signature morphism.6 In the case of OWL ontologies, these are
type-preserving symbol mappings of the form σ : Sig(O1 ) → Sig(O2 ), i.e. mapping the
signature of O1 (= Sig(O1 )) to that of O2 , i.e. concepts to concepts, individuals to
individuals, and roles to roles.7 For a fixed ontology language, a signature morphism
straightforwardly induces a sentence translation map.
    V-Alignments [13] abstractly capture the alignment process for synonymous signa-
ture elements. Given ontologies O1 and O2 , an interface (for O1 , O2 )

                            Σ, σ1 : Σ −→ Sig(O1 ), σ2 : Σ −→ Sig(O2 )

specifies that (using informal but suggestive notation)
 – concepts σ1 (c) in O1 and σ2 (c) in O2 are identified for each concept c in Σ, regardless
   of whether the concepts have the same name or not, and
 – concepts in O1 \ σ(Σ1 ) and O2 \ σ(Σ2 ) are kept distinct, again regardless of whether
   they have the same name or not.
The resulting ontology O is not given a priori, but rather it is computed from the aligned
ontologies via the interface. This computation is a pushout in the sense of category
theory, which in this case is just a disjoint union with identification of specific parts
(namely those given through hΣ, σ1 , σ2 i).
    V-alignments can deal with basic alignment problems such as synonymy (identify-
ing different symbols with the same meaning) and homonymy (separating (accidentally)
identical symbols with different meaning)—see Fig. 1.

Example 1. In Fig. 1, the interface hΣ, σ1 , σ2 i specifies that the two instances of the
concept Woman as well as Person and Human are to be identified. This yields two concepts
Woman and Human_Being in the push-out ontology O obtained along the dashed arrows.
It also determines that the two instances of Bank are to be understood as homonyms,
and thus generates two new distinct concepts.                                           a
6
    See e.g. [5] for the general institutional definition.
7
    We use the DL terminology concept name and role interchangeably with the OWL terminology class
    and property.
   Notion such as polysemy, however, are typically understood to relate terms that have
a different, but related meaning, and can thus not be dealt with by simply identifying
symbols or keeping them apart.8 Similarly, [13] raise the criticism that V-Alignments do
not cover the case where a concept Woman in O1 is aligned with a concept Person in O2 :
here, the merged ontology should turn Woman into a subconcept of Person.
                                   {Woman, River_Bank, Financial_Bank, Human_Being}


                                                            
                                                          - O


                       O1                                                                 - O2
                                           σ1                                 σ2
                       


                                                                                               
                   {Woman, Bank, Person}                                              {Woman, Bank, Human}
                                                            Σ


                                                            =
                                                {Woman_Woman, Person_Human}


Fig. 1. V-alignment: merge through interface (dashed arrows are automatically computed via colimits)


    Whilst this is not directly possible with pushouts, we are here only interested in
matching synonyms across a network of ontologies, and for this purpose, V-alignments
(and their compositions) are sufficient. Studying more complex alignment operations we
leave for future work. We next turn to the problem of aligning several ontologies at once.
Consistency and Chinese Whispers. The game of Chinese Whispers9 is played
as follows: n persons are arranged in a certain (typically circular) order such that for each
person Pi there is a j such that Pi exchanges a message with Pj . The point of the game
is to observe the distortion of the message as it travels from P1 along the communication
channel. We here are interested in the effects of playing Chinese Whispers with ontologies,
where the pairwise matching replaces the transmission of a message, i.e. the messages
being exchanged are of the form: “Oi and Oj agree that concept C of Oi is synonymous
with concept D of Oj ”.
    We make the following idealisations concerning ‘matching’ a) we assume that in
pairwise matching the order does not matter, i.e. matching O1 with O2 yields the same
colimit ontology (i.e. alignment) as matching O2 with O1 10 ; b) matching algorithms are
‘not transitive’, i.e., matching hO1 , O2 i and hO2 , O3 i and computing the colimit yields,
in general, a different result than matching and aligning hO1 , O3 i,11 c) we assume that
we do not match ontologies with themselves.12
8
   This problem can be addressed by considering E-connections as a general form of alignment (see [6]).
9
   In the United States, “Telephone” is the most common name for the game. The name “Chinese whis-
   pers” reflects the former stereotype in Europe of the Chinese language as being incomprehensible.
   Although it is sometimes considered offensive in the US, it remains the common British English name
   for the game and is not generally regarded as being offensive.
10
   Whether or not this holds for actual matching systems is an implementational artefact which we ignore;
   the assumption is certainly reasonable to make as both ‘agreement’ and ‘synonymy’ are symmetric.
11
   In other words, the composition of the two alignments (the composition operation is easily seen to
   be associative via a pullback operation, see [13]) will typically not agree with a matcher’s results
   comparing O1 and O3 directly.
12
   Although one would expect to get the identity matching in such a case, actual matching tools behave
   sometimes rather unpredictable in these cases and often return no results.
    With these assumptions in place, given a repository R with N ontologies, we start
with l = N ×(N
             2
               −1)
                   matching pairs.
    Playing chinese whispers on R with k ≤ N players now means to pick a connected
subgraph of the hyperontology graph (to ensure that each ontology ‘talks’ to at least one
other), which we call a matching configuration . Note that, by assumption, matching
configurations contain no loops (i.e. reflexive vertices), are undirected (because of the
assumed symmetry of the matching results), and contain at most one edge between two
vertices. Therefore, matching configurations are connected simple graphs.


                                                                 k=3


                                                                 k=2


            Fig. 2. The number of non-isomorphic matching configurations for k = 2, 3


   Given a fixed k, the largest possible matching configuration (measured in pairs of
matched ontologies, i.e. edges) corresponds to a clique with k nodes,     P i.e. a complete
subgraph of the hyperontology graph with k nodes: such a graph has k−1       i=1 i edges.
   In practise, not every pair of ontologies will match, i.e. a matcher will report no syn-
onyms. Therefore, for fixed k ≤ N ontologies O1 , . . . , Ok , the number of non-isomorphic
matching configurations containing the Oi (i = 1, . . . , k) corresponds to the number of
connected components of the hyperontology graph with these ontology nodes. The cases
of N = 2, 3 (assuming a clique) are illustrated in Fig. 2. The alignment operation on a
matching configuration, i.e. the computation of the colimit of that graph, we call a con-
nected alignment. The Chinese Whispers Problem is now the following question:

       For what kinds (or shapes) of matching configurations and exchanged matching
   results does the consistency of the input ontologies propagate to the merge (colimit)
   of the matching configuration?

    We will give some partial answers to this problem in Sec. 2 by showing that the
consistency of an aligned matching configuration is reducible to the consistency of the
alignment of ‘reasonably’ large modules talking only about the matched signatures. In
Sec. 3, we will discuss in detail an alignment of three ontologies involving the Dolce
ontology, with pairwise consistent alignments, but an overall inconsistent one. Moreover,
in Sec. 4, we will describe how the results of a matcher comparing ontologies O1 and O2
(and giving rise to a V-alignment) can be rewritten into a structured ontology for further
processing with our tool Hets introduced below, and describe the workflow employing
standard ontology matching and reasoning tools. Finally, Sec. 5 describes related and
future work.
2   Modularity in Hyperontologies

In the following, we will make precise what we mean by a module and define the notion
of conservativity. We start with some auxiliary notions. Let Σ be a signature containing
concept names and roles. Let Sen(Σ) be the set of sentences formulated using the symbols
in Σ in some language. An ontology O in signature Σ is then simply a subset of Sen(Σ).
The sentences of course depend on the language the ontologies under consideration are
formulated in, e.g. OWL or some fragment thereof.
     We continue with introducing a general notion of a module in the sense that a module
of an ontology is not restricted to be a subset of the ontology. It is crucial, however, that
the module says everything (expressible in its signature) that is said by the ontology itself
(i.e. the ontology is required to be a conservative extension of the module)—see Fig. 3 .


                                  T                                     T2


                                                              σ : T1 −→ T2

                                                                        T1

                          subset of axioms   translation along signature morphism


                Fig. 3. Modules as subsets vs. modules as image under translation.


Definition 1. A theory morphism σ : O1 −→ O2 is consequence-theoretically con-
servative, if O2 does not entail anything new w.r.t. O1 , formally, O2 |= σ(ϕ) implies
O1 |= ϕ. Moreover, σ : O1 −→ O2 is model-theoretically conservative, if any O1 -model
M1 has a σ-expansion to O2 , i.e. a O2 -model M2 with M2 |σ = M1 .
Here, |= as usual denotes logical consequence, whereas _|σ denotes model reduct for
a signature morphism σ : Σ1 → Σ2 , i.e. for a Σ2 -model M2 , M2 |σ is a Σ1 -model that
interprets a symbol by first translating it along σ and then interpreting it using M2 .
    It is easy to show that conservative theory morphisms compose. Moreover, the notion
of model-theoretic conservativity is stronger than consequence-theoretic conservativity.
To be precise, the former implies the latter, but not vice versa [7]. The two notions
coincide if we define consequence-theoretic conservativity using Σ-theories that contain
consequences φ ∈ Sen(Σ) formulated in second-order logic.
    The computational complexity of deciding conservativity appears to be rather daunt-
ing even if the ontologies are formulated in weak logics. For instance, for ontologies
formulated in the light-weight Description Logic EL, deciding consequence-theoretic con-
servativity is ExpTime-complete, and model-theoretic conservativity is undecidable. The
former problem also becomes undecidable when adding nominals to ALCIQ, for which
it is still 2-ExpTime-complete [8]. This suggests that, for practical purposes and appli-
cations, we often have to live with approximations of these notions, more precisely with
sufficient (syntactic) conditions for conservativity that allow to construct non-minimal
modules. Indeed, the notion of an ontology module of an ontology O has been defined as
any “subontology O0 such that O is a conservative extension of O0 ” [1].
Definition 2 (Module Generator). Let O be an ontology in some fixed DL, and let
Σ ⊆ Sig(O) be a signature. Sen(Sig(O)) is the set of sentences in Sig(O). A function

                                Π : hO, Σi 7→ Sen(Sig(O))

mapping pairs hO, Σi consisting of an ontology O together with a signature Σ to a set of
sentences in Sig(O) is called a Σ-module generator if for all O and Σ:

          Π(hO, Σi) is a model-theoretic Sig(O)-conservative extension of O.

   For a Σ-module generator Π, the set Π(hO, Σi) is called a Σ-module for O.
Π(hO, Σi) is called Σ-covering for O if:

             O is a model-theoretic Σ-conservative extension of Π(hO, Σi).

    The idea to use (conservative) module generators is to massively reduce the size
of a colimit ontology to a merge of modules generated by the matched signatures and
preserving the semantics completely. This means that we can check the satisfiability of
our matched concepts (and the consistency of the overall merged ontology) already in a
rather small fragment of the overall ontology. Indeed, it is not hard to construct (or find)
cases where ontologies have a moderate semantic overlap, but where the overall merge
will be very hard to process by current tools.
    Before coming to our central theoretical result, we need some preparatory notions.
Definition 3. A diagram is a graph D of signatures (Di )i∈D and signature morphisms
(Dm : Di → Dj )m : i→j∈D . Given a diagram D, a family of models Mi ∈ Mod(Di )i∈D
is compatible, if Mj |Dm = Mi for any m : i → j ∈ D. A logic has the amalgamation
property w.r.t. a class of diagrams if for any diagram in the class, any compatible family
of models can be amalgamated to a unique model of the colimit of the diagram (i.e. such
that the reducts along the colimit injections yields the models of the family).
    A logic is semi-exact, if it has the amalgamation property for pushout diagrams.
    Note that colimits of theories can be easily defined in terms of signature colimits
and unions of (translated) axioms; the amalgamation property then carries over from
signature colimits to theory colimits (see [11], 4.4.17). We formulate the next results in
their full generality to cover also ontology languages other than DLs. But note that they
apply in particular to all usual DLs as these are semi-exact and moreover have an initial
signature (i.e. a signature with a unique signature morphism into any signature) [5].
Theorem 1 (Combination of two modules). Assume a semi-exact logic. Consider
Fig. 4, and assume that Oe and O are obtained by pushouts (with base Σ). Then O is a
model-theoretic conservative extension of O.
                                          e In particular, O is satisfiable iff O
                                                                                e is.
                                                               O
                                                              - 6

                                                O1                                  O2
                                                 6
                                                 c                    c?
                                                                                     6
                                                                                     c

                                                                  O
                                                                  e
                                                              -

                                            Π(O1 , Σ
                                                    1)                          Π(O2 , Σ2 )
                                                         σ1                     -
                                                                           σ2

                                                                  Σ


       Fig. 4. Propagation of modular structure through one matching (a ‘c’ denotes conservativity)


Proof. Let M f be a model of O.
                             e For i = 1, 2, let Mi be an Oi -expansion of the Π(O1 , Σi )-
reduct of M (which exists by conservativity of Oi over Π(O1 , Σi )). M1 and M2 obvi-
           f
ously agree on Σ, hence they have an amalgamation M which is an O-model. Now the
Π(O1 , Σi )-reducts of M agree with those of M f, hence by uniqueness of amalgamation,
the O-reduct
    e          of M is M
                       f.                                                                a

      We next generalise this result to the case of arbitrary matching configurations.

Theorem 2 (Combination of multiple modules). Assume a semi-exact logic with
an initial signature. Consider a family of ontologies (Oi )i∈I indexed by a finite non-empty
set I and a simple graph G ⊆ I × I, such that for (i, j) ∈ G, Oi and Oj are interfaced by
                                                θi,j                       θj,i
                                       Oi                Σi,j                      - Oj

      Define                                             [
                                             Σi :=                θi,j (Σi,j )                                   (1)
                                                       j∈I\{i}

and σi : Σi → Π(Oi , Σi ) the module in Oi for Σi . Let σi,j : Σi,j → Π(Oi , Σi ) be the
restriction of θi,j , namely θi,j : Σi,j → Σi composed with σi .13 Assume that O  e (resp.
O) is obtained by the colimit of the diagram of all σi,j (resp. all σi,j composed with the
inclusion of Π(Oi , Σi ) into Oi ). Then O is a model-theoretic conservative extension of
O.
 e In particular, O is satisfiable iff O e is.

Proof. Note that the diagrams for obtaining O   e resp. O can be turned into connected
diagrams by adding the inclusions of the empty signature into all involved theories (the
colimit does not change by this addition). The rationale is that this ensures in OWL
that all compatible model families are built over the same universe of individuals. More
formally, by Prop. 4.4.15 of [11], in any semi-exact logic with an initial signature, all
finite non-empty connected diagrams enjoy the amalgamation property. With this, the
proof is a straightforward generalisation of the proof of Thm. 1.                      a
13
     Consider Fig. 4, but replace Σ by Σi,j , σ1 by σi,j , σ2 by σj,i , O1 , Σ1 by Oi , Σi , and O2 , Σ2 by Oj , Σj .
    Note that Thm. 1 does not hold for consequence-theoretic conservativity. Consider
the following example, adapted from [7]. In ALCO, let O1 be
                IntroTCS                          v ∃has_subject.AutomataTheory
                IntroTCS                          v ∃has_subject.ComplexityTheory
                AutomataTheory u ComplexityTheory v ⊥

and O2 be
                            IntroTCS v ∀has_subject.{moore_automata}
                            IntroTCS v ∃has_subject.{moore_automata}
 Let Σ be {IntroTCS, has_subject}. Then assuming a consequence-theoretically conser-
vative (minimal) module generator, Π(O1 , Σ) = Π(O2 , Σ) = O
                                                           e is

                                     IntroTCS v ∃has_subject.>

But this is consistent, while O1 ∪ O2 is not (assuming IntroTCS is also instantiated).

3     Worked Out Example

The following worked out example serves two purposes: it should illustrate the last theo-
rem on combination of multiple modules, and it should demonstrate its practical impact.
    Our ontology repository ORATE14 contains among others three ontologies, namely:
SpatialRelations, ExtendedDnS, and FunctionalParticipation. Let us denote them
by O1 , O2 , and O3 resp., to be notationally conform with Theorem 2 above. We combine
them in different ways and show how consistency issues of the combined ontologies can
already be answered in the combination of modules. Automated matching resulted in the
concept identifications listed in the two tables below.

             SpatialRelations endurant          participant          place-of
             ExtendedDnS      physical-endurant participant-place-of situation-place-of
                    Table 1. Matching SpatialRelations against ExtendedDnS


Our first matching thus induces an interface hΣ12 , θ12 , θ21 i with

                            Σ12 = {endurant, participant, place-of},

θ12 = id, and θ21 = {endurant 7→ physical-endurant, participant 7→ paticipant-place-of,
place-of 7→ situation-place-of}.
    The second an interface hΣ23 , θ23 , θ32 i with Σ23 = {endurant, physical-object, region},
θ23 = {endurant 7→ non-physical-endurant, physical-object 7→ agentive-physical-object,
region 7→ space-region}, and θ32 = id.
    From Σ12 and Σ23 , the Σi ’s can be determined: Σ1 = θ12 Σ12 = {physical-endurant},
Σ2 = θ12 Σ12 ∪ θ23 Σ23 = {endurant}, and Σ2 = θ23 Σ32 = {non-physical-endurant}.
14
     http://ontologies.informatik.uni-bremen.de
       SpatialRelations        endurant              physical-object          region
       FunctionalParticipation non-physical-endurant agentive-physical-object space-region
            Table 2. Matching SpatialRelations against FunctionalParticipation


The signatures Σi , for i = 1, 2, 3, together with corresponding ontologies Oi are sent
to the module generator that gives us for each ontology the corresponding module
Mi := Π(Oi , Σi ). Finally, from the signatures Σi and the signature morphisms θij , the
colimit (i.e. the alignment) M̃ of the modules M1 , M2 , and M3 is obtained. Similarly,
the aligned ontology M̃ can be determined from the three ontologies O1 , O2 , and O3 .
A practical result of this alignment can be a check of the merged module M̃ for consis-
tency. As we would expect, the merge of the concept "physical-endurant", "endurant",
and "non-physical-endurant" into a single concept makes it unsatisfiable—this result can
be automatically verified by a prover. Since we know that Õ must be a conservative
extension of M̃ , it is in fact not necessary to build Õ and check its consistency: this
information can already be inferred from M̃ . To repair the detected inconsistency of M̃ ,
we can abandon either M1 or M3 from the alignment process. In both cases, M1 aligned
with M2 only, or M3 aligned with M2 only, the resulting merged module turns out to be
consistent, and we know by conservativity that the corresponding ontologies would be
consistent, too.

4     An Interoperability Workflow and Prototypical Implementation
4.1    The Component Tools
We implemented a workflow for aligning arbitrary matching configurations taking advan-
tage of several third-party tools. We here briefly introduce these tools and describe in the
subsequent subsection their interoperation. Our ontologies to be matched and aligned are
taken from our ontology repository ORATE which is being developed and maintained
within the EU-project OASIS15 . The software is based on BioPortal [10]. As matching
system we use Falcon [3] which matches OWL ontologies by means of linguistic and
structural analysis. Falcon can be comfortably used in a batch mode and thus makes it
suitable for a pipe workflow. For module extraction as well as consistency checks we use
Pellet [12] which in particular makes use of the OWL-API 16 . Finally, we use Hets 17 for
the computation of colimits (i.e. alignments). Hets is a parsing, static analysis and proof
management tool incorporating various provers and different specification languages, thus
providing a tool for heterogeneous specifications.

4.2    Workflow Description
Our workflow of multiple ontology alignment consists of two phases: 1) the preprocessing
of the whole repository (ORATE) to a complete list of pairwise matching records
15
   See http://www.oasis-project.eu
16
   See http://owlapi.sourceforge.net
17
   See www.informatik.uni-bremen.de/cofi/hets
and 2) the alignment of a connected subgraph of the hyperontology graph. It is up
to user what part of the hyperontology graph the user selects as connected subgraph.
The rationale behind this user interaction is to dismiss false matchings produced by
the matching system. Different matching configuration consequently lead to different
alignment outcomes.

preprocess_repository = {
  foreach (o1,o2) in repository
   match_ontologies o1 o2
  end
}

match_ontologies o1 o2 = {
  mapping = falcon o1 o2
  cs1 = extract_concepts_from_o1 mapping
  cs2 = extract_concepts_from_o2 mapping
  s12 = build_interface_signature cs1 cs2
  sm1 = build_interface_signature_morphism s12 cs1
  sm2 = build_interface_signature_morphism s12 cs2
  matching_record = (o1,o2,s12,sm1,sm2)
  if (not_empty mapping) store_in_hyperontology_graph matching_record
}

compute_module_graph hyperontology_subgraph = {
  foreach node in hyperontology_subgraph
    in_edges = edges incident with node
    sig = compute_interface_signature node in_edges
    module = compute_module node sig
  end
  replace nodes in hyperontology_subgraph by modules
}

align_modules hyperontology_subgraph = {
  module_graph = compute_module_graph hyperontology_subgraph
  interfaces = {compute_interface edge | edge in module_graph}
  views = {compute_views edge | edge in module_graph}
  spec = write_alignment_spec modules interfaces views
  alignment = compute_alignment spec % hets
}


                Fig. 5. Pseudo code of the workflow for multiple ontology alignment


    Fig. 5 shows the whole workflow in pseudo code. We are going to explain it now line
by line. Procedure preprocess_repository takes each pair of ontologies (o1,o2) and
applies the procedure match_ontologies to it, i.e., it matches pairwise all ontologies from
the repository. The output of match_ontologies is a matching_record: the matching
system Falcon computes the mapping between the two ontologies o1 and o2. From the
mapping the concepts cs1 (cs2) belonging to ontology o1 (o2) are extracted. Based on the
concepts cs1 (cs2), the interface signature s12 is built and the corresponding signature
morphisms sm1 and sm2 to the ontologies o1 and o2. The two ontologies, the interface
signature, and the two signature morphisms form the matching_record. This record
can be viewed as a link between two ontologies. We call this network whose nodes are
ontologies and whose edges are the matching records hyperontology graph. Although all
ontologies are matched pairwise, the graph is not complete, i.e., some pairs of ontologies
(in practice even the majority) are not linked, namely when the matching system cannot
find any mappings.
    Once the hyperontology graph of the ontology repository is computed we can choose
an arbitrary subgraph of it to compute the colimit of modules implicitly given in this sub-
graph. For that the procedure compute_module_graph takes the hyperontology subgraph
and basically replaces its ontologies by modules extracted from them. More precisely, for
each ontology (=node) it collects all interfaces (=in_edges) connected to this node and
computes (according to Equation 1 in Thm. 2) a signature (sig) that is finally used to
compute the module. Practically, this last step is delegated to Pellet (OWL-API).
    Aligning modules implicitly given in a hyperontology subgraph (cf. align_modules
procedure) comprises the following steps: we first transform the graph with the just
mentioned procedure compute_module_graph to the module_graph. From the module
graph we extract all signatures (=interfaces) and signature morphisms (views) to the
modules. From the modules, the interfaces, and the views we compose a specification
document (spec) that can be understood by Hets. Finally, Hets computes the colimit
(alignment) of this spec.


5   Related Work and an Outlook

Matching and revision in networks of ontologies seems to be a rather unexplored topic.
Although only studying pairs of ontologies, the most closely related work in spirit appears
to be [4], where a semi-automatic procedure is presented for the integration of ontologies
that involves revision of mappings. This approach is implemented as the Protégé 4 plugin
ContentMap. Here, a selected ontology matcher is used to compute mappings between
the signatures of two ontologies chosen for integration, the mappings are explicitly inter-
nalised as axioms in an OWL-2 ontology, and the result of the integration is then taken
to be the (disjoint) union of the original ontologies together with the mapping axioms.
The integration is assumed to be successful if a user does not identify unintended logical
consequences. This decision is guided by justifications (explanations for entailment) that
can automatically be computed, e.g., by ContentMap and the confidence values created
by the matcher for the mapping axioms. To eliminate the unintended consequences, a re-
pair plan is created describing which axioms from the original ontologies or the mapping
should be removed. It should be noted that such a plan does not always exist and that
a desired integration may require the iteration of these steps.
    The main differences to our approach are a) we support heterogeneity of ontology
languages, b) we do not internalise the mappings but make explicit the structure of
the alignment graph, c) we generically support arbitrary matching configurations and
use category-theoretic techniques to compute the merged ontologies without duplicating
matched signature items. Apart from these differences, the necessity for debugging and
revising matchings also applies to our approach, and the proposed techniques can be
made fruitful also in this setting.
    We could here only scratch the surface of the area of problems related to matching
in networks of ontologies. We have laid out the necessary engineering infrastructure to
combine matching, structuring, and reasoning tools, and obtained some theoretical results
concerning the reduction of consistency checks to merged modules.
    However, a lot of open questions remain. For instance, internalising mappings can be
extended to internalising the confidence values of mappings building on similarity-based
E-connections [2]. Moreover, a full statistical analysis of major ontologies and repositories
remains to be done to understand the impact of iterated matching on consistency.

Acknowledgements
Work on this paper has been supported by the DFG-funded collaborative research centre
SFB/TR 8 ‘Spatial Cognition’, the EU-funded OASIS project, and by the German Federal
Ministry of Education and Research (Project 01 IW 07002 FormalSafe). Dirk Walther is
supported by a ‘Juan de la Cierva’ postdoctoral fellowship.

References
 1. Ghilardi, S., Lutz, C., and Wolter, F. Did I Damage My Ontology? A Case for Conservative
    Extensions in Description Logics. In Proceedings of KR-06 (2006), pp. 187–197.
 2. Hois, J., and Kutz, O. Counterparts in Language and Space—Similarity and S-Connection. In
    Proc. of FOIS 2008 (2008), C. Eschenbach and M. Grüninger, Eds., IOS Press, pp. 266–279.
 3. Hu, W., and Qu, Y. Falcon-AO: A practical ontology matching system. In Proc. of WWW-07
    (2008), pp. 237–239.
 4. Jimenez Ruiz, E., Cuenca Grau, B., Horrocks, I., and Berlanga, R. Ontology Integration
    Using Mappings: Towards Getting the Right Logical Consequences. In Proc. of the 6th European
    Semantic Web Conference (ESWC 2009) (2009), LNCS, Springer.
 5. Kutz, O., and Mossakowski, T. Conservativity in Structured Ontologies. In 18th European
    Conf. on Artificial Intelligence (ECAI-08) (Patras, Greece, 2008), IOS Press.
 6. Kutz, O., Mossakowski, T., and Lücke, D. Carnap, Goguen, and the Hyperontologies: Logical
    Pluralism and Heterogeneous Structuring in Ontology Design. Logica Universalis 4, 2 (2010). Special
    issue on ‘Is Logic Universal?’.
 7. Lutz, C., Walther, D., and Wolter, F. Conservative Extensions in Expressive Description
    Logics. In Proc. of IJCAI-07 (2007), M. Veloso, Ed., AAAI Press, pp. 453–458.
 8. Lutz, C., and Wolter, F. Conservative Extensions in the Lightweight Description Logic EL. In
    Proc. of CADE-07 (2007), Springer, pp. 84–99.
 9. Normann, I., and Kutz, O. Ontology Reuse and Exploration via Interactive Graph Manipulation.
    In Proc. of the ISWC Workshop on Ontology Repositories for the Web (SERES-2010) (ISWC-2010,
    November 7, 2010, Shanghai, China), CEUR-WS.
10. Noy, N. F., Shah, N. H., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Montegut, M. J.,
    Rubin, D. L., Youn, C., and Musen, M. A. Bioportal: A web repository for biomedical ontologies
    and data resources [demonstration], 2007.
11. Sannella, D., and Tarlecki, A. Foundations of Algebraic Specifications and Formal Program
    Development. Springer Verlag. to appear.
12. Sirin, E., Parsia, B., Cuenca Grau, B., Kalyanpur, A., and Katz, Y. Pellet: A practical
    OWL DL reasoner. Journal of Web Semantics 5, 2 (2007), 51–53.
13. Zimmermann, A., Krötzsch, M., Euzenat, J., and Hitzler, P. Formalizing Ontology Align-
    ment and its Operations with Category Theory. In Proc. of FOIS-06 (2006), pp. 277–288.

</pre>