SITUATED SEMANTIC ALIGNMENT

Manuel Atencia    Marco Schorlemmer
IIIA - Artificial Intelligence Research Institute
CSIC - Spanish National Research Council
Bellaterra (Barcelona), Catalonia, Spain

Abstract

Ontology matching is currently a key technology for achieving the semantic alignment of ontological entities used by knowledge-based applications, and hence for enabling their interoperability in distributed environments such as multi-agent systems. Most ontology matching mechanisms, however, assume matching prior to integration and rely on semantics that has been coded a priori in concept hierarchies or external sources. We present a formal model for a semantic alignment procedure that incrementally aligns differing conceptualisations of two or more agents relative to their respective perception of the environment or domain they are acting in. It hence makes explicit in the model the situation in which the alignment occurs. We resort to Channel Theory to carry out the formalisation.

1 Introduction

An ontology is commonly defined as a specification of the conceptualisation of a particular domain. It fixes the vocabulary used by knowledge engineers to denote concepts and their relations, and it constrains the interpretation of this vocabulary to the meaning originally intended by the knowledge engineers. As such, ontologies have also been widely adopted in multi-agent systems. Ontology matching is commonly defined as the process that computes the semantic relationships between entities of separate ontologies. With the proliferation of ontologies used in multi-agent systems, caused by different conceptualisations of even the same domain and their subsequent specification using varying terminology, ontology matching is now seen as a necessary technology for achieving semantic alignment and enabling the interoperability of separately engineered knowledge-based agents [5, 11].
Until recently, most ontology matching mechanisms have taken a classical functional approach to the semantic heterogeneity problem, in which ontology matching is seen as a process taking two or more ontologies as input and producing a semantic alignment of ontological entities as output [3]. Furthermore, matching has often been carried out at design-time, before integrating knowledge-based agents or making them interoperate. This may have been successful for clearly delimited and stable domains, but it is untenable and even undesirable for the kind of applications that are currently deployed on the Web. Web-service composition, multi-agent communication and peer-to-peer information sharing are all of a highly distributed, dynamic and open-ended nature, and they require ontology matching to be performed at run-time. In addition, one can imagine many situations in which ontologies are not even open for inspection (e.g., when they are based on commercially confidential information). Certainly, there exist efforts to match ontological entities efficiently at run-time, taking only those ontology fragments that are necessary for the task at hand [10, 13, 9, 8]. However, the techniques used by these systems to establish the semantic relationships between ontological entities, even though applied at run-time, still exploit a priori defined concept taxonomies as represented in the graph-based structures of the ontologies to be matched, use previously existing external sources such as thesauri (e.g., WordNet) and upper-level ontologies [7], or resort to additional background knowledge repositories or shared instances. We claim that semantic alignment is ultimately relative to the particular situation in which it is carried out, and that this situation should be made explicit and brought into the ontology matching mechanism.
Even two systems with identical conceptualisation capabilities, using exactly the same vocabulary to specify their respective conceptualisations, may fail to interoperate because of their differing perception of the domain. Imagine a situation in which two agents are facing each other across a checker board. Agent A1 may conceptualise a figure on the board as situated on the "left", while agent A2 may conceptualise the same figure as situated on the "right". Although the conceptualisation of "left" and "right" is done in exactly the same manner by both agents, they will still need to align their respective vocabularies if they want to successfully communicate to each other actions that move figures on the checker board. Their semantic alignment, however, will only be valid within the scope of their communication in this particular situation or environment. The same agents, situated differently, may arrive at a different alignment.

This scenario is reminiscent of those in which a group of distributed agents adapt to form an ontology and a shared lexicon in an emergent, bottom-up manner, with only local interactions and no central control authority [12]. Such self-organised emergence of shared meaning is ultimately grounded on the physical interaction of the agents with the environment. In this paper, however, we address the case in which agents are already endowed with a top-down engineered ontology (it can even be the same one), which they do not adapt or refine, but for which they want to find the semantic relationships with the separate ontologies of other agents on the grounds of their communication within a specific situation. In particular, we provide a formal model that formalises situated semantic alignment as a sequence of information-channel refinements in the sense of Barwise and Seligman's theory of information flow [1].
This theory is particularly useful for our endeavour because it models the flow of information occurring in distributed systems due to the particular situations, or tokens, that carry information.[1] Analogously, the semantic alignment that will allow information to flow will ultimately be carried by the particular situation the agents are acting in.

We shall therefore consider a scenario with two or more agents situated in an environment. Each agent will have its own viewpoint of the environment, so that, if the environment is in a concrete state, the agents may have different perceptions of this state. Because of these differences there may be a mismatch in the meaning of the syntactic entities by which the agents describe their perceptions (and which constitute the agents' respective ontologies). We claim that these syntactic entities can be related according to the intrinsic semantics provided by the existing relationship between the agents' viewpoints of the environment. The existence of this relationship is justified precisely by the fact that the agents are situated in, and observe, the same environment.

2 Semantic Alignment in an Environment

2.1 The Information Channel Linking Agent Perspectives

Let us consider two agents A1 and A2 situated in an environment E.[2] We associate a countable set S of states to E and, at any given instant, we suppose E to be in one of these states. We further assume that each agent is able to observe the environment and has its own perception of it. This ability is captured by a surjective function see_i : S → P_i, where i ∈ {1, 2}.

According to Channel Theory, information is only viable where there is a systematic way of classifying some range of things as being this way or that. This idea is captured by the notion of classification:

Definition 1 ([1]) A classification is a tuple A = ⟨tok(A), typ(A), ⊨_A⟩ where tok(A) is a set of tokens, typ(A) is a set of types, and ⊨_A is a binary relation between tok(A) and typ(A).
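To make the definition concrete, a classification can be encoded directly as a set of tokens, a set of types and an extensional satisfaction relation. The following Python sketch (class and function names are ours, purely illustrative) also builds the power classification of a set, which is used below for the environment:

```python
# A classification A = <tok(A), typ(A), |=_A> as a plain data structure;
# the relation |=_A is stored extensionally as a set of (token, type) pairs.

class Classification:
    def __init__(self, tokens, types, relation):
        self.tokens = frozenset(tokens)
        self.types = frozenset(types)
        self.relation = frozenset(relation)

    def sat(self, token, typ):
        """a |=_A alpha: token a is of type alpha."""
        return (token, typ) in self.relation


def power_classification(s):
    """The power classification of a finite set s: tokens are the elements
    of s, types are the subsets of s, and e |= eps iff e is a member of eps."""
    elems = list(s)
    subsets = [frozenset(e for i, e in enumerate(elems) if mask >> i & 1)
               for mask in range(1 << len(elems))]
    relation = {(e, eps) for eps in subsets for e in eps}
    return Classification(elems, subsets, relation)


E = power_classification({0, 1})
assert len(E.types) == 4            # {}, {0}, {1}, {0, 1}
assert E.sat(0, frozenset({0, 1}))
assert not E.sat(1, frozenset({0}))
```

For a finite state set, the environment classification introduced next is exactly `power_classification(S)` in this encoding.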
If a ⊨_A α, then a is said to be of type α. For each agent A_i, i ∈ {1, 2}, we consider a classification A_i that models A_i's viewpoint of E. First, tok(A_i) is composed of A_i's perceptions of E's states, that is, tok(A_i) = P_i. Second, typ(A_i) comprises the syntactic entities by which A_i describes its perceptions, the ones constituting A_i's ontology. Finally, ⊨_Ai records how A_i relates its perceptions to these syntactic entities.

Now, in order to associate the environment E with a classification E, we choose the power classification of S, which is the classification whose set of types is 2^S, whose tokens are the elements of S, and for which a token a is of type α if a ∈ α. The reason for taking the power classification is that, in general, there is no global conceptualisation of the environment, and hence no syntactic entities that may play the role of types for E. The set of types of the power classification, however, includes all possible token configurations potentially described by types. Thus tok(E) = S, typ(E) = 2^S, and e ⊨_E ε iff e ∈ ε.

[1] We do not assume any knowledge of Channel Theory; we restate its basic definitions and theorems in this paper, but a detailed exposition of the theory is outside our scope.
[2] The generalisation to any countable set of agents is straightforward.

In Channel Theory, the information flow between the components of a distributed system is modelled in terms of a channel. The relationships between the components are expressed via infomorphisms, which provide a way of moving information between them:

Definition 2 ([1]) An infomorphism f : A → B from A to B is a contravariant pair of functions f = ⟨fˆ, fˇ⟩, where fˆ : typ(A) → typ(B) and fˇ : tok(B) → tok(A), satisfying the following fundamental property:

fˇ(b) ⊨_A α  iff  b ⊨_B fˆ(α)

for each token b ∈ tok(B) and each type α ∈ typ(A).[3]

We omit the superscripts of infomorphisms if no confusion arises.
Types are usually denoted by Greek letters and tokens by Latin letters, so that f(α) ≡ fˆ(α) and f(a) ≡ fˇ(a).

Definition 3 ([1]) A channel 𝒞 consists of two infomorphisms {f_i : A_i → C}_{i∈{1,2}} with a common codomain C, called the core of 𝒞. The tokens of C are called connections, and a connection c is said to connect the tokens fˇ1(c) and fˇ2(c).[4]

typ(A1) --fˆ1--> typ(C) <--fˆ2-- typ(A2)
tok(A1) <--fˇ1-- tok(C) --fˇ2--> tok(A2)

The information flow of our scenario is accurately described by the channel 𝓔 = {f_i : A_i → E}_{i∈{1,2}} defined as follows:

• fˆi(α) = {e ∈ tok(E) | see_i(e) ⊨_Ai α} for each α ∈ typ(A_i),
• fˇi(e) = see_i(e) for each e ∈ tok(E),

where i ∈ {1, 2}. The fundamental property of the infomorphisms is indeed fulfilled:

fˇi(e) ⊨_Ai α  iff  see_i(e) ⊨_Ai α   {by definition of fˇi}
               iff  e ∈ fˆi(α)        {by definition of fˆi}
               iff  e ⊨_E fˆi(α)      {by definition of ⊨_E}

In this way, E is the core of the channel 𝓔, and a state e ∈ tok(E) connects the agents' perceptions fˇ1(e) and fˇ2(e):

A1 --f1--> E <--f2-- A2

Example. This example is taken from [2], and we will resort to it throughout the paper. There are two observers, Mr.1 and Mr.2, each having a partial viewpoint of a box. The box consists of six sectors, and each sector can enclose a ball. The box is "magic" because the observers are not able to distinguish the depth inside it. Figure 1 shows the scenario we are describing and illustrates schematically what Mr.1 and Mr.2 can observe.

[3] Strictly speaking, an infomorphism f comprises, on the one hand, a pair of classifications ⟨A, B⟩ and, on the other hand, a contravariant pair of functions ⟨fˆ, fˇ⟩ defined as above.
[4] Actually, this is the definition of a binary channel. A channel can be defined with an arbitrary index set, just as a distributed system can have an arbitrary number of components.
Figure 1: The magic box scenario (the box, Mr.1's viewpoint and Mr.2's viewpoint).

With respect to our approach, the magic box plays the role of E, and Mr.1 and Mr.2 are the agents. The states of E are the possible configurations of the box, so a state e can be represented as a 3 × 2 binary matrix (e_ij). In this way:

S = { (e_ij) | e_ij ∈ {0, 1}, 1 ≤ i ≤ 3, 1 ≤ j ≤ 2 }

Intuitively, the state

(1 0)
(1 0)
(0 0)

for instance, means that there are only two balls in the box, one in the upper-left sector and another in the middle-left sector.

Notice that Mr.1 sees a ball on the left if there is a ball in at least one of the sectors of the left column of the box, and sees a ball on the right provided that there is a ball in at least one of the sectors of the right column. Mr.1's perceptions of the states of E can therefore be represented as two-dimensional binary vectors, and the function see1 can be defined formally as follows, for e ∈ S:

see1(e) = (0 0)  if e_ij = 0 for all i, j
see1(e) = (1 0)  if e11 + e21 + e31 ≥ 1 and e12 + e22 + e32 = 0
see1(e) = (0 1)  if e11 + e21 + e31 = 0 and e12 + e22 + e32 ≥ 1
see1(e) = (1 1)  if e11 + e21 + e31 ≥ 1 and e12 + e22 + e32 ≥ 1

The function see2 can be defined analogously on three-dimensional binary vectors as representations of Mr.2's perceptions. Note that Mr.1 has the notions of a ball being on the left or on the right, and so has Mr.2, but the latter also has the notion of a ball being in the center.

After these considerations, A1 is defined by:

• tok(A1) = {(1 1), (1 0), (0 1), (0 0)}
• typ(A1) = {left, right}
• (v1 v2) ⊨_A1 left iff v1 = 1; (v1 v2) ⊨_A1 right iff v2 = 1

On the other hand, A2 is defined by:

• tok(A2) = {(1 1 1), (1 1 0), (1 0 1), (0 1 1), (1 0 0), (0 1 0), (0 0 1), (0 0 0)}
• typ(A2) = {left, center, right}
• (v1 v2 v3) ⊨_A2 left iff v1 = 1; (v1 v2 v3) ⊨_A2 center iff v2 = 1; (v1 v2 v3) ⊨_A2 right iff v3 = 1

Finally, the channel 𝓔 = {f_i : A_i → E}_{i∈{1,2}} defined above explains the information flow of this scenario.
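The magic box is small enough to encode exhaustively. In the Python sketch below (encodings are ours: a state is a tuple of three rows, and the row-to-direction mapping for Mr.2 is the one consistent with the worked examples later in the paper), `see1` and `see2` realise the perception functions, and `sat1`, `sat2` realise the relations ⊨_A1 and ⊨_A2:

```python
from itertools import product

# A state of the magic box is a 3x2 binary matrix, encoded as a tuple of rows.
S = list(product(product((0, 1), repeat=2), repeat=3))   # all 64 states

def see1(e):
    """Mr.1 reads off the columns: perception (v1 v2) = (left, right)."""
    return (int(any(row[0] for row in e)), int(any(row[1] for row in e)))

def see2(e):
    """Mr.2 reads off the rows: perception (v1 v2 v3) = (left, center, right).
    Row 3 faces Mr.2's left and row 1 his right (the mapping consistent
    with the paper's worked examples)."""
    return (int(any(e[2])), int(any(e[1])), int(any(e[0])))

# Satisfaction relations |=_A1 and |=_A2 on perceptions:
def sat1(v, t):
    return {"left": v[0], "right": v[1]}[t] == 1

def sat2(v, t):
    return {"left": v[0], "center": v[1], "right": v[2]}[t] == 1

# The state with balls in the upper-left and middle-left sectors:
e = ((1, 0), (1, 0), (0, 0))
assert see1(e) == (1, 0) and see2(e) == (0, 1, 1)
assert sat1(see1(e), "left") and not sat1(see1(e), "right")
```

The two assertions check the perceptions of the sample state drawn above against the definitions of see1 and see2.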
2.2 The Logic of the Semantic Alignment

In the previous subsection we have shown how the channel 𝓔 explains the information flow of our scenario. Now we want to relate the agents' syntactic entities, i.e., the agents' types, and obtain meaningful relationships justified by 𝓔. In Channel Theory, the sum operation gives us a way of putting the two components of a channel 𝒞 = {f_i : A_i → C}_{i∈{1,2}} together into a single classification A1 + A2 (see Definition 4), and the infomorphisms together into a single infomorphism f = f1 + f2 (see Proposition 1).

Definition 4 ([1]) Let A1 and A2 be two classifications. The sum of A1 and A2, denoted A1 + A2, is the classification with

tok(A1 + A2) = tok(A1) × tok(A2) = {⟨a, b⟩ | a ∈ tok(A1) and b ∈ tok(A2)},
typ(A1 + A2) = typ(A1) ⊎ typ(A2) = {⟨i, γ⟩ | i = 1 and γ ∈ typ(A1), or i = 2 and γ ∈ typ(A2)},

and the relation ⊨_{A1+A2} defined by:

⟨a, b⟩ ⊨_{A1+A2} ⟨1, α⟩ if a ⊨_A1 α
⟨a, b⟩ ⊨_{A1+A2} ⟨2, β⟩ if b ⊨_A2 β

There exist two infomorphisms σ1 : A1 → A1 + A2 and σ2 : A2 → A1 + A2 defined in a natural way: on types, σ1(α) = ⟨1, α⟩ and σ2(β) = ⟨2, β⟩; on tokens, σ1(⟨a, b⟩) = a and σ2(⟨a, b⟩) = b. It is straightforward to check that σ1 and σ2 are infomorphisms.

Proposition 1 ([1]) Given a channel 𝒞 = {f_i : A_i → C}_{i∈{1,2}}, there exists a unique infomorphism f = f1 + f2 : A1 + A2 → C such that f ∘ σ_i = f_i for i ∈ {1, 2}.[5]

PROOF: We just give the definition of f (see [1] for the rest of the proof): on types, f(⟨1, α⟩) = f1(α) and f(⟨2, β⟩) = f2(β); on tokens, f(c) = ⟨fˇ1(c), fˇ2(c)⟩. □

Note that typ(A1 + A2) is the disjoint union of typ(A1) and typ(A2). Taking the disjoint union matters because A1 and A2 may use some identical types to describe their perceptions of E (for instance, in the magic box example, left and right are both Mr.1's and Mr.2's types).
Therefore, A1 + A2 is the place where we should look for relationships between the agents' types. Moreover, as we will see, f = f1 + f2 is of use for finding meaningful relationships justified by 𝓔. Channel Theory provides a way to make these relationships explicit in a logical fashion by means of "theories" and "local logics":

Definition 5 ([1]) Given a set Σ, a sequent of Σ is a pair ⟨Γ, Δ⟩ of subsets of Σ. A binary relation ⊢ between subsets of Σ is called a consequence relation on Σ. A theory T is a pair ⟨Σ, ⊢⟩, where ⊢ is a consequence relation on Σ. A sequent ⟨Γ, Δ⟩ of Σ for which Γ ⊢ Δ is called a constraint of the theory T.

When it comes to writing constraints of a theory T = ⟨Σ, ⊢⟩, we use the usual notational conventions. For example, we write α ⊢ β, γ instead of {α} ⊢ {β, γ}, and Γ, Γ′ ⊢ Δ, α instead of Γ ∪ Γ′ ⊢ Δ ∪ {α}.

A theory is regular if it satisfies the properties of Identity, Weakening and Global Cut (see [1]). Regularity refers to the purely structural properties that any theory must satisfy. From here on, all the theories considered are regular.

Every classification generates a theory in a natural way:

Definition 6 ([1]) Let A be a classification. A token a ∈ tok(A) satisfies a sequent ⟨Γ, Δ⟩ of typ(A) provided that, if a is of every type in Γ, then it is of some type in Δ. The theory generated by A, written Th(A), is the theory ⟨typ(A), ⊢_A⟩, where Γ ⊢_A Δ if every token in A satisfies ⟨Γ, Δ⟩.

The notion of local logic puts a classification and a theory together, but with an important added twist: in order to model reasonable but unsound inferences, the notion of "normal token" is introduced.

[5] That is, f_i = f ∘ σ_i, i.e., fˆ_i = fˆ ∘ σ̂_i and fˇ_i = σ̌_i ∘ fˇ, for i ∈ {1, 2}.

Definition 7 ([1]) A local logic L is a tuple ⟨tok(L), typ(L), ⊨_L, ⊢_L, N_L⟩ where:

1. ⟨tok(L), typ(L), ⊨_L⟩ is a classification, denoted by Cla(L),
2. ⟨typ(L), ⊢_L⟩ is a regular theory, denoted by Th(L),
3.
N_L is a subset of tok(L), called the normal tokens of L, which satisfy all the constraints of Th(L).

Definition 8 ([1]) A local logic L is sound if every token in Cla(L) is normal, that is, if N_L = tok(L). L is complete if every sequent of typ(L) satisfied by every normal token is a constraint of Th(L).

Just like with theories, a classification generates a local logic in a natural way:

Definition 9 ([1]) Given a classification A, the local logic generated by A, written Log(A), is the local logic on A (i.e., Cla(Log(A)) = A) such that Th(Log(A)) = Th(A) and all of whose tokens are normal, i.e., N_Log(A) = tok(A).

The local logic generated by a classification is sound and complete. Furthermore, given a classification A and a local logic L on A, L is sound and complete if and only if L = Log(A) (see [1]).

The main notion of Channel Theory is that of "distributed logic":

Definition 10 ([1]) Let f : A → B be an infomorphism. Given a local logic L on B, the inverse image of L under f, denoted f⁻¹[L], is the local logic on A with Γ ⊢_{f⁻¹[L]} Δ if fˆ[Γ] ⊢_L fˆ[Δ], and N_{f⁻¹[L]} = {a ∈ tok(A) | a = fˇ(b) for some b ∈ N_L}.

Given a channel 𝒞 = {f_i : A_i → C}_{i∈{1,2}} and a local logic L on C, the distributed logic of 𝒞 generated by L, written DLog_𝒞(L), is the logic f⁻¹[L], where f = f1 + f2. DLog_𝒞(L) represents the reasoning about the relations among the components of the channel 𝒞 justified by the logic L on the core. Consider L = Log(C). We denote DLog_𝒞(Log(C)) by Log(𝒞). Log(𝒞) then captures in a logical fashion the information flow inherent in the channel.

Now, let us return to our scenario, modelled by the channel 𝓔. Log(𝓔) explains the relationships between the agents' viewpoints of the environment. On the one hand, Th(Log(𝓔)) contains the most meaningful relationships between the agents' types according to the channel 𝓔. On the other hand, N_Log(𝓔) contains those pairs of the agents' perceptions that are connected by the environment state they come from.
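Since the magic box has only 64 states, the constraints of Th(Log(𝓔)) and the normal tokens of Log(𝓔) can be computed by brute force straight from Definitions 6 and 10. A self-contained Python sketch (encodings and names are ours), which mechanically reproduces the constraints and the normal-token conditions derived by hand in the example that follows:

```python
from itertools import product

# Brute-force check of the distributed logic for the magic box.
S = list(product(product((0, 1), repeat=2), repeat=3))   # the 64 states
see1 = lambda e: (int(any(r[0] for r in e)), int(any(r[1] for r in e)))
see2 = lambda e: (int(any(e[2])), int(any(e[1])), int(any(e[0])))
TYPES1, TYPES2 = ("left", "right"), ("left", "center", "right")

def holds(e, t):
    """e |=_E f^_i(alpha), for a summed type t = <i, alpha>."""
    i, alpha = t
    v, types = (see1(e), TYPES1) if i == 1 else (see2(e), TYPES2)
    return v[types.index(alpha)] == 1

def constraint(gamma, delta):
    """Gamma |- Delta in Th(Log(E)): every state satisfying all of
    f^[Gamma] satisfies some member of f^[Delta] (Definitions 6 and 10)."""
    return all(any(holds(e, d) for d in delta)
               for e in S if all(holds(e, g) for g in gamma))

assert constraint([(1, "left")], [(2, "left"), (2, "center"), (2, "right")])
assert not constraint([(1, "left")], [(2, "left")])  # e.g. one ball at e11

# Normal tokens: the pairs of perceptions connected by some state.
normal = {(see1(e), see2(e)) for e in S}
tok1, tok2 = set(product((0, 1), repeat=2)), set(product((0, 1), repeat=3))
assert normal == {(a1, a2) for a1 in tok1 for a2 in tok2
                  if (a1 != (0, 0)) == (a2 != (0, 0, 0))}
```

The last assertion confirms that a pair ⟨a1, a2⟩ is normal exactly when a1 and a2 agree on whether the box is empty.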
Log(𝓔) is complete, since Log(E) is complete; but it is not necessarily sound, because although Log(E) is sound, f = f1 + f2 is not token surjective in general (see [1]):

N_Log(𝓔) = {⟨fˇ1(e), fˇ2(e)⟩ | e ∈ tok(E)} ⊆ tok(A1 + A2)

If Log(𝓔) were also sound, then Log(𝓔) = Log(A1 + A2), as noted above. This would mean that there is no meaningful compatibility between the agents' viewpoints of the environment according to 𝓔. It is precisely the fact that Log(𝓔) is unsound that allows for a meaningful compatibility between the agents' viewpoints. This compatibility relation is expressed at the type level, in terms of constraints, by Th(Log(𝓔)), and at the token level, in terms of pairs of tokens, by N_Log(𝓔).

Example. Following the previous example, Log(𝓔) includes among its constraints:

⟨1, left⟩ ⊢_Log(𝓔) ⟨2, left⟩, ⟨2, center⟩, ⟨2, right⟩
⟨1, right⟩ ⊢_Log(𝓔) ⟨2, left⟩, ⟨2, center⟩, ⟨2, right⟩

Let us check the first constraint (the second one can be verified in a similar way). We have to prove:

fˆ(⟨1, left⟩) ⊢_E fˆ(⟨2, left⟩), fˆ(⟨2, center⟩), fˆ(⟨2, right⟩)

Let e = (e_ij) ∈ tok(E) and assume that e ⊨_E fˆ(⟨1, left⟩). Then e ∈ fˆ(⟨1, left⟩), that is, see1(e) ⊨_A1 left. Therefore e11 + e21 + e31 ≥ 1, so there exists i ∈ {1, 2, 3} such that e_i1 = 1, and hence e_i1 + e_i2 ≥ 1. If i = 3 then e ⊨_E fˆ(⟨2, left⟩); if i = 2 then e ⊨_E fˆ(⟨2, center⟩); and if i = 1 then e ⊨_E fˆ(⟨2, right⟩).

The necessary and sufficient conditions for a token ⟨a1, a2⟩ of A1 + A2 to be normal are the following:

if a1 ≠ (0 0) then a2 ≠ (0 0 0)
if a2 ≠ (0 0 0) then a1 ≠ (0 0)

3 Aligning Agent Perspectives through Communication

We have stated that Th(Log(𝓔)) contains the most meaningful relations between the agents' types. The problem is that neither agent can make use of this theory, because neither of them knows the channel 𝓔 completely. We present a method by which the agents obtain approximations to Th(Log(𝓔)).
These approximations gradually become more reliable as the method is applied. A1 and A2 communicate by exchanging information about their perceptions of the environment states. This information is expressed in terms of their own classification relations. Specifically, if E is in a concrete state e, we assume that the agents can convey to each other which types are satisfied by their respective perceptions of e and which are not. This exchange generates a channel 𝒞 = {f_i : A_i → C}_{i∈{1,2}}. Furthermore, Th(C) provides the logical relationships between the agents' types justified by the fact that the agents have observed e.

If E turns to another state e′ and the agents proceed as before, another channel 𝒞′ = {f′_i : A_i → C′}_{i∈{1,2}} gives account of the new situation. The significant point is that 𝒞′ refines 𝒞 (see Definition 11), and Theorem 1 ensures that the refined channel involves more reliable information. The communication ends when the agents have observed all the environment states. Again, this situation can be modelled by a channel, call it 𝒞* = {f*_i : A_i → C*}_{i∈{1,2}}. Theorem 2 states that Th(C*) = Th(Log(𝓔)).

Definition 11 ([1]) Let 𝒞 = {f_i : A_i → C}_{i∈{1,2}} and 𝒞′ = {f′_i : A_i → C′}_{i∈{1,2}} be two information channels with the same component classifications A1 and A2. A refinement infomorphism r from 𝒞′ to 𝒞 is an infomorphism r : C′ → C such that, for each i ∈ {1, 2}, f_i = r ∘ f′_i (i.e., fˆ_i = r̂ ∘ fˆ′_i and fˇ_i = fˇ′_i ∘ ř). The channel 𝒞′ is a refinement of 𝒞 if there exists a refinement infomorphism r from 𝒞′ to 𝒞.

Theorem 1 asserts that the more refined channel gives more reliable information: although its theory has fewer constraints, it has more normal tokens to which they apply.

Theorem 1 Let 𝒞 = {f_i : A_i → C}_{i∈{1,2}} and 𝒞′ = {f′_i : A_i → C′}_{i∈{1,2}} be two channels. If 𝒞′ is a refinement of 𝒞, then:

1. Th(Log(𝒞′)) ⊆ Th(Log(𝒞))
2. N_Log(𝒞′) ⊇ N_Log(𝒞)

PROOF: Since 𝒞′ is a refinement of 𝒞, there exists a refinement infomorphism r from 𝒞′ to 𝒞.
Let us write A = A1 + A2, f = f1 + f2 and f′ = f′1 + f′2.

1. Let Γ and Δ be subsets of typ(A) and assume that Γ ⊢_Log(𝒞′) Δ, which means fˆ′[Γ] ⊢_C′ fˆ′[Δ]. We have to prove Γ ⊢_Log(𝒞) Δ, or equivalently, fˆ[Γ] ⊢_C fˆ[Δ]. We proceed by reductio ad absurdum. Suppose c ∈ tok(C) does not satisfy the sequent ⟨fˆ[Γ], fˆ[Δ]⟩. Then c ⊨_C fˆ(γ) for all γ ∈ Γ and c ⊭_C fˆ(δ) for all δ ∈ Δ. Choose an arbitrary γ ∈ Γ. We have γ = ⟨i, α⟩ for some α ∈ typ(A_i) and i ∈ {1, 2}. Thus fˆ(γ) = fˆ(⟨i, α⟩) = fˆ_i(α) = r̂ ∘ fˆ′_i(α) = r̂(fˆ′_i(α)). Therefore:

c ⊨_C fˆ(γ) iff c ⊨_C r̂(fˆ′_i(α))
            iff ř(c) ⊨_C′ fˆ′_i(α)
            iff ř(c) ⊨_C′ fˆ′(⟨i, α⟩)
            iff ř(c) ⊨_C′ fˆ′(γ)

Consequently, ř(c) ⊨_C′ fˆ′(γ) for all γ ∈ Γ. Since fˆ′[Γ] ⊢_C′ fˆ′[Δ], there exists δ* ∈ Δ such that ř(c) ⊨_C′ fˆ′(δ*). A sequence of equivalences similar to the one above justifies c ⊨_C fˆ(δ*), contradicting that c is a counterexample to ⟨fˆ[Γ], fˆ[Δ]⟩. Hence Γ ⊢_Log(𝒞) Δ, as we wanted to prove.

2. Let ⟨a1, a2⟩ ∈ tok(A) and assume ⟨a1, a2⟩ ∈ N_Log(𝒞). Then there exists a token c of C such that ⟨a1, a2⟩ = fˇ(c). We have a_i = fˇ_i(c) = fˇ′_i ∘ ř(c) = fˇ′_i(ř(c)), for i ∈ {1, 2}. Hence ⟨a1, a2⟩ = fˇ′(ř(c)) and ⟨a1, a2⟩ ∈ N_Log(𝒞′). Consequently, N_Log(𝒞′) ⊇ N_Log(𝒞), which concludes the proof. □

Refining the Semantic Alignment

We assume that typ(A_i) is finite for i ∈ {1, 2} and that S is countably infinite, though the finite case can be treated in a similar form. We also choose a countably infinite set of symbols {c^n | n ∈ N}.[6]

Agent communication starts with the observation of E. Let us suppose that E is in state e^1 ∈ S = tok(E). A1's perception of e^1 is fˇ1(e^1) and A2's perception of e^1 is fˇ2(e^1). We take for granted that A1 can communicate to A2 those properties that are and are not satisfied by fˇ1(e^1) according to its classification A1, and A2 can do likewise. Since both typ(A1) and typ(A2) are finite, this process eventually finishes.
After this communication, a channel 𝒞^1 = {f^1_i : A_i → C^1}_{i∈{1,2}} arises. On the one hand, C^1 is defined by: tok(C^1) = {c^1}, typ(C^1) = typ(A1 + A2), and c^1 ⊨_C1 ⟨i, α⟩ if fˇ_i(e^1) ⊨_Ai α, for α ∈ typ(A_i) and i ∈ {1, 2}. On the other hand, fˆ^1_i(α) = ⟨i, α⟩ and fˇ^1_i(c^1) = fˇ_i(e^1), for α ∈ typ(A_i) and i ∈ {1, 2}. Clearly, f^1_1 and f^1_2 are infomorphisms. Note that in this case both agents know C^1; hence they can separately compute Th(C^1) = ⟨typ(C^1), ⊢_C1⟩. This theory represents the logical relationships between the agents' types justified by the fact that the agents have observed e^1.

Example. Assume that the environment is in the state e^1 given by the matrix

(1 0)
(1 0)
(0 0)

A1's perception of e^1 is (1 0), and A2's perception of e^1 is (0 1 1). Th(C^1) includes among its constraints:

⟨1, left⟩ ⊢_C1 ⟨2, center⟩   and   ⟨1, left⟩ ⊢_C1 ⟨2, right⟩

Now, let us assume that E turns to a new state e^2. The agents can proceed as above, this time exchanging information about their perceptions of e^2. Another channel 𝒞^2 = {f^2_i : A_i → C^2}_{i∈{1,2}} comes up. First of all, C^2 is defined by: tok(C^2) = {c^1, c^2}, typ(C^2) = typ(A1 + A2), and for 1 ≤ k ≤ 2, c^k ⊨_C2 ⟨i, α⟩ if fˇ_i(e^k) ⊨_Ai α, for all α ∈ typ(A_i) and i ∈ {1, 2}. Second, fˆ^2_i(α) = ⟨i, α⟩ and fˇ^2_i(c^k) = fˇ_i(e^k), again for 1 ≤ k ≤ 2, α ∈ typ(A_i) and i ∈ {1, 2}. f^2_1 and f^2_2 are infomorphisms by definition. The theory Th(C^2) = ⟨typ(C^2), ⊢_C2⟩ represents the logical relationships between the agents' types justified by the fact that the agents have observed e^1 and e^2.

The significant point is that 𝒞^2 is a refinement of 𝒞^1: it is easy to prove that f^1, defined as the identity function on types and the inclusion function on tokens, is a refinement infomorphism. By Theorem 1, the constraints of C^2 are more reliable than those of C^1.
[6] We write these symbols with superscripts because we reserve subscripts for what concerns the agents.

Example. Let us assume that the environment changes to the state e^2 given by the matrix

(0 0)
(0 1)
(1 0)

Then A1's perception of e^2 is (1 1) and A2's perception of e^2 is (1 1 0). In this case:

⟨1, left⟩ ⊢_C2 ⟨2, center⟩   but   ⟨1, left⟩ ⊬_C2 ⟨2, right⟩

In the general situation, once the states e^1, e^2, ..., e^n have been observed and a new state e^{n+1} appears, the channel 𝒞^{n+1} = {f^{n+1}_i : A_i → C^{n+1}}_{i∈{1,2}} informs about the agents' communication up to this moment. First, tok(C^{n+1}) = {c^1, c^2, ..., c^{n+1}}, typ(C^{n+1}) = typ(A1 + A2), and for each 1 ≤ k ≤ n + 1, c^k ⊨_{C^{n+1}} ⟨i, α⟩ if fˇ_i(e^k) ⊨_Ai α, for all α ∈ typ(A_i) and i ∈ {1, 2}. Second, fˆ^{n+1}_i(α) = ⟨i, α⟩ and fˇ^{n+1}_i(c^k) = fˇ_i(e^k) for 1 ≤ k ≤ n + 1, α ∈ typ(A_i) and i ∈ {1, 2}. Clearly, f^{n+1}_1 and f^{n+1}_2 are infomorphisms. The theory Th(C^{n+1}) = ⟨typ(C^{n+1}), ⊢_{C^{n+1}}⟩ represents the logical relationships between the agents' types justified by the fact that the agents have observed e^1, ..., e^n, e^{n+1}. Moreover, 𝒞^{n+1} is a refinement of 𝒞^n: f^n, defined as the identity function on types and the inclusion function on tokens, is a refinement infomorphism. By Theorem 1, the constraints of C^{n+1} are more reliable than those of C^n.

Remember that we have assumed S to be countably infinite, so it is impractical to let communication finish only once all the environment states have been observed. The family of channels {𝒞^n}_{n∈N} informs of all the communication stages, and it is up to the agents to decide when to stop communicating, namely once a good enough approximation has been reached for the purposes of their respective tasks. The study of possible termination criteria, however, is outside the scope of this paper and left for future work.
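Because the magic box has a finite state set, the whole progression, including the limiting channel 𝒞*, can be simulated. In the following self-contained Python sketch (encodings are ours, and the computed theories are restricted to singleton sequents for brevity), the theory of the n-th channel is read off the perception pairs of the states observed so far; constraints are only ever lost as observations accumulate, as Theorem 1 predicts, and the constraints surviving all 64 observations are exactly those justified by E:

```python
from itertools import product

# Simulating the communication stages for the magic box.
S = list(product(product((0, 1), repeat=2), repeat=3))   # the 64 states
see1 = lambda e: (int(any(r[0] for r in e)), int(any(r[1] for r in e)))
see2 = lambda e: (int(any(e[2])), int(any(e[1])), int(any(e[0])))
TYPES = [(1, "left"), (1, "right"), (2, "left"), (2, "center"), (2, "right")]

def holds(pair, t):
    """A connection <see1(e), see2(e)> satisfies the summed type t."""
    (v1, v2), (i, alpha) = pair, t
    v = {1: dict(zip(("left", "right"), v1)),
         2: dict(zip(("left", "center", "right"), v2))}[i]
    return v[alpha] == 1

def theory(observed):
    """Singleton constraints g |- d satisfied by every token of the
    channel built from the observed states (Definition 6)."""
    toks = [(see1(e), see2(e)) for e in observed]
    return {(g, d) for g in TYPES for d in TYPES
            if all(holds(c, d) for c in toks if holds(c, g))}

e1 = ((1, 0), (1, 0), (0, 0))        # perceptions (1 0) and (0 1 1)
e2 = ((0, 0), (0, 1), (1, 0))        # perceptions (1 1) and (1 1 0)
th1, th2, th_star = theory([e1]), theory([e1, e2]), theory(S)

assert ((1, "left"), (2, "right")) in th1       # believed after seeing e1 ...
assert ((1, "left"), (2, "right")) not in th2   # ... and refuted by e2
assert ((1, "left"), (2, "center")) in th2      # still believed here ...
assert ((1, "left"), (2, "center")) not in th_star  # ... refuted eventually
assert th_star <= th2 <= th1                    # Theorem 1: refinement
```

The final assertion is the singleton-sequent shadow of Theorem 1, and `th_star` is the corresponding fragment of the theory of the limiting channel.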
From a theoretical point of view, however, we can consider the channel 𝒞* = {f*_i : A_i → C*}_{i∈{1,2}}, which informs of the end of the communication after all the environment states have been observed. C* is defined as follows: tok(C*) = {c^n | n ∈ N}, typ(C*) = typ(A1 + A2), and, for n ∈ N, c^n ⊨_C* ⟨i, α⟩ if fˇ_i(e^n) ⊨_Ai α, for all α ∈ typ(A_i) and i ∈ {1, 2}. In addition, fˆ*_i(α) = ⟨i, α⟩ and fˇ*_i(c^n) = fˇ_i(e^n) for n ∈ N, α ∈ typ(A_i) and i ∈ {1, 2}. By definition, f*_1 and f*_2 are infomorphisms.

Theorem 2 below is the cornerstone of the theory presented in this paper. In short, it ensures that at each communication stage the agents obtain a theory that approximates the most meaningful relationships between their types:

Theorem 2 The following statements hold:

1. For all n ∈ N, 𝒞* is a refinement of 𝒞^n.
2. Th(Log(𝓔)) = Th(C*).

PROOF:

1. It is easy to prove that, for each n ∈ N, g^n, defined as the identity function on types and the inclusion function on tokens, is a refinement infomorphism from 𝒞* to 𝒞^n.

2. It follows directly from:

c^n ⊨_C* ⟨i, α⟩ iff fˇ_i(e^n) ⊨_Ai α   {by definition of ⊨_C*}
                iff e^n ⊨_E fˆ_i(α)     {because f_i is an infomorphism}
                iff e^n ⊨_E fˆ(⟨i, α⟩)  {by definition of fˆ} □

4 Conclusions

In this paper we have presented a formal model of semantic alignment as a sequence of information-channel refinements that are relative to the particular states of the environment in which two agents communicate and align their respective conceptualisations of these states. Before us, Kent [6] and Kalfoglou and Schorlemmer [4, 10] have applied Channel Theory to formalise semantic alignment, also using Barwise and Seligman's insight to focus on tokens as the enablers of information flow.
Their approach to semantic alignment, however, like most ontology matching mechanisms developed to date (regardless of whether they follow a functional, design-time approach or an interaction-based, run-time approach), still defines semantic alignment in terms of a priori design decisions such as the concept taxonomy of the ontologies or the external sources brought into the alignment process. Instead, the model we have presented in this paper makes explicit the particular states of the environment in which agents are situated while attempting to gradually align their ontological entities. In the future we plan to further refine the model presented here (e.g., to include pragmatic issues such as termination criteria for the alignment process) and to devise concrete ontology negotiation protocols based on this model that agents may be able to enact.

Acknowledgements

This work is supported under the UPIC project, sponsored by Spain's Ministry of Education and Science under grant number TIN2004-07461-C02-02, and under the OpenKnowledge Specific Targeted Research Project (STREP), sponsored by the European Commission under contract number FP6-027253. M. Schorlemmer is also supported by a Ramón y Cajal Research Fellowship from Spain's Ministry of Education and Science, partially funded by the European Social Fund.

References

[1] J. Barwise and J. Seligman. Information Flow: The Logic of Distributed Systems. Cambridge University Press, 1997.
[2] C. Ghidini and F. Giunchiglia. Local models semantics, or contextual reasoning = locality + compatibility. Artificial Intelligence, 127(2):221–259, 2001.
[3] F. Giunchiglia and P. Shvaiko. Semantic matching. The Knowledge Engineering Review, 18(3):265–280, 2004.
[4] Y. Kalfoglou and M. Schorlemmer. IF-Map: An ontology-mapping method based on information-flow theory. In Journal on Data Semantics I, LNCS 2800, 2003.
[5] Y. Kalfoglou and M. Schorlemmer. Ontology mapping: The state of the art.
The Knowledge Engineering Review, 18(1):1–31, 2003.
[6] R. E. Kent. Semantic integration in the Information Flow Framework. In Semantic Interoperability and Integration, Dagstuhl Seminar Proceedings 04391, 2005.
[7] D. Lenat. CyC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11), 1995.
[8] V. López, M. Sabou, and E. Motta. PowerMap: Mapping the real semantic web on the fly. Accepted for presentation at ISWC'06, 2006.
[9] F. McNeill. Dynamic Ontology Refinement. PhD thesis, School of Informatics, The University of Edinburgh, 2006.
[10] M. Schorlemmer and Y. Kalfoglou. Progressive ontology alignment for meaning coordination: An information-theoretic foundation. In 4th Int. Joint Conf. on Autonomous Agents and Multiagent Systems, 2005.
[11] P. Shvaiko and J. Euzenat. A survey of schema-based matching approaches. In Journal on Data Semantics IV, LNCS 3730, 2005.
[12] L. Steels. The origins of ontologies and communication conventions in multi-agent systems. Journal of Autonomous Agents and Multi-Agent Systems, 1(2):169–194, 1998.
[13] J. van Diggelen et al. ANEMONE: An effective minimal ontology negotiation environment. In 5th Int. Joint Conf. on Autonomous Agents and Multiagent Systems, 2006.