=Paper= {{Paper |id=None |storemode=property |title=Looking for Analogical Proportions in a Formal Concept Analysis Setting |pdfUrl=https://ceur-ws.org/Vol-959/paper20.pdf |volume=Vol-959 |dblpUrl=https://dblp.org/rec/conf/cla/MicletPG11 }} ==Looking for Analogical Proportions in a Formal Concept Analysis Setting== https://ceur-ws.org/Vol-959/paper20.pdf
             Looking for analogical proportions
            in a formal concept analysis setting

              Laurent Miclet1 , Henri Prade2 , and David Guennec1
1
    IRISA-ENSSAT, Lannion, France, miclet@enssat.fr, david.guennec@gmail.com,
         2
           IRIT, Université Paul Sabatier, Toulouse, France, prade@irit.fr



       Abstract. Categorization and analogical reasoning are two important
       cognitive processes, for which there exist formal counterparts (at least
       they may be regarded as such): namely, formal concept analysis on the
       one hand, and analogical proportions (modeled in propositional logic)
       on the other hand. This is a first attempt aiming at relating these two
       settings. The paper presents an algorithm that takes advantage of the
       lattice structure of the set of formal concepts for searching for analogical
       proportions that may hold in a formal context. Moreover, properties
       linking analogical proportions and formal concepts are laid bare.


1     Introduction
Categorization and analogical reasoning play important roles in cognitive pro-
cesses. They both heavily rely on the ideas of similarity and dissimilarity. Items
belonging to the same category should be similar, while they are dissimilar with
respect to items belonging to other categories. Analogical proportions, which
are statements of the form ‘a is to b as c is to d’, express the similarity of the
relations linking a and b with the relations linking c and d (note that however
a and b may be somewhat dissimilar (as well as c and d). In a Boolean setting,
where items are described in terms of binary attributes, similarity amounts to
the identity of properties, while dissimilarity refers to the presence of properties
for an item which are absent in the other considered item.
    Among formal approaches aiming at categorizing items, Formal Concept
Analysis (FCA) provides a way for characterizing concepts both extensionally in
terms of the objects that they cover and intensionally in terms of the properties
that these objects share. FCA is known as a lattice-theoretic framework devised
for knowledge extraction from Boolean data tables called formal contexts that
relate objects and properties. Introduced under this name by Wille [13], FCA
has been developed by Ganter and Wille [7] and their followers for thirty years.
    Besides, there has been a renewal of interest for analogical proportions in the
last decade, firstly in relation with computational linguistic concerns. Set-based,
algebraic and logical models have been proposed [8, 12, 1, 9]. In the following,
we more particularly use the Boolean view [9] of analogical proportions that is
directly relevant for application to formal contexts. Then, it makes sense to look
for analogical proportions in Boolean contexts, and to try to understand what
formal concepts and analogical proportions may have in common.
    The paper is organized as follows. We first provide a short background on
analogical proportions in Section 2. Then in Section 3, after a brief reminder of
basic definitions in FCA, we present an efficient algorithm able to discover ana-
logical proportions in a formal context by using the lattice of formal concepts. In
Section 4, we further investigate the theoretical relations between FCA and ana-
logical proportions, by showing how formal concepts are involved in analogical
proportions, before indicating lines for further research and concluding.


2   Analogical proportions

An analogical proportion ‘a is to b as c is to d’, usually denoted a : b :: c : d,
expresses that the way a and b differ is the same as the way c and ddiffer [9].
This leads to the following definitions, here stated for three closely related kinds
of items: subsets of a finite set, Boolean truth values, and objects defined by
Boolean properties (also called binary attributes).


Analogical proportion between sets First, let us consider four sets A, B, C
and D, all subsets of some set X. The dissimilarity between A and B is evaluated
by A ∩ B and by A ∩ B, where A denotes the complement of A in X, while the
similarity corresponds to A ∩ B and A ∩ B. Viewing an analogical proportion as
expressing that the differences between A and B and between C and D are the
same, we get the following definition [9]:

Definition 1 Four subsets A, B, C and D of a finite set X are in analogical
proportion in this order when A ∩ B = C ∩ D and A ∩ B = C ∩ D.


Analogical proportion between Boolean objects This expression has an
immediate logical counterpart when a, b, c, and d now denote Boolean variables:

                  ((a ∧ ¬b) ≡ (c ∧ ¬d)) ∧ ((¬a ∧ b) ≡ (¬c ∧ d))

This formula is true for the 6 truth value assignments of a, b, c, d appearing in
Table 1, and is false for the 24 − 6 = 10 remaining possible assignments.
     It can be checked that the above definitions of an analogical proportion sat-
isfies the following characteristic postulates [8]:

 – a : b :: a : b (identity)
 – a : b :: c : d =⇒ c : d :: a : b (global symmetry)
 – a : b :: c : d =⇒ a : c :: b : d (central permutation)
 – a : b :: c : d and ¬(b : a :: c : d) are consistent (local dissymmetry)
                                      a ×        ×    ×
                                      b ×    ×        ×
                                      c ×        ××
                                      d ×    ×    ×
Table 1. The six Boolean 4-tuples that are in analogical proportion. The Boolean
truth-values True and False are written as a cross and a blank.



Objects defined by Boolean properties Let us suppose that the objects
(or items) a, b, c, d are described by sets of binary properties belonging to a
set P rop. Then, each item can be viewed as a subset of P rop, made of the
attributes that hold true on this item3 . Then, we can apply Definition 1, namely
a ∩ b = c ∩ d and a ∩ b = c ∩ d. Another way of seeing this analogical proportion
is given by the equivalent definition:
Definition 2 Four objects (a, b, c, d) defined by binary properties are in analogi-
cal proportion iff the truth-values of each property on these objects make a 4-tuple
of binary values corresponding to one of the six 4-tuples displayed in Table 1.

Analogical dissimilarity We now introduce the concept of analogical dissim-
ilarity (AD) between four objects defined by binary attributes. It is simply the
sum on all the attributes of the analogical dissimilarity per attribute. The latter
is defined according to the following table:
                      a                   ××××××××
                      b           ××××            ××××
                      c       ××      ××      ××      ××
                      d     × × × × × × × ×
                     AD = 0 1 1 0 1 0 2 1 1 2 0 1 0 1 1 0
    AD per attribute is merely the minimal number of bit(s) that has/have to be
flipped in order to turn the four bits into an analogical proportion, according to
Table 1. Notice that any 4-tuple with a zero AD is an analogical proportion, and
vice-versa. For example, the four first objects of the formal context that we will
later call BASE lm (see Figure 2) are such that AD(leech, bream, frog, dog) =
0 + 1 + 0 + 0 + 0 + 0 + 0 + 1 + 1 = 3.

3      Searching for analogical proportions in formal concepts
This section is devoted to the following problem: given a formal context with
n objects and d properties (or attributes), is it possible to discover 4-tuples
of objects in analogical proportion, without running an O(n4 · d) algorithm?
We give in this section an heuristic algorithm which uses the lattice of formal
concepts, and has shown experimentally its efficiency for discovering analogical
proportions. We start with a brief reminder on FCA.
3
    For an object x, this subset is called R↑(x) in Formal Concept Analysis, see Sect. 3.1.
3.1   Formal concept analysis (FCA)

FCA starts with a binary relation R, called formal context, defined between a
set Obj of objects and a set P rop of Boolean properties. The notation (x, y) ∈ R
means that object x has property y. R↑ (x) = {y ∈ P rop|(x, y) ∈ R} is the set
of properties of object x. Similarly, R↓ (y) = {x ∈ Obj|(x, y) ∈ R} is the set of
objects having property y.
    Given a set Y of properties, one can define the set of objects [7]: R↓ (Y ) =
{x ∈ Obj|R↑ (x) ⊇ Y }.
    This is the set of objects sharing all properties in Y (and having maybe some
others). Then a formal concept is defined as a pair made of its extension X and
its intension Y such that

                           R↓ (Y ) = X and R↑ (X) = Y,

where (X, Y ) ⊆ Obj×P rop, and R↑ (X) is similarly defined as {y ∈ P rop|R↓ (y) ⊇
X}. It can be also shown that formal concepts are maximal pairs (X, Y ) (in the
sense of inclusion) such that X × Y ⊆ R.
   Moreover, the set of all formal concepts is equipped with a partial order (de-
noted ) defined as: (X1 , Y1 )  (X2 , Y2 ) iff X1 ⊆ X2 (or, equivalently, Y2 ⊆ Y1 ),
and forms a complete lattice, called the concept lattice of R.

    Let us consider an example where R is a relation that defines links be-
tween eight objects Obj = {1, 2, 3, 4, 5, 6, 7, 8} and nine properties P rop =
{a, b, c, d, e, f, g, h, i}. There is a “×” in the cell corresponding to an object x
and to a property y if the object x has property y, in other words the “×”s
describe the relation R (or context). An empty cell corresponds to the fact that
(x, y) 6∈ R, i.e., it is known that object x has not property y. The relation R
in the example is given in Figure 1. There are 5 formal concepts. For instance,
consider X = {a, b, c, d, e}. Then R↑∆ (X) = {7, 8} ; likewise if Y = {7, 8}. Then
R↓∆ (Y ) = {a, b, c, d, e}.


                                 a b c d e f g h i
                               1             ×
                               2             ×
                               3             ×××
                               4             ×××
                               5     ××××
                               6     ××××
                               7×××××
                               8×××××


         Fig. 1. R2 : a relation with 5 formal concepts (and 2 sub-contexts)
3.2   Organizing the search

The basic idea of our algorithm is to start from some 4-tuple of objects, to
observe the attributes that contribute to make AD non-zero for this 4-tuple,
and to replace one of the four objects by another object. Then we iterate the
process. Two important features have been added to avoid a random walk in the
space of the 4-tuples.

 1. The replacement of one object by another is done according to the obser-
    vation of the lattice of concepts. The idea is to try to decrease the value of
    AD. This point will be explained in the next section.
 2. All 4-tuples of objects that are created are stored in a list, ordered by in-
    creasing value of AD. The next 4-tuple to be chosen is the first in the list.

This algorithm can therefore be seen as an optimization procedure, more pre-
cisely as a best-first version of the GRAPHSEARCH algorithm ([11]). The or-
dered list of 4-tuples is an Open list in this interpretation.


3.3   Decreasing the analogical dissimilarity

Let us come now to the heart of the algorithm, namely the replacement of one
object by another. Can we find some information in the lattice that leads us to
choose both an object in the 4-tuple and another object to replace it ? Remember
that we are looking for a replacement that makes the AD decrease.
    Now, let us consider the following situation, taken from BASE lm (see Figure
2). Suppose that we are studying the 4-tuple of objects (3, 4, 9, 12), with AD = 1.
Attribute c is the only one to contribute in the AD of this 4-tuple. Actually, c
has the values (1, 1, 0, 1) on the 4-tuple (3, 4, 9, 12). We notice now that there
are two interesting concepts in the lattice with respect to c, the first one being
({b, c, h, g, a} , {3}) and the second being ({b, h, g, a} , {2, 3}), which are directly
connected. What can we deduce from this pair of connected concepts ?

 – Attribute c has value 1 on object 3.
 – Attribute c has value 0 on object 2.
 – Attribute c is the only attribute to switch from 1 to 0 to transform concept
   ({b, c, h, g, a} , {3}) into concept ({b, h, g, a} , {2, 3}).

We can conclude from this evidence that replacing object 3 by object 2 will
decrease by 1 the value of AD, since c will take values (0, 1, 0, 1) on the 4-tuple
(2, 4, 9, 12), and therefore that (2, 4, 9, 12) is an analogical proportion.
    Unfortunately this argument does not lead to a greedy algorithm, since there
no insurance, given a 4-tuple, that there exists a couple of concepts having
the three above properties. Most of the time, actually, there is more than one
attribute switching to 1 between two connected concepts, and only one insuring
the decreasing of AD.
    Let us take another example from the same data base. The 4-tuple (5, 4, 9, 12)
has a AD of 4, because of the attributes d, f , g and h. Two interesting connected
concepts are ({c, f, d, e, i, a} , {12}) and ({c, d, e, a} , {7, 12}), since f switching
from 1 to 0 will decrease the AD (see the table below). But replacing 12 by 7
in the 4-tuple (5, 4, 9, 12) will switch not only f but also the attribute i, and
we don’t know what will happen when switching i: it may decrease as well as
increase the AD. Actually, in this case, it increases the AD. Hence, the 4-tuple
(5, 4, 9, 7) has the same AD of 4.

         a b c d e f g h i                        a b c d e f g h i
       5×× × ×                                   5×× × ×
       4× ×          ×××                         4× ×         ×××
       9×× ×××                                   9×× ×××
      12 × × × × ×       ×                       7× ×××
      AD = 4                                     AD = 4

   By generalizing these examples, we propose now an heuristic to try to de-
crease the AD that we call h-Doap.
      Heuristic h-Doap. Let a couple of connected concepts be (A ∪ B, Z)
      and (B, Z ∪ Y ), A and B being subsets of attributes and Y and Z being
      subsets of objects such that A ∩ B = ∅ and Y ∩ Z = ∅. If there is a
      4-tuple with one of its four objects in Z and if there is an attribute in B
      that decreases the AD of this 4-tuple when switching from 1 to 0, then
      create a new 4-tuple by replacing the object in Z by an object in Y .

Next section shows how this heuristic can be used to discover 4-tuples of objects
with null AD in a formal context, i.e. analogical proportions of objects.


3.4     Algorithm
Discovering one analogical proportion. We explain in this section the al-
gorithm used to discover one analogical proportion in a formal context. We call
it ‘Discover One Analogical Proportion’, in short Doap. As already stated, it is
a simple version of Graphsearch, where the nodes to be explored are 4-tuples
of objects. We denote Start the 4-tuple of objects chosen to begin, Open the
current set of 4-tuples to be processed and Closed the set of 4-tuples already
processed.
    The choice of Start is either done randomly or by selecting objects which
appear in small subsets of objects in the lattice. We also require that the explored
4-tuples are composed of four different objects, since we do not want to converge
towards 4-tuples trivially in proportion, such as (1, 3, 1, 3) or even (2, 2, 2, 2).
    The algorithm stops either by discovering an analogical proportion, or in
failure. One has to notice that its failure does not insure that there is no ana-
logical proportion, since there is no guarantee given by the heuristic. We have
never met this failure case, but our experiments are very limited, as explained
in section 3.5.
       1: Algorithm Doap(Start)
       2: begin
       3: Closed ← ∅
       4: Open ← {Start}
       5: while Open 6= ∅ do
       6:   x ← the 4-tuple in Open having the lowest AD value
       7:   if AD(x) = 0 then
       8:      return x
       9:   else
      10:      Open ← Open \ {x} ; Closed ← Closed ∪ {x}
      11:      decision ← 1
      12:      while decision = 1 do
      13:        Use heuristic h-Doap to construct a new 4-tuple y from x
      14:        if y is composed of four different objects and y 6∈ Closed and y 6∈
                 Open then
      15:           Open ← Open ∪ {y} ; decision ← 0
      16:        end if
      17:      end while
      18:   end if
      19: end while
      20: return failure
      21: end


Discovering several analogical proportions. To discover more analogical
proportions, the simplest manner is to imbed algorithm Doap in a procedure
that discards the first two objects of a discovered analogical 4-tuple from the
formal context before re-running the algorithm. Since the transitivity holds for
analogical proportions on objects (u : v :: w : x and w : x :: y : z implies
u : v :: y : z), we are loosing no information on analogical 4-tuples. However,
we are not insured to find all proportions in that manner, due to the fact that
algorithm Doap may not find an existing proportion.


3.5     Experiments

The size of Close when the algorithm Doap stops is a precise indication of its
practical time complexity. Notice that a random algorithm, running on n objects
(without any construction of a formal lattice), in which there are q 4-tuples in
analogical proportion would in average try ((n4 )/8·q) 4-tuples before discovering
a proportion. In the previous formula, the number “8” comes from the fact that,
when there is one analogical proportion in a formal context, then there are in
fact exactly 8 through suitable permutations. This property stems directly from
the postulates an analogical proportion (see section 2).
    We have used two different formal contexts to run the algorithm Doap. The
first one is described in [2], except that we have added four objects 9, 10, 11 et 12
in order to have (at least) the analogical proportions (3, 4, 9, 10) and (1, 8, 11, 12).
This leads to the formal context called BASE lm , Figure 2.
                       a b c d e f g h i                  a b c d e f g h i
               leech 1 × ×         ×              bean 7 × × × ×
             bream 2 × ×           ××            maize 8 × × × ×
                frog 3 × × ×       ××                x 9×× ×××
                 dog 4 × ×         ×××               y 10 ×     ×××       ×
         spike-weed 5 × × × ×                        z 11 × ×     × × ×
                reed 6 × × × × ×                     t 12 × × × × ×       ×


Fig. 2. BASE lm : A formal context from Bělohlávek [2] increased with four objects
9, 10, 11 et 12 in order to have (at least) the analogical proportions (3, 4, 9, 10) and
(1, 8, 11, 12).



   The lattice constructed on this formal context (with the In-Close free soft-
ware [14]) has 31 concepts. We have run Doap more than 600 times. It has always
terminated by finding one of the three analogical proportions in the data4 . The
average size of the Closed list is 63 and its median size is 28. Figure 3 gives the
details.




Fig. 3. Results of 622 runs of Doap on BASE lm . The size of the Close list for each run
is on the Y axis.



    To appreciate these results, we have compared with a random search, replac-
ing line 13 of the Doap algorithm by picking a random 4-tuple. The detailed
results are given in Figure 4. The average size of the Closed list is 253 and its
median size is 174.
    We also have tried to “symmetrize” the role of 0 and 1 in this formal context
by adding the reverse attributes (indeed the Table 1 defining analogical propor-
tions is left unchanged when exchanging 0 and 1). It leads to 12 objects and 18
4
    We actually had a good surprise: Doap found a third proportion, namely (2, 4, 9, 12).
Fig. 4. Results of 630 runs of random Doap on BASE lm . The Y axis is graduated from 0
up to 2000.


attributes instead of 9. The size of the lattice of concepts is now 94. The algo-
rithm Doap with the same parameters examines in average 93 4-tuples before
finding an analogical proportion. The median value is 31. The symmetrization
does not seem to be a good idea in this case. The random Doap algorithm has
failed to give complete results on these data, due to overflows in the Close list.
    The second experiment has been run on the Lenses data base, from UCI ML
Repository [6]. The nominal attributes have been transformed into binary ones
by simply creating as many binary attributes as the number of modalities. The
number of objects is 24 with 7 binary parameters. The size of the lattice is 43, the
average number of examined 4-tuples is 77 and the median number is 20. When
adding the reverse attributes, we have a lattice of size 227, an average number
of 39 and a median number of 16. In that experiment, the symmetrization of the
data seems clearly to have a positive effect.
    A first conclusion is that our heuristic algorithm seems to perform well. In
the second context the basic search space has a size over 40.000 and we examine
only 77 4-tuples in average. The construction of the lattice of concepts takes
in practice much more time than the discovery of analogical proportions, which
seems to suggest that it is a relevant space for looking for analogical proportions.


4     Analogical proportions between formal concepts

We have seen that discovering analogical proportions in a formal context benefits
from the knowledge of the associated lattice of formal concepts. Then it raises
the question of understanding how formal concepts are involved in analogical
proportions. Clearly, four objects in the same formal concept form an analogical
proportion – in a trivial way – w.r.t. the subset of attributes involved in the
formal concept. Partial answers to the question, when two formal concepts are
involved in the proportion, are given in this section.


4.1   The smallest formal context in complete proportion

We are interested in this section in examining the properties of the smallest
context with an analogical proportion between objects. Obviously, this context
will have exactly four objects. If we want to have, only one time, each of the
possible analogical proportions between attributes, we need six of them (see table
1) and we obtain BASE 0 (see Figure 5).
                   f a b c d e                                       uv wx
                                                                     ∅
               u        ×××
               v      × ××
               w     × × ×                             wx       vx           uw   uv
                                                       a        b            c    d
               x     ××     ×

                     a b c d
                                                       u        v            w    x
                   u    ××                             cd       bd           ac   ab
                   v × ×
                   w× ×
                   x××
                                                                     ∅
                                                                     abcd



Fig. 5. BASE 0 , BASE 1 and the concepts lattice of BASE 1 . The lattice of BASE 0 is
deduced from it by adding e to all subsets of attributes.


    We can construct now the concept lattice of BASE 0 , but it is interesting to
get rid of attributes f (which will not be present in any context) and e (present
in every context). We call BASE 1 the reduced context, shown at Figure 5.
    Its lattice is displayed in Figure 5. Note that there is a perfect symmetry
between attributes and objects. The third line of the lattice expresses that u : v
:: w : x, but also in subsets terms that {c, d} : {b, d} :: {a, c} : {a, b}. The second
line expresses that a : b :: c : d and that {w, x} : {v, x} :: {u, w} : {u, v}. This
is not surprising: as explained in section 2, we can see an object as the set of
properties that hold true for it.

4.2     Some relations between analogical proportions and lattices of
        concepts
Firstly, let us remark that the two following propositions are equivalent. This is
immediate from section 2, in which these two equivalent definitions of analogical
proportion have been presented.
 1. x1 , x2 , x3 and x4 are four objects, in analogical proportion in this order.
 2. R↑ (x1 ), R↑ (x2 ), R↑ (x3 ) and R↑ (x4 ) are four subsets of properties in analog-
    ical proportion in this order.

Property 1 Let x1 , x2 , x3 and x4 be four objects in analogical proportion in this
order. Let (X1 , Y1 ) be the5 concept with the smallest set X1 of objects in which
x1 is present. Let us define (X2 , Y2 ), (X3 , Y3 ) and (X4 , Y4 ) in the same way.
Then the four sets of attributes Y1 , Y2 , Y3 and Y4 are in analogical proportion,
in this order.
      Proof. Since x1 ∈ X1 , all the attributes in Y1 take value 1 on x1 . Since X1 is
      the smallest set of objects including x1 , there is no attribute outside Y1 having
5
    If they were two, x1 would be present in the intersection of the two.
    value 1 on x1 . Hence, Y1 is exactly R↑ (x1 ), the extension of x1 , i.e. the subset
    of attributes that take value 1 on x1 . This is also true for x2 , x3 and x4 .
    We have to prove now that x1 : x2 :: x3 : x4 implies R↑ (x1 ) : R↑ (x2 ) :: R↑ (x3 )
    : R↑ (x4 ). It is immediate from the remark above.                                

For example, in BASE 0 , we know that 1 : 8 :: 11 : 12. We have X1 = {1, 2, 3, 11},
Y1 = {a, b, g}, X2 = {6, 8, 12}, Y2 = {a, c, d, f }, X3 = {11}, Y3 = {a, b, e, g, i}
and X4 = {12}, Y4 = {a, c, d, e, f, i}. The proportion Y1 :Y2 ::Y3 :Y4 holds, since:
{a, b, g}:{a, c, d, f }::{a, b, e, g, i}:{a, c, d, e, f, i}.
Property 2 Let (X1 , Y1 ), (X2 , Y2 ), (X3 , Y3 ) and (X4 , Y4 ) be four concepts of a
lattice of concepts, such that the four sets of attributes Y1 , Y2 , Y3 and Y4 are in
analogical proportion, in this order. Let X b1 be the subset of X1 composed of all
objects that are in X1 but cannot be found in any subset of X1 belonging to a
concept. We define in the same manner X     b2 , X
                                                 b3 and Xb4 . The following property
holds true: ∀x1 ∈ X1 , x2 ∈ X2 , x3 ∈ X3 , x4 ∈ X4 : x1 x2 :: x3 : x4 .
                    b        b         b          b
    Proof. It is the reciprocal of Property 1: Y1 is the extension of all objects in
    X
    b1 , and we take x1 in X b1 . We derive the conclusion from the remark above.
    

Property 3 Let x1 , x2 , x3 and x4 be four objects, in analogical proportion in
this order.
    Let A1111 = {y|y ∈ R↑ (x1 ), y ∈ R↑ (x2 ), y ∈ R↑ (x3 ), y ∈ R↑ (x4 ))}
    Let A1100 = {y|y ∈ R↑ (x1 ), y ∈ R↑ (x2 ), y 6∈ R↑ (x3 ), y 6∈ R↑ (x4 ))}
    Let A0011 = {y|y 6∈ R↑ (x1 ), y 6∈ R↑ (x2 ), y ∈ R↑ (x3 ), y ∈ R↑ (x4 ))}
    Let A1010 = {y|y ∈ R↑ (x1 ), y 6∈ R↑ (x2 ), y ∈ R↑ (x3 ), y 6∈ R↑ (x4 ))}
    Let A0101 = {y|y 6∈ R↑ (x1 ), y ∈ R↑ (x2 ), y 6∈ R↑ (x3 ), y ∈ R↑ (x4 ))}
    Then
    ({x1 , x2 }, A1111 ∪ A1100 ) is included into a formal concept.
    ({x3 , x4 }, A1111 ∪ A0011 ) is included into a formal concept.
    ({x1 , x3 }, A1111 ∪ A1010 ) is included into a formal concept.
    ({x2 , x4 }, A1111 ∪ A0101 ) is included into a formal concept.


The result follows from the definition of the subsets of attributes considered and
their clear relation with the definition of analogical proportions. The fact that
we only have an inclusion in the above property should not come as a surprise.
Indeed, when describing objects, attributes that are nor not relevant w.r.t. the
analogical proportion may be present.


5    Lines for further research and concluding remarks
Beyond the already introduced set function, R↓ (Y ) = {x ∈ Obj|R↑ (x) ⊇ Y },
which is at the core of FCA,and which leads to the definition of formal concepts,
it has been noticed [5], on the basis of a parallel with possibility theory that,
given a set Y of properties, four remarkable sets of objects can be defined in this
setting (here the overbar denotes set complementation):
 – R↓Π (Y ) = {x ∈ Obj|R↑ (x) ∩ Y 6= ∅} = ∪y∈Y R↓ (y). This is the set of objects
   having at least one property in Y .
 – R↓N (Y ) = {x ∈ Obj|R↑ (x) ⊆ Y } = ∩y6∈Y R↓ (y). This is the set of objects
   having no property outside Y .
 – R↓∆ (Y ) = R↓ (Y ) = ∩y∈Y R↓ (y). This is the set of objects sharing all prop-
   erties in Y .
 – R↓∇ (Y ) = {x ∈ Obj|R↑ (x) ∪ Y 6= Obj} = ∪y6∈Y R↓ (y). This is the set of
   objects that are missing at least one property outside Y .


   It has been recently pointed out [3] that pairs (X, Y ) such that R↓N (Y ) = X
and R↑N (X) = Y are characterizing independent sub-contexts (X, Y ) such that
((X × Y ) + (X × Y ) ⊇ R, in the sense that they do not share any object or
property. Thus, in Figure 1, ({a, b, c, d, e, f }, {5, 6, 7, 8}) and ({g, h, i}, {1, 2, 3, 4})
are two formal sub-contexts.
    When comparing the features underlying FCA and analogical proportions,
one can notice that the same 4 “indicators” are involved from the beginning: a∩b,
a ∩ b, a ∩ b, and a ∩ b. Indeed R↓Π (Y ) is based on the condition R↑ (x) ∩ Y 6= ∅,
R↓N (Y ) on the condition R↑ (x)∩Y = ∅, R↓∆ (Y ) on the condition R↑ (x)∩Y = ∅,
and R↓∇ (Y ) on the condition R↑ (x) ∩ Y 6= ∅. Moreover, with these 4 indicators,
one can define other so-called logical proportions [4], including some that are
closely related to analogical proportions such as ‘paralogy’ which reads “what
a and b have in common, c and d have it also” and is defined by a ∧ b =
c ∧ d and a ∧ b = c ∧ d [10]. This more generally raises the question of the rela-
tions between FCA and these logical proportions.

    Finally, the experiments with Doap have obviously to be scaled on larger
formal contexts, in order to estimate its practical complexity more accurately.
Some more thought has also to be given about the choice of the Start 4-tuples,
especially to take advantage of the addition of the reverse attributes. An inter-
esting point would be to be able to choose the Start in order to insure that every
analogical proportion can be discovered. We also believe that the speed of Doap
can be increased, since there are still a lot of parameters to tune, for example
breaking ties in the head of the Close list in a non random fashion.
   An interesting question is whether or not the construction of the lattice must
precede the heuristic search. It would certainly be of great interest to construct
only the parts that are required by the running od the Doap algorithm. This
would lead to merge the two parts of the method, rather than computing the
whole lattice (a very costly operation) before its exploration.
    More generally, it would be clearly of interest to have an algorithm also able
to find out the analogical proportions that hold in some sub-context (since as
already said, irrelevant attributes may hide interesting analogical proportions),
rather than in the initial formal context. This will open a machine learning point
of view [1].
6    Aknowledgements

We would like to thank the anonymous reviewers for their careful reading of this
article and their interesting suggestions.


References
 1. S. Bayoudh, L. Miclet, and A. Delhay. Learning by analogy: a classification rule for
    binary and nominal data. Proc. 20th Inter. Joint Conf. on Artificial Intelligence,
    (M. M. Veloso, ed.), Hyderabad, India, AAAI Press, 678–683, 2007.
 2. R. Bělohlávek. Introduction to formal context analysis. Internal report. Dept of
    Computer science. Palacký University, Olomouk, Czech Republic. 2008.
 3. Y. Djouadi, D. Dubois, H. Prade. Possibility theory and formal concept analysis:
    Context decomposition and uncertainty handling. Proc. 13th Inter. Conf. on Infor-
    mation Processing and Management of Uncertainty (IPMU’10), (E. Hüllermeier,
    R. Kruse and F. Hoffmann, eds.), Dortmund, Springer, LNCS 6178, 260–269, 2010.
 4. H. Prade, G. Richard. Logical proportions - Typology and roadmap. Proc. Inter.
    Conf. on Information Processing and Management of Uncertainty in Knowledge-
    based Systems (IPMU 2010), Dortmund, (E. Hüllermeier, R. Kruse, F. Hoffmann,
    eds.), Springer, LNCS 6178, 757–767, 2010.
 5. D. Dubois, F. Dupin de Saint-Cyr, H. Prade. A possibility-theoretic view of formal
    concept analysis. Fundamentae Informaticae, 75, 195–213, 2007.
 6. A. Frank and A. Asuncion. (2010). UCI Machine Learning Repository
    [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of In-
    formation and Computer Science.
 7. B. Ganter and R. Wille. Formal Concept Analysis. Mathematical Foundations.
    Springer Verlag, 1999.
 8. Y. Lepage. De l’analogie rendant compte de la commutation en linguistique.
    http://www.slt.atr.jp/ lepage/pdf/dhdryl.pdf, Grenoble, 2001. HDR.
 9. L. Miclet and H. Prade. Handling analogical proportions in classical logic and fuzzy
    logics settings. Proc. 10th Europ. Conf. on (ECSQARU’09), Verona, Springer,
    LNCS 5590, 2009, 638–650.
10. H. Prade and G. Richard. Analogy, paralogy and reverse analogy: Postulates and
    Inferences. Proc. Annual German Conf. on Artificial Intelligence (KI 2009), Pader-
    norn, Sept. 15-18, (B. Mertsching, M. Hund, Z. Aziz, eds.), Springer, LNAI 5803,
    306–314, 2009.
11. N. Nilsson. Principles of Artificial Intelligence. Tioga, 1980.
12. N. Stroppa and F. Yvon. Analogical learning and formal proportions:
    Definitions and methodological issues. Technical Report ENST-2005-D004,
    http://www.tsi.enst.fr/publications/enst/techreport-2007-6830.pdf, June 2005.
13. R. Wille. Restructuring lattice theory: an approach based on hierarchies of con-
    cepts. In: Ordered Sets, (I. Rival, ed.), D. Reidel, Dordrecht, 445–470, 1982.
14. The Inclose software. http://inclose.sourceforge.net/. Downloaded on March 2011.