<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Efficient algorithms for clone items detection</article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author">
          <string-name>Raoul Medina</string-name>
          <email>medina@isima.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Caroline Noyer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olivier Raynaud</string-name>
          <email>raynaud@isima.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
<aff id="aff0">
          <label>0</label>
          <institution>LIMOS - Université Blaise Pascal</institution>
          ,
          <addr-line>Campus universitaire des Cézeaux, Clermont-Ferrand</addr-line>
          ,
          <country>France</country>
        </aff>
      </contrib-group>
      <fpage>70</fpage>
      <lpage>81</lpage>
      <abstract>
<p>This paper presents efficient algorithms for clone items detection in a binary relation. The best implementation of these algorithms has an O(|J| × ||M||) time complexity, with J the set of items of the relation and ||M|| the size of the relation. This result improves the previous algorithm given in [3], which is in O(|J|² × ||M||). Clone items have been introduced by Medina and Nourine to explain why, sometimes, the number of rules of a minimum cover of a relation is exponential in the number of items of the relation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
<p>then to be able to compute the clone classes of the (potentially exponential) collection without generating its sets. To solve this problem in polynomial time, the general method could be as follows:
– Let M be the (potentially exponential) collection of sets verifying a property over a given binary relation R. Compute in polynomial time a collection M′ such that:
1. the size of M′ is polynomial in the size of R,
2. items a and b are clone items in M if and only if they are clone items in M′.
– Detect the clone classes in M′.</p>
      <p>
        Our paper focuses on the detection phase of this approach. The clone classes
computation algorithm given in [
        <xref ref-type="bibr" rid="ref3">3</xref>
] has an O(|J|² × ||M||) time complexity, where
J is the set of items and ||M|| is the size of the input collection; in other
words, ||M|| = Σm∈M |m|. In this paper, we present different algorithms to solve the
clone classes computation. The best complexity we obtain is in O(|J| × ||M||).
      </p>
      <p>
        This paper is organized as follows: section 2 formally defines the problem in
terms of collection of sets and introduces the corresponding Abstract Data Type
which will be used by our algorithms. Section 3 describes three computation
strategies and the corresponding time complexities are studied. Section 4 shows
how to take advantage of those algorithms in order to compute the clone items
classes as defined in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
<title>General context and definitions</title>
<p>In this section, we first formally define the studied problem. Then we present the Abstract Data Structure called Map used in our algorithms and discuss its possible implementations.</p>
      <sec id="sec-2-1">
        <title>Clone items in a Sets Collection</title>
<p>Let J be a set of items {x1, ..., x|J|} and M a sets collection on J. We denote by ϕa,b : 2^J → 2^J the mapping which associates to any subset of J its image by swapping items a and b. More formally:</p>
        <p>ϕa,b(m) = (m \ {a}) ∪ {b} if b ∉ m and a ∈ m; (m \ {b}) ∪ {a} if a ∉ m and b ∈ m; m otherwise.</p>
        <p>Definition 1. Let M be a collection of sets defined on J. We say that items a and b are clone items in M if and only if ∀m ∈ M, ϕa,b(m) ∈ M.</p>
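<p>The case analysis above translates directly into code; a minimal sketch in Python (the function name phi is ours):

```python
def phi(m: frozenset, a, b) -> frozenset:
    """Image of the set m under the swap of items a and b (the mapping phi_a,b)."""
    if a in m and b not in m:
        return (m - {a}) | {b}
    if b in m and a not in m:
        return (m - {b}) | {a}
    return m  # m contains both items or neither: m is its own image
```
</p>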
<p>The clone items concept is a binary one. To the question "are a and b clone items?", only the answer "true" or "false" is possible. It could be interesting to have more precision when the negative answer is given: are a and b very far from being clone items, and why are they not clone? For this purpose, we introduce a measure to qualify the clone property. This measure represents a distance between two items a and b, based on the definition of the clone items property. This distance is exactly the number of elements m of M which do not have an image in M when applying the swapping function ϕa,b. When this distance is zero, a and b are clone items. The greater the distance, the farther a and b are from being clone. More formally:</p>
        <p>Definition 2. Let M be a sets collection on J and let (a, b) be in J². We call distance between a and b, denoted by dM(a, b), the mapping J² → ℕ defined by dM(a, b) = |{m ∈ M | ϕa,b(m) ∉ M}|.</p>
<p>Thanks to Definition 2, clone items can be characterized as follows:
Proposition 1. Let M be a sets collection on J and (a, b) in J². Items a and b are clone items if and only if dM(a, b) = 0.</p>
<p>The problem we study in this paper is the computation of the distances between all pairs of items of J:
Problem 1 (Distance).</p>
        <p>Data: a sets collection M on J;
Result: the matrix dM.</p>
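<p>Definition 2 gives a direct, naive reference implementation of the Distance problem, useful for checking the faster strategies of Section 3 (names ours):

```python
def phi(m, a, b):
    """Swap items a and b in the set m."""
    if a in m and b not in m:
        return (m - {a}) | {b}
    if b in m and a not in m:
        return (m - {b}) | {a}
    return m

def distance(M, a, b):
    """dM(a, b): number of sets m of M whose image by phi_a,b is not in M."""
    coll = set(M)
    return sum(1 for m in coll if phi(m, a, b) not in coll)
```

Items a and b are clone in M exactly when distance(M, a, b) == 0; this brute-force version costs one membership test per set and per pair.
</p>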
<p>Here, we present the main property on which our algorithms rely. It characterizes a couple (m, m′) of the sets collection M such that m = ϕa,b(m′).
Proposition 2. Let M be a sets collection defined over J, m and m′ two distinct sets of M and (a, b) two items of J such that a ∈ m and b ∈ m′. Then the following assertions are equivalent:
1. m = ϕa,b(m′)
2. m′ = ϕa,b(m)
3. |m| = |m′| and m \ m′ = {a} and m′ \ m = {b}
4. m \ {a} = m′ \ {b}</p>
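<p>Condition 3 of Proposition 2 yields a cheap test for the swap relation using set differences only, without enumerating candidate pairs (a, b); a sketch (names ours):

```python
def swap_related(m: frozenset, m2: frozenset):
    """Return the pair (a, b) such that m = phi_a,b(m2), or None if no such
    pair exists (condition 3 of Proposition 2)."""
    if len(m) != len(m2):
        return None
    d1, d2 = m - m2, m2 - m
    if len(d1) == 1 and len(d2) == 1:
        return next(iter(d1)), next(iter(d2))
    return None
```
</p>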
<p>This property states that two sets m and m′ are each other's image by the swapping function ϕ if and only if they have the same size t and share t − 1 items. This property follows directly from the definition of the ϕ mapping.</p>
        <p>Interface. We use a Map abstract data type similar to the Map interface of the Java language. This data structure maps keys to values. In our case, the keys are the sets of the collection. The values mapped by the keys depend on the algorithm.</p>
<p>This abstract data type supplies the following operators:
– new() operator: creates a Map object and returns an empty map;
– get(e) operator: returns the value associated to the key e if this key maps a value, or Nil otherwise;
– put(e,value) operator: inserts set e in the map and associates value to it.</p>
<p>Time complexities of those operators deeply rely on the data structure used for the implementation of the Map data type. We propose an implementation which takes advantage of the type of the keys, i.e. sets. To implement the Map type we propose a lexicographic tree: itemsets are represented by branches of the tree.</p>
      </sec>
      <sec id="sec-2-2">
<title>Implementation: lexicographic tree</title>
        <p>We first give a formal definition of a lexicographic tree corresponding to a sets collection.</p>
<p>Definition 3. Let M be a sets collection defined over J, with a total order on J denoted by &lt;J. A unique lexicographic tree is associated to M such that:
– each edge of the tree is labeled with an element of J;
– to each marked node of the tree corresponds a set of M;
– to each set m of M corresponds a unique path in the tree (starting from the root and ending with a marked node) such that the union of the labels in this path corresponds exactly to the set m;
– for any path from the root to any node, the order of the successive labels respects the order defined by &lt;J;
– the order of the edges leaving a node respects the order defined by &lt;J.</p>
        <p>A node of the tree indicates whether it is marked (i.e. corresponds to a set of the collection and thus to a key) or not. Any field type can be associated to the node for storing the value associated to the key (on marked nodes only). There are two ways of implementing the list of children of a node in the lexicographic tree:
– using a List structure, each entry of the list containing the label of the associated edge and a reference to the child;
– using an Array structure indexed by the labels, the entries containing either Nil or a reference to the child.</p>
<p>Complexities of the put and get operators rely on the chosen implementation.
Proposition 3. If the lexicographic tree is implemented with Lists then:
– the put(m,value) operator has an O(|J|) time complexity;
– the get(m) operator has an O(|J|) time complexity.</p>
        <p>Access complexity is due to the fact that an item appears only once in a set. The creation of a node costs constant time.</p>
<p>Proposition 4. If the lexicographic tree is implemented with Arrays then:
– the put(m,value) operator has an O(|m| × |J|) time complexity;
– the get(m) operator has an O(|m|) time complexity.</p>
        <p>Access complexity is due to the fact that we have direct access to the child labeled by a given item. The creation of a node of the tree costs O(|J|) since we need to create and initialize a |J|-sized array.</p>
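<p>The Map interface over a lexicographic tree can be sketched as follows, here with the List variant of Proposition 3 (all names are ours; each child list is kept sorted to respect the total order on J):

```python
class Node:
    def __init__(self):
        self.children = []   # sorted list of (label, Node) pairs
        self.marked = False  # True iff the path to this node is a key
        self.value = None    # value stored on marked nodes only

class LexTreeMap:
    """Map keyed by itemsets, stored as branches of a lexicographic tree."""
    def __init__(self):
        self.root = Node()

    def _child(self, node, label, create):
        for lab, child in node.children:
            if lab == label:
                return child
        if not create:
            return None
        child = Node()
        node.children.append((label, child))
        node.children.sort(key=lambda p: p[0])  # keep edges ordered
        return child

    def put(self, m, value):
        node = self.root
        for x in sorted(m):  # follow the branch in increasing label order
            node = self._child(node, x, create=True)
        node.marked, node.value = True, value

    def get(self, m):
        node = self.root
        for x in sorted(m):
            node = self._child(node, x, create=False)
            if node is None:
                return None
        return node.value if node.marked else None
```
</p>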
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Strategies and algorithms</title>
<p>In this section, we study three different strategies for solving the Distance problem. All strategies rely on the same basic idea: first the distance matrix is initialized with the maximum possible value for each distance dM(a, b). Then, each time the algorithm finds m and m′ such that m = ϕa,b(m′), the distance dM(a, b) is decreased by one. The main difference between the three strategies is the way they detect that m = ϕa,b(m′).</p>
      <p>
        The first strategy is the one given in [
        <xref ref-type="bibr" rid="ref3">3</xref>
]. We call it set existence checking. The main idea of the algorithm is, given m ∈ M, to compute all possible sets ϕa,b(m) and then check whether these sets belong to the collection. The second strategy is called ϕa,b relation checking. Its principle is, for each pair (m, m′), to check if there exist items a and b such that m = ϕa,b(m′). This test is done using Proposition 2. The third strategy is called classes computation. Based on Proposition 2, it computes classes C of itemsets such that for any pair (m, m′) of C, there exist items a and b such that m = ϕa,b(m′).
      </p>
<p>We first discuss the initialization of the distance matrix since it is common ground for all strategies.</p>
      <sec id="sec-3-1">
        <title>Discussion on the distance matrix</title>
<p>Since the notion of distance dM(a, b) is symmetric, the distance matrix is also symmetric. We thus choose to represent it by a triangular matrix such that dM(a, a) = 0 and dM(a, b) = dM(b, a).</p>
        <p>In this section we discuss the different strategies for the initialization and the
update of the distance matrix, as well as their respective costs.</p>
<p>Let M = Mab̄ ∪ Māb ∪ Mab ∪ Māb̄ (a disjoint union), with:
– Mab̄: the sets m of M such that a ∈ m and b ∉ m;
– Māb: the sets m of M such that a ∉ m and b ∈ m;
– Mab: the sets m of M such that a ∈ m and b ∈ m;
– Māb̄: the sets m of M such that a ∉ m and b ∉ m.</p>
<p>Incrementation or decrementation strategy? There are two ways of computing the distance matrix:
– either by first initializing all distances to 0 and then increasing by 1 the distance dM(a, b) each time we find m ∈ M such that ϕa,b(m) ∉ M,
– or by first initializing the distances to a maximal value and then decreasing by 1 the distance dM(a, b) each time we find m and m′ ∈ M such that m = ϕa,b(m′).</p>
<p>What is the maximal value that can be taken by dM(a, b)? Clearly the answer is |Mab̄| + |Māb|. Indeed, for any set m ∈ Mab ∪ Māb̄ we have m = ϕa,b(m), and thus m cannot increase the distance between a and b. This also represents the maximum number of basic operations (either incrementations or decrementations) needed to compute the distance matrix in the worst case. Thus, whatever strategy we choose, the number of basic operations will be the same in the worst case.</p>
<p>What is the best time complexity for both strategies? We consider that the basic operations can be done in O(1). Thus the overall complexity will be O(Σa∈J Σb∈J (|Mab̄| + |Māb|)).</p>
        <p>Now, consider m ∈ M. In the worst case, m will be taken into account in at most |m| × |J \ m| distances dM(a, b). Indeed, for m to be taken into account in dM(a, b), either a or b belongs to m but not both. Thus, we have O(Σa∈J Σb∈J (|Mab̄| + |Māb|)) = O(Σm∈M |m| × |J \ m|).</p>
        <p>This can be rewritten as follows: O(Σm∈M |m| × (|J| − |m|)) = O((Σm∈M |m| × |J|) − (Σm∈M |m| × |m|)).</p>
        <p>And since |m| × |m| is less than or equal to |m| × |J|, we obtain the worst case complexity O(Σm∈M |m| × |J|) = O(|J| × ||M||).</p>
        <p>Thus, whatever update strategy is chosen, the time complexity cannot be less than O(|J| × ||M||) (under the hypothesis that the basic operations increment or decrement by 1).</p>
<p>In this paper we adopt the decrementation strategy since our algorithms are based on Proposition 2. We now discuss the complexity of the initialization of the distance matrix.</p>
      </sec>
      <sec id="sec-3-2">
<title>Initializing the distance matrix</title>
        <p>We initialize each distance dM(a, b) with the maximal possible value, i.e. with |Mab̄| + |Māb|. Initially, the distances are equal to 0; this can be done in O(|J| × |J|).</p>
<p>Then, for each m ∈ M we increment by 1 the distances dM(a, b) with a ∈ m and b ∉ m. As shown in the previous subsection, this can be done in O(|J| × ||M||).</p>
        <sec id="sec-3-2-1">
          <title>Algorithm 1: InitDistance(M)</title>
<p>Data: a sets collection M defined over J.</p>
          <p>Result: the distance matrix dM such that for all (a, b) in J² we have dM(a, b) = |Mab̄| + |Māb|.</p>
          <p>begin
  foreach a ∈ J do
    foreach b ∈ J do
      dM(a, b) = 0;
  foreach m ∈ M do
    foreach a ∈ m do
      foreach b ∉ m do
        dM(a, b)++;
  return dM;
end</p>
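<p>Algorithm 1 in Python, storing the symmetric triangular matrix as a dictionary over unordered pairs (a representation choice of ours):

```python
from itertools import combinations

def init_distance(M, J):
    """Algorithm 1: dM(a, b) = number of sets of M containing exactly one
    of the items a and b, i.e. |Mab'| + |Ma'b|."""
    d = {frozenset(p): 0 for p in combinations(J, 2)}
    for m in M:
        for a in m:
            for b in J - m:
                d[frozenset((a, b))] += 1
    return d
```
</p>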
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>Set Existence Checking Strategy</title>
<p>For each set m of M and each pair (a, b) of J², we check whether ϕa,b(m) belongs to M. In order to check the existence of ϕa,b(m), we choose to store the collection M in the Map structure presented before. The sets of M are the keys, while their associated value is "present".</p>
<p>Beware that since a distance dM(a, b) is initialized with |Mab̄| + |Māb|, this distance should be decremented only when m ≠ ϕa,b(m). Indeed, if m = ϕa,b(m) then m ∉ Mab̄ ∪ Māb.</p>
        <p>Algorithm 2: ComputeDistance(M): Set Existence Checking Strategy
Data: a sets collection M defined over J.
Result: the distance matrix dM.</p>
        <p>begin
  dM = InitDistance(M);
  T = new Map();
  foreach m ∈ M do
    T.put(m, "present");
1 foreach m ∈ M do
2   foreach (a, b) ∈ J² do
      if m ≠ ϕa,b(m) and T.get(ϕa,b(m)) ≠ Nil then
        dM(a, b)−−;
  return dM;
end</p>
        <p>Proposition 5. The Set Existence Checking Strategy has an O(|J|² × ||M||) time complexity.</p>
<p>Proof. Initialization of the matrix is done in O(|J| × ||M||). We suppose that the T Map structure is implemented using Arrays. Thus, the initialization of T is done in O(Σm∈M |m| × |J|) = O(|J| × ||M||). The loop on line 1 does |M| iterations while the loop of line 2 does |J|² iterations. The ϕa,b(m) operation, as well as the test m ≠ ϕa,b(m) and the get operation, can be done in O(|m|) time. The overall complexity is thus O(Σm∈M |J|² × |m|) = O(|J|² × ||M||).</p>
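<p>A compact Python rendering of the set existence checking strategy. For brevity this sketch uses a hash set in place of the lexicographic tree and counts the missing images directly (the incrementation variant discussed earlier), which yields the same matrix; names are ours:

```python
from itertools import combinations

def compute_distance_existence(M, J):
    """For each m and each pair (a, b), test whether phi_a,b(m) belongs
    to the collection; count the misses."""
    coll = set(M)
    d = {frozenset(p): 0 for p in combinations(J, 2)}
    for m in coll:
        for a, b in combinations(J, 2):
            if (a in m) == (b in m):
                continue  # m = phi_a,b(m): cannot contribute to dM(a, b)
            x, y = (a, b) if a in m else (b, a)  # x in m, y not in m
            if (m - {x}) | {y} not in coll:      # image phi_x,y(m) missing
                d[frozenset((a, b))] += 1
    return d
```
</p>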
<p>
          This strategy is a slight improvement of the one presented in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Note that it can be improved a little more by choosing a in m and b in J \ m.
        </p>
</sec>
      <sec id="sec-3-3-1">
        <title>ϕa,b Relation Checking Strategy</title>
        <p>For any pair of elements (m, m′) of M, we check if there exist a and b such that m = ϕa,b(m′). According to Proposition 2, items a and b exist if and only if m and m′ have the same size and share |m| − 1 items. In this case, dM(a, b) is decremented by 1.</p>
        <p>Proposition 6. The ϕa,b relation checking strategy has an O((|J| + |M|) × ||M||) time complexity.</p>
        <p>Proof. Initialization of the matrix is done in O(|J| × ||M||). The external loop (line 1) does |M|² iterations. All tests of line 2 can be done in O(|m|) provided that the sets m and m′ are stored sorted according to &lt;J. The complexity of the loop is in O(|M| × ||M||). The overall complexity is thus in O((|J| + |M|) × ||M||).</p>
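<p>The relation checking strategy in Python (names ours). Where Algorithm 3 scans ordered pairs of M² and decrements by 1, this sketch scans each unordered pair once and decrements by 2, which is equivalent:

```python
from itertools import combinations

def compute_distance_relation(M, J):
    """Apply the Proposition 2 test to every pair of sets of M and
    decrement from the initial values of Algorithm 1."""
    d = {frozenset(p): 0 for p in combinations(J, 2)}
    for m in M:                       # initialization (Algorithm 1)
        for a in m:
            for b in J - m:
                d[frozenset((a, b))] += 1
    for m, m2 in combinations(M, 2):  # Proposition 2, condition 3
        if len(m) == len(m2) and len(m - m2) == 1 and len(m2 - m) == 1:
            (a,), (b,) = m - m2, m2 - m
            d[frozenset((a, b))] -= 2  # both m and m2 have their image in M
    return d
```
</p>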
      </sec>
      <sec id="sec-3-4">
        <title>Classes Computation Strategy</title>
<p>Algorithm 3: ComputeDistance(M): ϕa,b Relation Checking Strategy
Data: a sets collection M defined over J.
Result: the distance matrix dM.</p>
        <p>begin
  dM = InitDistance(M);
1 foreach (m, m′) ∈ M² do
2   if |m| = |m′| and |m \ m′| = 1 and |m′ \ m| = 1 then
      dM(m \ m′, m′ \ m)−−;
  return dM;
end</p>
        <p>This strategy also relies on Proposition 2. Let us consider m, m′ and m″, sets of M such that m = ϕa,b(m′) and m = ϕa,c(m″). Then, according to Proposition 2, we have m \ {a} = m′ \ {b} = m″ \ {c}, and thus m′ = ϕb,c(m″). The idea of the algorithm is to compute classes of sets mi of M having |mi| − 1 common items. A class C can thus be represented by the set of common items, and we memorize in a set Union all the extra items xi which are not common. In the Map structure we use, the set C will be the key while the set Union will be the value associated to the key.</p>
<p>Then, for any m ∈ C and for any pair (a, b) of items of Union, we know by Proposition 2 that ϕa,b(m) ∈ M and m ≠ ϕa,b(m). Thus, dM(a, b) has to be decremented. Note that a set m can belong to at most |m| classes.</p>
<p>The algorithm is quite straightforward using the Map structure. For each set m of M we insert each of its |m| subsets of size |m| − 1 in the Map structure. If the key was already present, we append the extra item of m to the set Union and update all the necessary entries in the distance matrix. Otherwise, a new key m \ {x} is inserted in the Map structure and its associated Union value is initialized with {x}.</p>
<p>Proposition 7. The classes computation strategy has an O(|J| × ||M||) time complexity.</p>
        <p>Proof. Initialization of the distance matrix is done in O(|J| × ||M||). We suppose that the Map structure is implemented using a lexicographic tree with Lists. The line 1 loop does |M| iterations. The loop in line 2 does |m| iterations. In line 3, the retrieval is done in O(|J|). The loop in line 4 does at most |J| iterations, and the update of the matrix takes constant time. The insertion of line 5 is done in O(|J|). Thus, the overall complexity is in O(Σm∈M |m| × |J|) = O(|J| × ||M||).</p>
        <p>Let us illustrate Algorithm 4 with an example. The considered collection
is M = {m1 = {a, e, f, h}, m2 = {b, e, f, h}, m3 = {b, d, f, h}}. The resulting
lexicographic tree obtained with Algorithm 4 is shown in Figure 2.</p>
<p>From this tree, we conclude that m2 and m3 share the items {b, f, h} and that m2 = ϕd,e(m3). Thus, dM(d, e) should be decremented. For the same reason, the distance dM(a, b) should also be decremented since m1 = ϕa,b(m2) (they share the items {e, f, h}).</p>
        <p>Algorithm 4: ComputeDistance(M): Classes Computation Strategy
Data: a sets collection M defined over J.
Result: the distance matrix dM.</p>
        <p>begin
  dM = InitDistance(M);
  T = new Map();
1 foreach m ∈ M do
2   foreach x ∈ m do
      C = m \ {x};
3     Union = T.get(C);
      if Union ≠ Nil then
4       foreach y ∈ Union do
          dM(x, y)−−;
        Union = Union ∪ {x};
        T.put(C, Union);
      else
5       T.put(C, {x});
  return dM;
end</p>
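<p>Algorithm 4 in Python, with an ordinary dict standing in for the lexicographic tree Map (names ours). Each time a key C is found again, the new extra item x is matched against every item already in Union; this sketch decrements by 2 because each such match witnesses that both related sets have their image in M:

```python
from itertools import combinations

def compute_distance_classes(M, J):
    """Group the sets sharing |m| - 1 items under the key C = m minus {x};
    Union collects the extra items (Algorithm 4)."""
    d = {frozenset(p): 0 for p in combinations(J, 2)}
    for m in M:                  # initialization (Algorithm 1)
        for a in m:
            for b in J - m:
                d[frozenset((a, b))] += 1
    T = {}                       # Map: key C, value Union
    for m in M:
        for x in m:
            C = m - {x}
            union = T.get(C)
            if union is not None:
                for y in union:  # each y yields a set m2 with m = phi_x,y(m2)
                    d[frozenset((x, y))] -= 2
                union.add(x)
            else:
                T[C] = {x}
    return d
```
</p>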
      </sec>
      <sec id="sec-3-5">
        <title>Memory usage</title>
        <p>In this section we discuss the space complexity required by each strategy and
show how to reduce the memory usage when possible.</p>
        <p>First, it is obvious that the distance matrix should be present in memory. It
requires O(|J |2) memory space.</p>
        <p>Concerning the Set Existence Checking Strategy, a lexicographic tree
implemented with arrays is used. The number of nodes is clearly bounded by ||M||.
And since each node requires a |J | sized array, this strategy uses O(|J | × ||M||)
memory space.</p>
        <p>For the ϕa,b Relation Checking Strategy, no lexicographic tree is used.
However, the collection M needs to be present in memory. Thus, this strategy
requires O(||M||) memory space.</p>
        <p>Now, for the Classes Computation Strategy, the used lexicographic tree is
implemented using lists. Such implementation requires O(||M||) memory space.
But M is not the collection stored in the tree. Indeed, for each m ∈ M, we store
its |m| subsets of size |m| − 1. And since |m| is bounded by |J | we conclude that
this strategy requires O(|J | × ||M||) memory space.</p>
<p>The conclusion seems to be that whatever strategy is used, at least O(||M||) memory space will be needed. But one can notice that, according to Proposition 2, not all sets in M need to be present in memory at the same time. Indeed, if m = ϕa,b(m′), then m and m′ have the same size. The idea is then to partition M according to the size of its sets: only sets of the same size need to be present in memory at the same time. Computations done for those sets are totally independent of computations done for sets of a different size. Note that the partitioning of M can be done in O(||M||) time, with a single scan of the collection. Thus, the partitioning does not change the overall complexity of the different strategies.</p>
        <p>[Figure 2: the lexicographic tree built by Algorithm 4 on the example collection; marked nodes carry their Union values, e.g. (a, b) for the key {e, f, h} and (d, e) for the key {b, f, h}.]</p>
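<p>The partitioning by size is a single scan of the collection (sketch, names ours):

```python
def partition_by_size(M):
    """Group the sets of M by cardinality: only sets of equal size can be
    related by the swap mapping, so each group is processed independently."""
    parts = {}
    for m in M:
        parts.setdefault(len(m), []).append(m)
    return parts
```
</p>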
<p>Another remark is that since the computations by size are totally independent, one could easily implement a distributed version of the strategies, which could speed up the computation of the distance matrix. Note that in this case the distance matrix should also be distributed. The best solution would be to initialize local distance matrices with the local partition. After the local computation (for sets of the same size), all the distance matrices should be merged together (by adding all the obtained values) in order to obtain the global distance matrix.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Clone classes computation</title>
      <p>
        The problem stated in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is the computation of clone classes. We thus
give the algorithm which computes clone classes from a distance matrix.
      </p>
<p>First, let us recall that two items a and b are clone if and only if dM(a, b) = 0. Since the clone relation is an equivalence relation, it defines a partition of the set J.</p>
<p>The principle of our Algorithm 5 is the following. Let a ∈ J be an item which has not yet been assigned to a class. We search all remaining items b whose distance to a is zero. All those items form a clone class with a and are thus removed from the list of items not yet assigned to a class. The class of a is then stored in a list L. The total complexity of the clone classes computation is in O(|J| × ||M||) time and space.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
<p>We have seen that if the problem is the computation of the distance matrix and only basic incrementations or decrementations by 1 are allowed, then the minimal time complexity for the update of the matrix is in O(|J| × ||M||). Thus, under this hypothesis, our algorithm is optimal. Figure 3 gives the different time and space complexities obtained with our algorithms.</p>
      <sec id="sec-5-1">
        <title>Algorithm 5: ComputeCloneClasses(M)</title>
        <p>Data: a sets collection M defined over J.</p>
        <p>Result: the list L of clone classes.</p>
        <p>begin
  dM = ComputeDistance(M);
  L = ∅; temp = J;
  while temp ≠ ∅ do
1   foreach a ∈ temp do
      la = newList(); la = la ∪ {a}; temp = temp \ {a};
2     foreach b ∈ temp do
3       if dM(a, b) = 0 then
4         la = la ∪ {b};
5         temp = temp \ {b};
      L = L + la;
  return L;
end</p>
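<p>Algorithm 5 in Python, grouping items at pairwise distance zero; the distance matrix is taken as a dict over unordered pairs, as in our earlier sketches (names ours):

```python
def compute_clone_classes(J, d):
    """Partition J into clone classes: a and b are clone iff d[{a, b}] == 0."""
    L = []
    temp = set(J)
    while temp:
        a = temp.pop()       # item not yet assigned to a class
        cls = {a}
        for b in list(temp):
            if d[frozenset((a, b))] == 0:
                cls.add(b)   # b joins the clone class of a
                temp.remove(b)
        L.append(cls)
    return L
```
</p>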
        <p>An open question is to know whether or not the distance matrix could be
incremented (or decremented) by more than 1 at each step. This could be the
only way of improving our algorithm. Another open question is to know if there
is a more efficient algorithm to compute the clone classes (for instance without
computing the distance matrix). Those are two questions we are investigating.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A.</given-names>
            <surname>Gely</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Nourine</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Renaud</surname>
          </string-name>
          .
          <article-title>Uncovering and reducing hidden combinatorics in guigues-duquenne covers</article-title>
          .
          <source>In ICFCA'05</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
<ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>H.</given-names>
            <surname>Mannila</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.-J.</given-names>
            <surname>Räihä</surname>
          </string-name>
          .
          <article-title>On the complexity of inferring functional dependencies</article-title>
          .
          <source>Discrete Applied Mathematics</source>
          ,
          <volume>40</volume>
          (
          <issue>2</issue>
          ):
          <fpage>237</fpage>
          -
          <lpage>243</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>R.</given-names>
            <surname>Medina</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Nourine</surname>
          </string-name>
          .
          <article-title>Clone items: a pre-processing information for knowledge discovery</article-title>
          . submitted.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>