<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Restrictions on Concept Lattices for Pattern Management</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Le´onard Kwuida</string-name>
          <email>kwuida@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rokia Missaoui</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Beligh Ben Amor</string-name>
          <email>benb03g@uqo.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lahcen Boumedjout</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jean Vaillancourt</string-name>
          <email>jean.vaillancourtg@uqo.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universite ́ du Que ́bec en Outaouais</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Zurich University of Applied Sciences</institution>
        </aff>
      </contrib-group>
      <fpage>235</fpage>
      <lpage>246</lpage>
      <abstract>
        <p>This paper addresses the problem of pattern management in the framework of formal concept analysis using restrictions on objects or attributes of a given data set. Patterns are pieces of information/knowledge with a concise description that can be obtained from data using data mining techniques. These can be clusters or bi-clusters, implications or association rules, and so on. Even for relatively small data collections, the set of discovered patterns could be very large and therefore difficult to handle. In this paper we propose efficient methods to conduct the projection of a concept lattice on a set of attributes. The selection of a concept lattice on a set of objects is done dually.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Pattern discovery and management refers to a set of activities related to the extraction,
description, manipulation and storage of patterns in a similar (but more elaborated)
way as data are managed by database applications. In pattern management and
inductive databases [
        <xref ref-type="bibr" rid="ref1 ref6 ref7 ref9">1, 6, 9, 7</xref>
        ], patterns are knowledge artifacts (e.g., association rules,
clusters) extracted from data using data mining procedures (generally run in advance), and
retrieved upon user’s request. A pattern is then a concise and semantically rich
representation of raw data. An example of a pattern could be a cluster that represents a set
of transactions with the common bought items or an association rule which states that
whenever we buy cheese we tend to buy crackers too.
      </p>
      <p>In many information system applications, users tend to be drowning in data and
even in patterns while they are actually interested in a very limited set of knowledge
pieces. Moreover, the scope of patterns to explore differs from one user to another and
changes over time. Finally, one is frequently interested in an exploratory and iterative
process of data mining (DM) to discover patterns under different scenarios and different
hypotheses. To that end, we propose to define two main algebraic operators : selection
and projection on concept lattices.</p>
      <p>In this paper we handle the problem of pattern management by describing two
algebraic operations commonly used in relational queries, namely projection and selection.
These operations will be applied to data sets (tables or formal contexts) and concept
sets/lattices. The problem of projection can be stated as follows:
Given a set of patterns P and a subset N of attributes in a table T , produce the
corresponding set of patterns when the analysis is restricted to the attributes in
N .</p>
      <p>Dually, the problem of selection is: given a set of patterns P and a formula ' of T ,
produce the corresponding set of patterns when the analysis is restricted to those tuples
satisfying the condition '.</p>
      <p>The rest of the paper is organized as follows. In Section 2 we introduce the basic
notions of formal concept analysis and describe existing work about the general topic of
pattern management. Sections 3 and 4 present different ways to conduct a projection on
a set of attributes, namely projection on contexts and concept lattices. An experimental
study is given in Section 5.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Background and Related Work</title>
      <p>
        Formal Concept Analysis [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has been successfully used for conceptual clustering and
rule mining. A formal context is a triple K := (G; M; I) where G, M and I stand for a
set of objects, a set of attributes, and a binary relation between G and M respectively.
For A G and B M we define
      </p>
      <p>A0 := fa 2 M j oIa 8o 2 Ag
and</p>
      <p>
        B0 := fo 2 G j oIa 8a 2 Bg;
the set of attributes common to objects in A and the set of objects sharing all the
attributes in B. The mapping (denoted by 0) between the powerset of G and the
powerset of M defines a Galois connection, and the induced closure operators (on G and
M ) are denoted by 00. A formal concept c is a pair (A; B) with A G, B M ,
A = B0 and B = A0. A is called the extent of c (denoted by ext(c)) and B its intent
(denoted by int(c)). Intents are then closed subsets of attributes and extents are closed
subsets of objects. In the closed itemset mining framework [
        <xref ref-type="bibr" rid="ref12 ref8">8, 12</xref>
        ], G, M , A and B
correspond to the notion of transaction database, set of items (products), closed tidset and
closed itemset respectively. The set of all concepts of K is denoted by B(K). Ordered
by (A; B) (C; D) : () A C, it forms a complete lattice3 called the concept
lattice of K and denoted by B(K). Our working example is on Figure 1.
      </p>
      <p>
        In order to manipulate concept lattices in a similar way as relational tables, we take a
joint database-FCA perspective (see [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) by using operators similar in spirit to relational
algebra operators (e.g., selection, projection and join) and by exploiting and adapting
existing work related to operations on contexts and concept lattices to analyze and
formalize such operators. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], the authors present an approach for lattice construction
based on the apposition of two contexts K1 and K2 (having the same set of objects).
Such operation is perceived as the opposite operation of the projection and identical to
the relational join operation.
      </p>
      <p>
        Recent studies on pattern management [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ] provide a uniform framework to data
and pattern management and define links between data and pattern spaces through
3 This is a poset in which every subset X has an infimum (VX) and a supremum (WX). We set
a ^ b := Vfa; bg and a _ b := Wfa; bg.
bridging operations and cross-over queries such as finding data covered by a given
pattern or identifying patterns related to a data set. Although many studies limit the
management of patterns to association rules only, work conducted by Calders et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and
Terrovitis et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] cover different types of patterns. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a pattern base management
system is defined for storing, processing and querying patterns. Moreover, languages
for pattern definition and manipulation are proposed, and temporal aspects of patterns
are handled. In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a data mining algebra and a 3-World model are defined, as well as a
small set of data mining primitive operators are proposed to further formulate complex
queries. The proposed model includes three worlds: D-World for data definition and
manipulation (e.g., projection, join), I-World for region (set of constraints) definition
and manipulation, and E-World for operations on data contained in regions. In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a
Data Mining Template Library is defined based on a generic data mining approach for
controlling aspects of pattern mining through a set of properties. On the industry side,
work was mainly done to design languages for pattern description, manipulation and
exchange. This is the case of PMML and SQL/MM [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        To the best of our knowledge, the only previous work on restricting concept sets via
the projection or the selection is the one by Jeudy et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] in which the authors define
a graph representation similar in spirit to Hasse diagrams. The proposed procedure for
projection requires four times traversals of the graph structure (as opposed to two times
for the concept sorting algorithm) and hence is less efficient than our solution. The
lattice resulting from the projection contains in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] only the actual nodes while in our
case it contains all the members of generated equivalence classes with a highlight on
the maximal concept representatives. This could help “switch” easily from the projected
lattice to the initial one (and vice versa) instead of reconstructing the initial lattice from
scratch.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Pattern Restriction in Databases</title>
      <p>A (relational) database is a collection of related tables that can be combined via
relational operations to produce new ones. The main operations are selection, projection
and join. In this paper we consider only the first two operations which are frequently
used unary relational operations. We also assume that T is a (real or virtual) table with
primary key values in G. We denote by M the set of attributes (other than the primary
key) and by W the set of values of attributes in M . Then, the table T is a many-valued
context (G; M; W; I), where I is a map from G M to W such that I(g; m) := m(g),
i.e., the value of the attribute m for the tuple with key-value g.</p>
      <p>For a subset N of M , a projection N on (G; M; W; I), gives the many-valued
context (G; N; WN ; IN ) with IN : G N ! WN W such that IN (g; n) := I(g; n) =
n(g). In practice WN = Sfn(g) j n 2 N; g 2 Gg. Thus, conducting the
projection N on a set P of patterns mined from (G; M; W; I) is equivalent to producing
the corresponding patterns directly from (G; N; WN ; IN ). We denote by N (P) the
so-obtained set of patterns, and called it the projection of P on N . Given P from a
database T , a naive and straightforward way to obtain N (P) will be to conduct the
projection N on (G; M; W; I), and then rerun all the steps done to produce P, but this
time on (G; N; WN ; IN ).</p>
      <p>For a formula ', the selection ' produces a subcontext (G'; M; W'; I'), where
G' is the set of objects in G satisfying ', and I' : G' W with
M ! W'
I'(g; m) := I(g; m)
and</p>
      <p>W' :=
[ fm(g) j g j= 'g:
m2M
Applied to a set of patterns P produced from (G; M; W; I), we get '(P), the set of
patterns that we should mine from (G'; M; W'; I') by applying the same techniques
used to produce P from (G; M; W; I). As above, given P from a database T , one way
to obtain '(P) will be to conduct the selection ' on (G; M; W; I), and then repeat
all the steps done to produce P, but this time on (G'; M; W'; I').</p>
      <p>Conducting restrictions on raw data is however not always possible. Indeed, it may
happen for some reason (e.g., storage constraint) that raw data from which the patterns
were produced are no more available. Therefore, there is a need to explore means to
either produce restricted patterns without returning to raw data, or explore the way to
recover raw data from patterns. In this paper we focus on the first issue and investigate
the way to produce restricted patterns directly from the already existing patterns.</p>
      <p>The problem of restriction on patterns can be summarized in the commutativity of
the diagram on Figure 2 below. Indeed, the analyst can get a pattern set from data using
one of the two ways: (i) conduct a restriction on data followed by a data mining task, or
(ii) handle a data mining task followed by a restriction on produced patterns.</p>
      <p>In inductive databases and pattern management applications, there is no conceptual
difference between data and patterns generated from that data: both can be stored and
queried according to user’s needs and preferences in a uniform way. In the upcoming
sections, we show how a restriction of a lattice to a set of attributes can be conducted.</p>
      <p>The problem of projection on concepts can be stated as follows: given a concept
lattice B := B(G; M; I) and a subset N M of attributes, find the concept lattice</p>
      <p>Database o
Restriction</p>
      <p>Database</p>
      <p>mining
back to data
mining
/ Pattern set</p>
      <p>Restriction
/ Pattern set</p>
      <p>N B when the analysis is restricted to the attributes in N . Similarly, the problem of
selection on concepts is to find the corresponding concept lattice when the analysis is
restricted to a subset of objects. Assuming that we have access to data from which the
lattice was produced 4, one may perform the projection on the data and scale to get
binary contexts and their corresponding concept lattices (projection on contexts). An
alternative is to analyze the effect of the projection on the initial concept lattice and get</p>
      <p>N B(G; M; I )) directly from B(G; M; I ) without returning to the raw data
(projection on concepts). It is interesting to check in which cases a projection on contexts is
more advantageous than a projection on concept lattices.
4</p>
    </sec>
    <sec id="sec-4">
      <title>Projection on Concept Lattices</title>
      <p>For simplicity we will assume that M is a set of scaled attributes. Given a concept
lattice B(G; M; I ) (denoted by B) and a list of attributes N M . The objective is to
produce the concept lattice N (B), where N is the projection on the attributes in N .</p>
      <p>The projection N induces an equivalence relation on B given by
(A1; B1) ' (A2; B2) : ()</p>
      <p>B1 \ N = B2 \ N:
Each equivalence class has a greatest element that can be set as the representative of
that class. The mapping
c : (A; B) 7!</p>
      <p>maxf(U; V ) 2 B j V \ N = B \ N g
is a closure operator on B. The intent of c(A; B) is exactly B \ N and its extent is the
one of the greatest concept in its equivalent class. Therefore, different scenarios can be
considered to find the equivalence classes on the lattices. For example, starting from the
top (or bottom), we can traverse the lattice level-wise and group together the elements
of the same class. The link (covering relation) between class representatives is set up
as follows: ci cj in N B(G; M; I ) iff there is a concept (A; B) in the equivalence
class ci and a concept (C; D) in the equivalence class cj such that (A; B) (C; D) in
B(G; M; I ).</p>
      <p>In the following subsections we present, analyze5 and compare the performance of
a selected set of methods for conducting a projection on concept lattices. The input is
4 When the access to the initial formal context is no more possible, then its reconstruction is
needed if we plan to apply the restriction to the context, and produce thereafter the lattice.
5 The complexity analysis presented here is a preliminary step towards understanding why some
methods perform better in a given configuration. Parameters taken into account include the
number of concepts, the parents/children of a node and the size of equivalence classes.
(1)
(2)
a concept lattice B(G; M; I) and a set of attributes N M . The output is a concept
lattice N (B(G; M; I)). Dually, a selection ' induces an equivalence relation on B
such that each equivalence class has a least element (class representative). Conducting
a selection ' is equivalent to conducting a projection of the dual lattice of B on H,
where H is the set of objects satisfying the formula '.
Conducting a depth-first scan for the projection of a concept lattice on a given set of
attributes can be done as follows
1. Set the first class with the top element 1. Start with that element and follow a path
to the bottom element 0. We get a path 1; a1; a2; : : : ; as; 0. At each node ai (going
downwards), choose a child ai+1 and test if the two nodes are in the same class
(by comparing the nodes ai and ai+1). If they do not belong to the same class, then
create a new membership class for ai+1.
2. Once the bottom element of the lattice is reached, go back to node as and repeat
the following steps at each subsequent node o:
– if the current node o has a child that is not yet marked, then move to this child,
mark it, and find its equivalence class.</p>
      <p>– else go back one step (to the parent from which the node o was reached).
3. Stop the process once all nodes are marked.</p>
      <p>The links between equivalence classes are set up as the scan progresses.</p>
      <p>To analyze the complexity of the procedure above, we consider the number of
accesses to each node and the number of comparisons. Note that each node is visited at
least twice (on the way down and back). If q is the number of equivalence classes, then
there are in average 2q comparisons to mark a node. In fact we have to check if the child
node to classify does not belong to an existing class. In practice, using the hierarchy on
concepts and the convexity6 of equivalence classes, this comparison reduces to classes
of neighboring nodes.
4.2</p>
      <sec id="sec-4-1">
        <title>Breadth-first Search (BFS)</title>
        <p>The idea here is to traverse the lattice from the top in a breadth-first manner, and assign
a different color to a node only if it is the greatest element of its equivalence class. This
avoids look-up into equivalence classes to see if the node has an already assigned color.
This can drastically improve the performance of the algorithm, particularly if there are
many classes. We assume that B(G; M; I) is represented as follows: each node e has
two lists parent(e) and child(e) containing respectively the upper neighbor and the
lower neighbor of e. We proceed as follows:
1. Start with node e := 1 (top element) and mark the representative of the first class.
2. Move to each node in child(e) and compare it with e. If it is not in the same class,
then mark it as a new class. Actually, there is only one way to reach nodes in
child(e) from above.
3. Assume we are at a (marked) current node e. Then we compare it with the
(unmarked) nodes in child(e). Mark with the same color the equivalent nodes. If there
is a node g 2 child(e) that is not equivalent to e, then check whether all nodes in
parent(g) are marked. If so, then a new color is assigned to this node. Else, g is
not marked. In practice, for each node o we keep in parent(o) only the parents that
are not yet marked.
4. Repeat the previous step until e is the bottom.</p>
        <p>To evaluate the complexity of this algorithm, we consider two parameters: the number
of needed comparisons and the number of times each node is accessed. Each node o is
visited exactly #parent(o) + 1 times. Then, the overall access to nodes is
X (#parent(o) + 1) = #B + X #parent(o):
o2B
o2B
If the average number of parents of nodes is p, then the overall number of node accesses
is (p + 1)n, where n is the number of concepts in B. Each node o is compared with all
its lower neighbors. Therefore, the total number of comparisons is pn and the overall
complexity is O(np).
4.3</p>
      </sec>
      <sec id="sec-4-2">
        <title>Leading Bits Sort (LBS)</title>
        <p>
          We have noticed that traversing the lattice to discover equivalence classes and group
the concepts into these classes can be very time consuming. To speed up this process,
we linearly order the nodes so that the equivalence classes can be obtained in a more
efficient way. The benefit of this order is that it avoids a look up in the list of classes and
each node is then visited only once to assign the color corresponding to its equivalence
6 The equivalence classes are convex subsets of the lattice in the sense that if x and z are
equivalent concepts and y is a concept between x and z, then x, y and z are also equivalent.
class. We choose the lectic order as in Next-closure [
          <xref ref-type="bibr" rid="ref3 ref4">4, 3</xref>
          ]. Each subset of attributes is
represented by a sequence of bits (1/0) of fixed length jM j (the cardinality of the set
of attributes). The leading bits are labeled by the attributes in N on which we restrict
the analysis. The remaining bits are labeled by the rest of the attributes (non-relevant to
the present analysis). The lectic order on subsets of M states that A precedes B if the
first position in which A and B differ contains 0 in A and 1 in B, (i.e. the first element,
according to a predefined linear order on M , that distinguishes A and B, is in B). This is
a linear strict order. If all intents are sorted with respect to the lectic order with leading
bits headed by the (relevant) attributes in N , then the equivalent concepts/intents are
necessarily consecutive. In this case we start from the first intent and move downwards
until we get an intent whose leading bits differ from the current one, representing the
current class. Then we start a new class and repeat this until we reach the last element
of the list.
        </p>
        <p>The projection done this way can be divided into two steps. The sorting process
with respect to the lectic order can be done in O(n ln(n)), where n is the number of
concepts in B. The marking of equivalence classes on B is straightforward since there
is one linear pass in the linearly sorted set of concepts. Thus, the overall process has a
complexity of O(n ln(n)). Note that in this case we do not assume any special
representation for B. All we need is the list of concepts so that we can color the equivalence
classes.</p>
        <p>
          If user’s access behavior (both in terms of data and patterns) is known, then the
candidate attributes of the projection may be identified beforehand. If we choose
Nextclosure [
          <xref ref-type="bibr" rid="ref3 ref4">4, 3</xref>
          ] to compute the concepts, then the lectic order will be defined with the
attributes in N at the first positions. This way, the sorting step is no more necessary
since the intents are already sorted in lectic order with leading attributes in front and
less interesting attributes at the end.
        </p>
        <p>In practice the sorting step can be restricted to the jN j leading bits. In fact, we
can define a quasi-order vN on the intent (subsets of M ) as follows: we start with
sorting the base set M in a linear order so that every element of N appears before every
element of M n N , for example M = fm1 &lt; &lt; mn &lt; mn+1 &lt; &lt; mkg,
where N = fm1; mng is the set of attributes on which the projection is to be done.
For two intents A and B, we say that A vN B if and only if A and B have the same
restriction on N or the first element (with respect to the linear order on the base set M )
which distinguishes A \ N from B \ N is in B. The equivalence relation induced by
vN is exactly the one induced by N (see Equation (1) above).
4.4</p>
      </sec>
      <sec id="sec-4-3">
        <title>Bottom-up Search (BUS)</title>
        <p>
          In this method, the idea is to iteratively remove (one by one) the attributes found in the
complement of N and more precisely in N 00 n N . This is in fact the dual procedure
of incremental lattice update procedures [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. To accelerate the process, we start the
exploration of the lattice (upwards from the bottom) with the most general concept c,
whose intent contains N (i.e. c = (N 0; N 00)). In fact, every concept of this lattice has
an equivalent concept in the filter generated by c. Therefore we restrict the search space
to "c.
        </p>
        <p>There are two possibilities: if the concept c has exactly N as intent, then the ideal
generated by c is exactly the equivalence class of 0. Moreover the output of the
projection is the filter "c. If N is not an intent, then the attributes that are in N 00 n N will be
deleted one by one from the intent of concepts in the filter "c. The nodes that have a
same intersection of their intent with N will collapse, leading to a unique concept. The
links between nodes are updated as the exploration progresses. This methods is faster
when the number of attributes to be removed is small or when N is an intent.</p>
        <p>As an illustration, the projection of the lattice shown in Figure 1 on N = fa; b; f g
is done as follows: the most general concept whose intent is larger than N is identified:
c = (f3g; fa; b; e; f; g; Sg). The search space is now the filter "c (with 14 concepts
instead of 26 concepts if the search was to be done on the whole lattice). Then, the
attributes in N 00 n N = fe; g; Sg will be removed one by one from the intents of the
nodes in "c. While removing attribute e from node intents in "c, the concepts (G; ;)
and (f1; 2; 3; 4g; feg) collapse as well as (f1; 2; 3g; fa; eg) and (f1; 2; 3; 5; 6g; fag).
The node (f2; 3g; fa; e; f g) changes to (f2; 3g; fa; f g) but does no collapse with any
other node of "c.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Empirical Study</title>
      <p>In order to compare the performance of the proposed algorithms we have conducted a
set of empirical tests on a Windows XP-based system with 3 GB memory and 1.9 GHz
processor using a Java implementation of the algorithms. The tests aim to compare
the execution time of the four proposed algorithms on four different sizes of concept
lattices. Each concept lattice is produced from a formal context of 50 objects and 50
attributes with different densities. For space limitation, we show the results for only two
lattices of 71 114 (density = 50%) and 234 946 concepts (density = 60%). The execution
time of the projection on the initial context is also provided. Moreover, a variation in
the size of the set N of projection attributes is considered in order to identify the cases
when a projection on concepts is more efficient than a projection on contexts. Each
experiment is conducted twice and the average values of execution time and memory
consumption are stored.</p>
      <p>In Figure 4 the x-axis represents the percentage of the projection attributes and
the y-axis expresses the execution time in seconds (right) or the memory usage in
megabytes (left) in logarithmic scale for five projection procedures: projection on
context (CTX), depth-first search (DFS), breadth-first scan (BFS), leading bits concept
sort (LBS), and bottom-up search (BUS). One can see that conducting the projection
of a concept lattice onto a set N of attributes through a bottom-up exploration with node
pruning becomes an interesting alternative as the number of attributes in N increases
and mainly when their proportion exceeds 50% of the initial set of attributes. In fact the
number of attributes to be removed decreases when the number of projection attributes
increases. The two top-down algorithms (DFS and BFS) have similar performance
values, with a slight advantage in favor the breath-first scan. The leading bits concept sort
is more efficient than DFS and BFS. The gap between the three algorithms increases
as the number of the projection attributes increases and the density increases (leading
to a larger number of children/parents per node). As one can expect, the projection of a</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>
        As indicated earlier, the present work is intimately related to the general and important
problem of pattern management in inductive databases and knowledge discovery
applications. In this paper we have described the general operation of restriction of concept
lattices in order to restrict the analysis of a pattern set to either a set of objects or a set
of attributes. We focused on the projection of a pattern set onto a set of attributes but the
work can be easily adapted to handle a selection on a set of objects. We have proposed,
implemented and tested a set of procedures for the projection of concept lattices on a set
of attributes and analyzed their performance both theoretically and empirically. The first
three methods scan the lattice in a top-down manner without any graph pruning using
either a depth-first search, a breadth-first scan or exploiting concept sorting. The fourth
one uses a bottom-up search and node pruning to focus on most promising concepts.
As we mentioned before the methods perform better that the one in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The proposed
procedure in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] for projection requires four times traversals of the graph structure (as
opposed to two times for the concept sorting algorithm) and hence is less efficient than
our procedures. Although our solution needs more space to store all the nodes of the
initial lattice (see the two alternate representations of the output in Figure 3), it has
the merit to help “switch” easily from the projected lattice to the initial one (and vice
versa) instead of reconstructing the initial lattice from scratch. Due to space limitation
we could not include the detailed steps of the algorithms. The problem we have
considered in this paper can be seen as a preliminary step towards the more general following
issue:
      </p>
      <p>Given a set of patterns P (e.g., implications) and a restriction r, is it convenient
to return to the source data in the formal context K and conduct the restriction
r on it or apply r directly on P?</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgment</title>
      <p>The second and last authors acknowledge the financial support of the Natural Sciences
and Engineering Research Council of Canada (NSERC). All the authors would like to
thank the anonymous referees for their very helpful comments and suggestions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Toon</given-names>
            <surname>Calders</surname>
          </string-name>
          ,
          <string-name>
            <surname>Laks</surname>
            <given-names>V. S.</given-names>
          </string-name>
          <string-name>
            <surname>Lakshmanan</surname>
            , Raymond T. Ng, and
            <given-names>Jan</given-names>
          </string-name>
          <string-name>
            <surname>Paredaens</surname>
          </string-name>
          .
          <article-title>Expressive power of an algebra for data mining</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>31</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1169</fpage>
          -
          <lpage>1214</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Vineet</given-names>
            <surname>Chaoji</surname>
          </string-name>
          , Mohammad Al Hasan,
          <article-title>Saeed Salem, and Mohammed Javeed Zaki. An integrated, generic approach to pattern mining: data mining template library</article-title>
          .
          <source>Data Min. Knowl. Discov.</source>
          ,
          <volume>17</volume>
          (
          <issue>3</issue>
          ):
          <fpage>457</fpage>
          -
          <lpage>495</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Bernhard</given-names>
            <surname>Ganter</surname>
          </string-name>
          .
          <article-title>Two basic algorithms in concept analysis</article-title>
          .
          <source>In Le´onard Kwuida and Baris Sertkaya</source>
          , editors,
          <source>ICFCA</source>
          , volume
          <volume>5986</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>312</fpage>
          -
          <lpage>340</lpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Bernhard</given-names>
            <surname>Ganter</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rudolf</given-names>
            <surname>Wille</surname>
          </string-name>
          .
          <source>Formal Concept Analysis: Mathematical Foundations</source>
          . Springer-Verlag New York, Inc.,
          <year>1999</year>
          . Translator-C.
          <year>Franzke</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Baptiste</given-names>
            <surname>Jeudy</surname>
          </string-name>
          , Christine Largeron, and
          <article-title>Franc¸ois Jacquenet. A model for managing collections of patterns</article-title>
          .
          <source>In SAC '07: Proceedings of the 2007 ACM symposium on Applied computing</source>
          , pages
          <fpage>860</fpage>
          -
          <lpage>865</lpage>
          , New York, NY, USA,
          <year>2007</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Heikki</given-names>
            <surname>Mannila</surname>
          </string-name>
          .
          <article-title>Theoretical frameworks for data mining</article-title>
          .
          <source>SIGKDD Explorations</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>30</fpage>
          -
          <lpage>32</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Rokia</given-names>
            <surname>Missaoui</surname>
          </string-name>
          , Le´onard Kwuida, Mohamed Quafafou, and
          <string-name>
            <given-names>Jean</given-names>
            <surname>Vaillancourt</surname>
          </string-name>
          .
          <article-title>Algebraic operators for querying pattern bases</article-title>
          .
          <source>CoRR, abs/0902.4042</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Nicolas</given-names>
            <surname>Pasquier</surname>
          </string-name>
          , Rafik Taouil, Yves Bastide, Gerd Stumme, and
          <string-name>
            <given-names>Lotfi</given-names>
            <surname>Lakhal</surname>
          </string-name>
          .
          <article-title>Generating a condensed representation for association rules</article-title>
          .
          <source>J. Intell. Inf. Syst.</source>
          ,
          <volume>24</volume>
          (
          <issue>1</issue>
          ):
          <fpage>29</fpage>
          -
          <lpage>60</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Manolis</given-names>
            <surname>Terrovitis</surname>
          </string-name>
          , Panos Vassiliadis, Spiros Skiadopoulos, Elisa Bertino, Barbara Catania, Anna Maddalena, and
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Rizzi</surname>
          </string-name>
          .
          <article-title>Modeling and language support for the management of pattern-bases</article-title>
          .
          <source>Data Knowl. Eng.</source>
          ,
          <volume>62</volume>
          (
          <issue>2</issue>
          ):
          <fpage>368</fpage>
          -
          <lpage>397</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Petko</given-names>
            <surname>Valtchev</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rokia</given-names>
            <surname>Missaoui</surname>
          </string-name>
          .
          <article-title>Building concept (galois) lattices from parts: Generalizing the incremental methods</article-title>
          .
          <source>In ICCS</source>
          , pages
          <fpage>290</fpage>
          -
          <lpage>303</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Petko</surname>
            <given-names>Valtchev</given-names>
          </string-name>
          , Rokia Missaoui, and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Lebrun</surname>
          </string-name>
          .
          <article-title>A partition-based approach towards constructing galois (concept) lattices</article-title>
          . Discrete Math.,
          <volume>256</volume>
          (
          <issue>3</issue>
          ):
          <fpage>801</fpage>
          -
          <lpage>829</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. Mohammed Javeed Zaki and
          <string-name>
            <surname>Ching-Jiu Hsiao</surname>
          </string-name>
          .
          <article-title>Charm: An efficient algorithm for closed itemset mining</article-title>
          .
          <source>In Proceedings of the Second SIAM International Conference on Data Mining</source>
          , Arlington,
          <string-name>
            <surname>VA</surname>
          </string-name>
          , USA, April
          <volume>11</volume>
          -
          <issue>13</issue>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>