<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Advances in algorithms based on CbO</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Petr Krajca</string-name>
          <email>krajcap@inf.upol.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Outrata</string-name>
          <email>jan.outrata@upol.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vilem Vychodil?</string-name>
          <email>vilem.vychodil@upol.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Palacky University</institution>
          ,
          <addr-line>Olomouc</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
          <addr-line>Tr. 17. listopadu 12, 771 46 Olomouc</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <fpage>325</fpage>
      <lpage>337</lpage>
      <abstract>
        <p>The paper presents a survey of recent advances in algorithms for computing all formal concepts in a given formal context which result as modi cations or extensions of CbO. First, we present an extension of CbO, so called FCbO, and an improved canonicity test that signi cantly reduces the number of formal concepts which are computed multiple times. Second, we outline a parallel version of the proposed algorithm and discuss various scheduling strategies and their impact on the overall performance and scalability of the algorithm. Third, we discuss important data preprocessing issues and their in uence on the algorithms. Namely, we focus on the role of attribute permutations and present experimental observations about the e ciency of the proposed algorithms with respect to the number of inversions in such permutations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The major issue of widely-used algorithms for computing formal concepts,
including CbO [12{14], NextClosure [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], or UpperNeighbor [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], is that some
concepts are computed multiple times which brings signi cant overhead. This
paper deals with various ways to reduce the overhead. Notice that recently an
increasing attention has been paid to various modi cations of CbO, see [
        <xref ref-type="bibr" rid="ref1 ref10 ref17">1, 10,
17</xref>
        ].
      </p>
      <p>
        This paper presents a survey of recent advances in three interconnected areas.
First, we present an algorithm called FCbO which achieves better performance
than CbO by reducing the total number of formal concepts that are computed
multiple times. The reduction is achieved by introducing an additional
canonicity test which e ectively prunes the CbO tree during the computation. Second,
we elaborate on issues related to parallel execution of FCbO. We have already
proposed a parallel variant of CbO, so-called PCbO [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In this paper, we
propose an analogous parallelization of FCbO and we discuss various workload
distribution strategies that may have impact on the overall performance of such
parallelization. Third, we focus on data preprocessing|an important issue that
is often underestimated. Namely, some algorithms for FCA (including those from
the CbO family) achieve better performance if attributes are processed in
particular order. In this paper, we present a preliminary study of the role of attribute
permutations on the performance of CbO and the derived algorithms.
Notation Throughout the paper, X = f0; 1; : : : ; mg and Y = f0; 1; : : : ; ng are
nite nonempty sets of objects and attributes, respectively, and I X Y is an
incidence relation. The triplet hX; Y; Ii is a formal context. The concept-forming
operators induced by I will be denoted by "I : 2X ! 2Y and #I : 2Y ! 2X ,
respectively, see [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for details. We assume that reader has knowledge of basic
algorithms for FCA.
2
      </p>
      <p>FCbO: Fast Close-by-One with New Canonicity Test
In this section we brie y describe the new canonicity test and a new algorithm
derived from CbO which uses this test. Recall that the original canonicity test
used by CbO (and NextClosure) is always used after a new formal concept is
computed. For B Y and j 62 B, one checks whether</p>
      <p>B \ Yj = D \ Yj ; where D = (B [ fjg)#I "I
and Yj = fy 2 Y j y &lt; jg. FCbO employs an additional test that is performed
before D is computed, eliminating thus the computation of #I "I . Notice that (1)
fails iff B j 6= ;, where</p>
      <p>
        B
j = (D n B) \ Yj = (B [ fjg)#I "I n B
\ Yj :
(1)
(2)
The new canonicity test exploits the fact that if (1) fails for given B and j 62 B,
the monotony of #I "I yields that the test will also fail for each B0 B such that
j 62 B0 and B j * B0. The conclusion can be done without computing D. If
B j B0, we are still compelled to perform the original canonicity test. Thus,
the new (additional) canonicity test is based on the following assertion:
Lemma 1 (See [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]). Let B Y , j 62 B, and B ; j 6= ;. Then, for each
B0 B such that j 62 B0 and B ; j 6 B0, we have B0 ; j 6= ;. tu
      </p>
      <p>FCbO can be seen as an extended version of CbO in that we propagate the
information about sets (2) which take part in the new test. The information is
propagated in the top-down direction. In order to apply the new test, we have
to change the search strategy of the algorithm from the depth- rst search (as it
is in CbO) to a combined depth- rst and breadth- rst search. FCbO is
represented by a recursive procedure FastGenerateFrom, see Algorithm 1, which
accepts three arguments: a formal concept hA; Bi (an initial formal concept), an
attribute y 2 Y ( rst attribute to be processed), and a set fNy Y j y 2 Y g
of subsets of attributes Y the purpose of which is to carry information about
sets (2). Each invocation of FastGenerateFrom has its own local queue used
to store information about computed concepts. Unlike CbO, if the canonicity
tests succeed (line 7 and line 10), we do not call FastGenerateFrom
recursively but we store information about the concept in the queue (line 11). After
each attribute is processed, we perform the recursive calls, see lines 17{19. The
new canonicity test is performed in line 7 based on information stored in Nj 's.
The original canonicity test is performed in line 10. If the test in line 10 fails, we
update the contents of Mj , see line 13. Note that Mj 's can be seen as local copies</p>
      <p>Algorithm 1: Procedure FastGenerateFrom(hA; Bi, y, fNy j y 2 Y g)
1 list hA; Bi // concept hA; Bi is processed, e.g., listed or stored
// check halting condition of the current call
2 if B = Y or y &gt; n then
3 return
4 end</p>
      <p>// process all attributes beginning with y
5 for j from y upto n do
6 set Mj to Nj // Mj is a pointer to Nj
// perform new canonicity test
7 if j 62 B and Nj \ Yj B \ Yj then</p>
      <p>// compute new concept hC; Di = hA \ fjg#I ; (B [ fjg)#I"I i
8 set C to A \ fjg#I
9 set D to C"I
// perform original canonicity test
if B \ Yj = D \ Yj then
// store new concept for further processing
put hhC; Di; ji to queue
13
14 end
15 end
16 end</p>
      <p>// perform recursive calls of FastGenerateFrom
17 while get hhC; Di; ji from queue do
18 FastGenerateFrom(hC; Di; j + 1; fMy j y 2 Y g)
19 end</p>
      <p>// terminate current call
20 return
// update information about implied attributes
set Mj to D // Mj becomes a pointer to D
of Nj's which are used as the third argument for consecutive calls of
FastGenerateFrom. Sets Nj are used instead of (2) because it is actually easier (and
more e cient) to maintain a set of pointers to intents than to compute (and
allocate memory for) sets (2) during the computation.</p>
      <p>
        FCbO is correct: when invoked with h;#I ; ;#I"I i, y = 0, and fNy = ; j y 2 Y g,
Algorithm 1 lists all formal concepts in hX; Y; Ii in the same order as CbO, each
of them exactly once. Let us note that FCbO can be turned into a \Fast
NextClosure" (i.e., an algorithm that lists concepts in the lexicographical order [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) by
either (i) using a stack instead of a queue or by (ii) modifying the loop in line 5
so that it goes \from n downto y". See [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] for further details on FCbO.
Example 1. Consider a context with X = f0; : : : ; 3g, Y = f0; : : : ; 5g, and I =
fh0; 0i; h0; 1i; h0; 2i; h1; 0i; h1; 2i; h1; 3i; h1; 4i; h1; 5i; h2; 0i; h2; 1i; h2; 4i; h3; 1i; h3; 2ig.
This formal context induces 12 formal concepts denoted C1; : : : ; C12. In case of
both CbO and FCbO, the computation can be depicted by a tree. Moreover, a
hC10, 2i
3 5 4 2
C5
      </p>
      <p>C5</p>
      <p>hC1, 0i
1 3 5 4 2</p>
      <p>0
C8</p>
      <p>C8</p>
      <p>C9
hC12, 3i</p>
      <p>
        Experimental Evaluation We have evaluated FCbO and compared CbO and
FCbO using various real data sets and arti cial data sets. The impact of the new
canonicity test is presented in Table 1 comparing the total numbers of closures
computed by CbO and FCbO in selected benchmark data sets [
        <xref ref-type="bibr" rid="ref2 ref8">2, 8</xref>
        ]. The table
includes numbers of concepts and ratios of the number of computed closures
to the number of (distinct) formal concepts in the data, i.e., the frequency of
successful canonicity tests. Apparently, FCbO has a higher rate of successful
canonicity tests than CbO. Thus, in terms of the number of computed closures,
FCbO is more e cient than CbO. Since the total number of computed closures
directly in uences the speed of the algorithm, FCbO is (usually) faster than
CbO [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The reduction of total time needed for computing all formal concepts
is apparent from Table 2. The table shows total time (in seconds) needed to
mushroom tic-tac-toe debian tags anon. web
size 8; 124 119 958 29 14; 315 475 32; 710 295
density 19 % 34 % &lt; 1 % 1 %
FCbO
      </p>
      <p>CbO</p>
      <p>NextClosure
UpperNeighbor</p>
      <p>
        Berry's [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
analyze the data sets. For the purpose of comparison the table contains also other
well-known algorithms. The experiments were performed on an Apple MacPro
computer equipped with two quad-core processors (Intel Xeon, 2.8 GHz) and
16 GB of RAM and all algorithms were implemented in ANSI C using bitarray
representation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Notice that in the worst case, FCbO collapses into CbO (e.g.,
in case of I being the inequality relation on X = Y ). FCbO is a polynomial
time-delay algorithm [
        <xref ref-type="bibr" rid="ref7 ref9">7, 9</xref>
        ] because the additional canonicity test has a linear
time-delay overhead compared to CbO, see [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] for further details on FCbO and
its performance.
3
      </p>
      <p>
        PFCbO: Parallel FCbO and Workload Distribution
This section is devoted to parallelization issues of FCbO. Recall that in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
we have described PCbO which results by a parallelization of CbO. FCbO can
be turned into a parallel algorithm in much the same way as the original CbO
can be turned into PCbO. Since the procedure of parallelization is fairly
similar to that presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], we focus mainly on issues that are not discussed
in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Namely, we compare several strategies to balance the workload
distribution among independent processors and compare their e ciency.
      </p>
      <p>
        Following the ideas from [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], a parallel variant of FCbO consists of three
stages: First, we compute and process all concepts that are derivable in less than
L steps. Second, we store all concepts derivable in exactly L steps in a new queue.
Third, we distribute concepts from the queue among P independent processors
and we let each of the processors compute the remaining concepts using FCbO.
Typically, each processor r has its own queue denoted queuer containing concepts
assigned to this processor. A parallel algorithm based on these ideas shall be
called Parallel Fast Close-byOne (PFCbO).
      </p>
      <p>
        Clearly, the practical e ciency of both PCbO and PFCbO depends on the
choice of the strategy that distributes concepts among processors during the
third step of the computation. The decision how to assign concepts to particular
queue is generally di cult since we do not know the distribution of formal
concepts in the search space of all formal concepts until we actually compute them
all and reveal the structure of the call tree. As a consequence, the distribution
of workload may be in some cases unbalanced. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], we have used a simple
round-robin principle which turned out to be reasonably e cient. Nevertheless,
there are other schemes of the workload distribution that can be considered:
(i) round-robin|concepts are distributed to queues attached to each processor,
in the way that n-th concept is placed into a queuer where r = (n mod P )+1
and P is the number of processors. For instance, if we consider P = 4 and
concepts C1; : : : ; C10, they are assigned to queues as follows:
queue1 = fC1; C5; C9g;
queue3 = fC3; C7g;
queue2 = fC2; C6; C10g;
queue4 = fC4; C8g:
(ii) zig-zag|this strategy is similar to the previous strategy but it uses a di erent
formula to determine the queuer. The queuer is given by
r = min n mod z; z
(n mod z) + 1
(3)
where z = 2 P + 1 assuming that P is number of processors. For P = 4
and concepts C1; : : : ; C10 the distribution of concepts is
queue1 = fC1; C8; C9g;
queue3 = fC3; C6g;
queue2 = fC2; C7; C10g;
queue4 = fC4; C5g:
r =
(iii) blocks|this workload distribution scheme divides the queue of all concepts
into chunks of approximately equal size and these \blocks of concepts" are
redistributed into the queues of independent processors. In this case, the
n-th concept is placed into queuer, where
(n P )
      </p>
      <p>Q
with P being the number of processors, Q being the number of all concepts,
and dxe being the usual ceiling function. For instance, in case of C1; : : : ; C10
(i.e., Q = 10) and four queues (i.e., P = 4), we get:
(4)
queue1 = fC1; C2g;
queue3 = fC6; C7g;
queue2 = fC3; C4; C5g;
queue4 = fC8; C9; C10g:
(iv) fair|all concepts remain stored in one shared queue and each processor
gets concepts from the queue one by one. The bene t of this scheme is that
it allows to react on the revealing structure of the call tree. On the other
hand, this method of distributing concepts requires synchronization among
processors while accessing this queue. Note that in contrast to the
abovedescribed schemes, this scheme has no xed structure and the workload is
distributed non-deterministically.
(v) random|the workload is spread among processors randomly. We are
considering this strategy to be referential and it is included for the purpose of
comparison.</p>
      <p>
        Experimental Evaluation In order to evaluate the strategies of workload
distribution, we have tested our algorithm for each strategy using various data sets
and various number of processors. Table 3 depicts the time needed to compute
all formal concepts using particular strategy. Surprisingly, there are only small
random (5000
random (10000
debian tags
anon. web
mushroom
tic-tac-toe
100 10)
100 15)
di erences among the considered schemes of the workload distribution, i.e., the
ordinary round-robin used in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is indeed adequate for the job. Nevertheless,
the fair strategy seems to be the most e cient. One can see that the round-robin
and zig-zag strategies provide performance slightly better than the random
workload distribution. On the other hand, the blocks scheme of distribution provides
performance even worse than the random distribution and seems to be
inappropriate for PFCbO.
4
      </p>
    </sec>
    <sec id="sec-2">
      <title>Data Preprocessing Issues</title>
      <p>
        Algorithms for computing concepts can be classi ed in many ways, see, e.g. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
An important attribute of algorithms for FCA is whether their performance
depends on the order of objects and attributes in the input data table.
Therefore, an algorithm for computing formal concepts shall be called
(permutation) resistant whenever all isomorphic copies of a formal context hX; Y; Ii with
Y = f0; 1; : : : ; ng require the same number of elementary computation steps in
order to compute all concepts. For our purposes, an elementary computation step
will be represented by computation of a single xpoint of the concept-forming
operators "I and #I .
      </p>
      <p>
        One can easily see that, e.g., Lindig's UpperNeighbor algorithm [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] is
resistant. On the other hand, CbO and FCbO are not resistant. Indeed, a di erent
order of attributes in a data table can yield di erent CbO and FCbO trees that
may have di erent numbers of nodes (notice that the loop in line 5 of
Algorithm 1 processes attributes from left to right). Since CbO and FCbO are not
resistant, a proper ordering of attributes before computation can further reduce
the number of concepts that are computed multiple times, thus improving the
e ciency. In this section, we investigate particular permutations of attributes
and explore the impact of inversions on the number of computed closures.
      </p>
      <p>In order to describe various formal contexts with respect to the structure of
the data table, we introduce a notion of an ordered formal context and inversion:
De nition 1. An ordered formal context is a formal context hX; Y; Ii where
Y = f0; : : : ; ng and for all attributes
jf0g#I j
jf1g#I j
jfng#I j:
(5)</p>
      <sec id="sec-2-1">
        <title>A pair of attributes hy1; y2i 2 Y</title>
        <p>an inversion.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Y such that jfy1g#I j 6 jfy2g#I j shall be called</title>
        <p>Verbally, the attributes in an ordered formal context are sorted in the
ascending order according to their support, i.e., the number of objects having
these attributes. As a consequence of the previous de nition, an ordered formal
context contains no inversions.</p>
        <p>From the point of view of formal concepts and concept lattices, the order of
objects and attributes in which they appear in the data table is not essential.
Therefore, one can reorder attributes in an arbitrary way. From the
computational point of view, however, it may happen that certain orderings of attributes
yield better results in conjunction with particular argorithms than other
orderings. In case of our algorithms, the order has an important impact on the process
of the execution of both CbO and FCbO since the canonicity test is driven by
the order of attributes. The following assertions show that for an ordered
formal context with pairwise distinct columns, the canonicity tests succeed for all
attribute concepts. We rst prove a technical claim:
Lemma 2. Let hX; Y; Ii be an ordered formal context with Y = f0; : : : ; ng.
Then, for each k; j 2 Y such that k &lt; j, we have k 2 fjg#I "I iff fkg#I = fjg#I .
Proof. Note that attributes from Y are integers and \&lt;" denotes the usual
strict linear order on the set of all integers. Suppose that k 2 fjg#I "I , i.e.,
fkg fjg#I "I . By the antitony of #I , we get fkg#I fjg#I "I #I = fjg#I . Thus, it
remains to show the converse inclusion. Since hX; Y; Ii is ordered and k &lt; j, we
get jfkg#I j jfjg#I j, see (5). Hence, jfkg#I j jfjg#I j and fkg#I fjg#I yield
fkg#I = fjg#I . Conversely, if fkg#I = fjg#I then obviously k 2 fjg#I "I , proving
the claim.</p>
        <p>Applying Lemma 2, we get:
Theorem 1. Let hX; Y; Ii be an ordered formal context where Y = f0; : : : ; ng
and fag#I 6= fbg#I for any a; b 2 Y . Then for each j 2 Y such that j 62 ;#I "I ,
;#I "I \ Yj = fjg#I "I \ Yj ;
where Yj = fy 2 Y j y &lt; jg.</p>
        <p>Proof. Take j 2 Y such that j 62 ;#I "I . Observe that condition (6) holds true iff
there is no attribute k 2 Y such that k 62 ;#I "I , k &lt; j, and k 2 fjg#I "I . Thus,
consider any k 2 Y such that k &lt; j. Since hX; Y; Ii is ordered, our assumption
j 62 ;#I "I yields k 62 ;#I "I . By the assumption, fkg#I 6= fjg#I , i.e., Lemma 2
yields k 62 fjg#I "I , nishing the proof. tu</p>
        <p>
          Theorem 1 shows that for an ordered formal context with pairwise distinct
columns, invocations of FastGenerateFrom in the rst level of recursion
always succeeds and generates concepts. Moreover, from the proof of Theorem 1
it follows that in any ordered formal context, the rst derivation [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] does not
exists for attribute j if there is an attribute k such that k &lt; j and fkg#I = fjg#I .
tu
(6)
16 · 106
CbO
FCbO
        </p>
        <p>FCbO
0</p>
        <p>This has a practical consequence for the parallel variants of CbO and FCbO
in case of ordered contexts, because it allows us to determine the number of
concepts generated during the rst stages of the algorithms. If the number of
attributes is signi cantly larger than the number of processors, and this condition
is usually ful lled, it is su cient to compute only the rst derivations and then
distribute the workload among all processors.</p>
        <p>
          Furthermore, our empirical experiments have shown an interesting tendency
that while processing ordered formal contexts, canonicity tests fail less frequently
than in case of contexts containing inversions. In addition, the experiments have
shown that with the increasing number of inversions in a data table, the average
number of computed closures grows. For instance, Fig. 2 shows how the number
of inversions in the mushroom data set a ects the total number of computed
closures. The rst graph (at the top) depicts this dependency for CbO and
FCbO. The second graph (at the bottom) provides a more detailed view for
FCbO. Similar tendency can be observed for other benchmark data sets.
Remark 1. Let us note that the ordering of attributes introduced by (5) has
already been used in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] but the purpose of the ordering in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] is di erent. In [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ],
the authors use this particular ordering of attributes in a parallel version of
Ganter's NextClosure algorithm to achieve soundness of the algorithm (each
0 · 107
normal
uniform
0
concept is then listed only once) while in our case, [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] is used for the sake of
increased e ciency.
        </p>
        <p>Remark 2. We have observed a general tendency that certain data sets are more
a ected by the above-discussed phenomenon than others. For instance, if 1's in
a data table are approximately uniformly spread among attributes (i.e., each
attribute has approximately the same support), the ordering of attributes (usually)
does not have a considerable e ect on decreasing the number of closures. Fig. 3
depicts how the increasing number of inversions a ects the number of computed
closures in two arti cial data sets where 1's are distributed (i) approximately
uniformly among the attributes and (ii) approximately normally among the
attributes. Both data sets have the same parameters in that they consist of 1000
objects, 100 attributes, and contain 15 % of 1's, however, the distributions of
1's among the attributes are quite di erent. As one can see from Fig. 3, the
impact of the number of inversions on the number of computed closures is more
signi cant in case of normally distributed 1's among the attributes.
Experimental Evaluation From our observations it follows that it is desirable
to incorporate a preprocessing step which transforms a formal context into a
corresponding ordered formal context. In order to evaluate the bene ts of this
preprocessing step, we have used similar approach as in case of evaluation of
mushroom tic-tac-toe debian tags anon. web
PFCbO (P = 1)
PFCbO (P = 2)
PFCbO (P = 4)
PFCbO (P = 8)</p>
        <p>PCbO (P = 1)
PCbO (P = 2)
PCbO (P = 4)
PCbO (P = 8)
the new canonicity test. We have focused on the total numbers of closures and
concepts computed by FCbO while processing various ordered and unordered
data sets. The results are presented in Table 4 which also includes corresponding
ratios.</p>
        <p>
          Apparently, reordering of attributes reduces the number of computed
closures, and thus, can reduce time of computation. Note that the Ganter's
algorithm [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ] is in principle equivalent to CbO. As such, it is also not permutation
resistant. Thus, the preprocessing step which reorders attributes can also
increase its performance.
5
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Overall Evaluation</title>
      <p>So far, we have proposed and evaluated several improvements and re nements of
the original CbO and PCbO algorithms, namely, new canonicity test, workload
distribution schemes, and reordering attributes. However, we have evaluated the
impact of each improvement separately. Therefore, we conclude this paper with
the evaluation of PFCbO which includes all these improvements.</p>
      <p>Table 5 table shows the total time (in seconds) needed to compute all
formal concepts in the benchmark data sets using PCbO and PFCbO run on the
Apple MacPro computer, equipped with eight processor cores. The parameter
P indicates the number of used processors for particular experiment.</p>
      <p>Fig. 4 demonstrates the scalability of PFCbO, i.e., the ability to decrease the
time of computation by using more processors. In the depicted two experiments,
we have used computer equipped with Sun UltraSPARC T1 processor having
eight cores (each capable to process up to 4 threads simultaneously) and 8 GB
of RAM. Fig. 4 (at the top) shows relative speed up for data sets having 10000
objects, 10 % density of 1's in the data table and various counts of attributes.
Fig. 4 (at the bottom) shows relative speed up for data sets having 1000 objects,
100 attributes, and various densities of 1's in the data tables. The 1's in both
data sets spread approximately normally among the attributes. Note that each
graph contains a certain point from which the increasing number of processors
does not allow to take advantage of more processors and performance of the
number of processors
100 attrs.
150 attrs.
200 attrs.
250 attrs.
100 attrs.
150 attrs.
200 attrs.
algorithm may even decline due to the overhead related to the management
of multiple threads of execution. However, this is a quite common behavior of
parallel algorithms.</p>
      <p>12
16
20
24
28</p>
      <p>32
number of processors</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Andrews</surname>
            <given-names>S.</given-names>
          </string-name>
          : In-Close,
          <article-title>a Fast Algorithm for Computing Formal Concepts</article-title>
          .
          <source>In: Rudolph</source>
          , Dau, Kuznetsov (Eds.):
          <source>Supplementary Proceedings of ICCS '09</source>
          ,
          <string-name>
            <surname>CEUR</surname>
            <given-names>WS</given-names>
          </string-name>
          483(
          <year>2009</year>
          ),
          <volume>14</volume>
          pages. http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-
          <volume>483</volume>
          /paper1.pdf
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Asuncion</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Newman</surname>
            <given-names>D.:</given-names>
          </string-name>
          <article-title>UCI Machine Learning Repository</article-title>
          . University of California, Irvine, School of Information and Computer Sciences,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Berry</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bordat</surname>
            <given-names>J.-P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sigayret</surname>
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A local approach to concept generation</article-title>
          .
          <source>Annals of Mathematics and Arti cial Intelligence</source>
          ,
          <volume>49</volume>
          (
          <year>2007</year>
          ),
          <volume>117</volume>
          {
          <fpage>136</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fu</surname>
            <given-names>H.</given-names>
          </string-name>
          , Mephu Nguifo E.
          <article-title>: A Parallel Algorithm to Generate Formal Concepts for Large Data</article-title>
          .
          <source>ICFCA</source>
          <year>2004</year>
          , LNCS 2961, pp.
          <volume>394</volume>
          {
          <fpage>401</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ganter</surname>
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Two basic algorithms in concept analysis</article-title>
          .
          <source>(Technical Report FB4- Preprint No. 831)</source>
          .
          <source>TH Darmstadt</source>
          ,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ganter</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Formal concept analysis</article-title>
          .
          <source>Mathematical foundations</source>
          . Berlin: Springer,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Goldberg L</surname>
          </string-name>
          . A.:
          <article-title>E cient Algorithms for Listing Combinatorial Structures</article-title>
          . Cambridge University Press,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hettich</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bay S</surname>
          </string-name>
          . D.: The UCI KDD Archive University of California, Irvine, School of Information and Computer Sciences,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Johnson D. S,
          <string-name>
            <surname>Yannakakis</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papadimitriou</surname>
            <given-names>C. H.</given-names>
          </string-name>
          :
          <article-title>On generating all maximal independent sets</article-title>
          .
          <source>Information Processing Letters</source>
          <volume>27</volume>
          (
          <issue>3</issue>
          )(
          <year>1988</year>
          ),
          <volume>119</volume>
          {
          <fpage>123</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Krajca</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Outrata</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vychodil</surname>
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Parallel Recursive Algorithm for FCA</article-title>
          . In: Belohlavek, Kuznetsov (Eds.):
          <source>Proc. CLA</source>
          <year>2008</year>
          ,
          <source>CEUR WS 433</source>
          (
          <year>2008</year>
          ),
          <volume>71</volume>
          {
          <fpage>82</fpage>
          . http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-
          <volume>433</volume>
          /paper6.pdf
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Krajca</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vychodil</surname>
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Comparison of data structures for computing formal concepts</article-title>
          .
          <source>In: Proc. MDAI</source>
          <year>2009</year>
          , LNAI
          <volume>5861</volume>
          (
          <year>2009</year>
          ),
          <volume>114</volume>
          {
          <fpage>125</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Interpretation on graphs and complexity characteristics of a search for speci c patterns</article-title>
          .
          <source>Automatic Documentation and Mathematical Linguistics</source>
          ,
          <volume>24</volume>
          (
          <issue>1</issue>
          )(
          <year>1989</year>
          ),
          <volume>37</volume>
          {
          <fpage>45</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A fast algorithm for computing all intersections of objects in a nite semi-lattice (Bystryi$ algoritm postroeni vseh pereseqenii$ obektov iz koneqnoi$ polurexetki</article-title>
          , in Russian).
          <source>Automatic Documentation and Mathematical Linguistics</source>
          ,
          <volume>27</volume>
          (
          <issue>5</issue>
          )(
          <year>1993</year>
          ),
          <volume>11</volume>
          {
          <fpage>21</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Learning of Simple Conceptual Graphs from Positive and Negative Examples</article-title>
          .
          <source>PKDD</source>
          <year>1999</year>
          , pp.
          <volume>384</volume>
          {
          <fpage>391</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Comparing performance of algorithms for generating concept lattices</article-title>
          .
          <source>J. Exp. Theor. Artif. Int.</source>
          ,
          <volume>14</volume>
          (
          <year>2002</year>
          ),
          <volume>189</volume>
          {
          <fpage>216</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lindig</surname>
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Fast concept analysis</article-title>
          .
          <source>Working with Conceptual Structures{ {Contributions to ICCS</source>
          <year>2000</year>
          , pp.
          <volume>152</volume>
          {
          <issue>161</issue>
          ,
          <year>2000</year>
          . Aachen: Shaker Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Outrata</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vychodil</surname>
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Fast algorithm for computing xpoints of galois connections induced by object-attribute relational data (in preparation).</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>