<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">NextClosures: Parallel Computation of the Canonical Base</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Francesco</forename><surname>Kriegel</surname></persName>
							<email>francesco.kriegel@tu-dresden.de</email>
							<affiliation key="aff0">
								<orgName type="department">Institute of Theoretical Computer Science</orgName>
								<orgName type="institution">TU Dresden</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Daniel</forename><surname>Borchmann</surname></persName>
							<email>daniel.borchmann@tu-dresden.de</email>
							<affiliation key="aff0">
								<orgName type="department">Institute of Theoretical Computer Science</orgName>
								<orgName type="institution">TU Dresden</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">NextClosures: Parallel Computation of the Canonical Base</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">51809C33262F9885294EB7992DFBDCA8</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T05:42+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Formal Concept Analysis</term>
					<term>Canonical Base</term>
					<term>Parallel Algorithms</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The canonical base of a formal context plays a distinguished role in formal concept analysis. This is because it is the only minimal base so far that can be described explicitly. For the computation of this base several algorithms have been proposed. However, all those algorithms work sequentially, by computing only one pseudo-intent at a time -a fact which heavily impairs the practicability of using the canonical base in real-world applications. In this paper we shall introduce an approach that remedies this deficit by allowing the canonical base to be computed in a parallel manner. First experimental evaluations show that for sufficiently large data-sets the speedup is proportional to the number of available CPUs.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The implicational theory of a formal context is of interest in a large variety of applications. In those cases, computing the canonical base of the given context is often desirable, as it has minimal cardinality among all possible bases. On the other hand, conducting this computation often imposes a major challenge, often endangering the practicability of the underlying approach.</p><p>There are two known algorithms for computing the canonical base of a formal context <ref type="bibr" target="#b6">[6,</ref><ref type="bibr" target="#b12">12]</ref>. Both algorithms work sequentially, i.e. they compute one implication after the other. Moreover, both algorithms compute in addition to the implications of the canonical base all formal concepts of the given context. This is a disadvantage, as the number of formal concepts can be exponential in the size of the canonical base. On the other hand, the size of the canonical base can be exponential in the size of the underlying context <ref type="bibr" target="#b10">[10]</ref>. Additionally, up to today it is not known whether the canonical base can be computed in output-polynomial time, and certain complexity results hint at a negative answer <ref type="bibr" target="#b3">[3]</ref>. For the algorithm from <ref type="bibr" target="#b6">[6]</ref>, and indeed for any algorithm that computes the pseudo-intents in a lectic order, it has been shown that it cannot compute the canonical base with polynomial delay <ref type="bibr" target="#b1">[2]</ref>.</p><p>However, the impact of theoretical complexity results for practical application is often hard to access, and it is often worth investigating faster algorithm for theoretically intractable results. A popular approach is to explore the possibilities to parallelize known sequential algorithms. This is also true for formal concept analysis, as can be seen in the development of parallel versions for computing the concept lattice of a formal context <ref type="bibr" target="#b5">[5,</ref><ref type="bibr" target="#b13">13]</ref>.</p><p>In this work we want to investigate the development of a parallel algorithm for computing the canonical base of a formal context K. The underlying idea is actually quite simple, and has been used by Lindig <ref type="bibr" target="#b11">[11]</ref> to (sequentially) compute the concept lattice of a formal context: to compute the canonical base, we compute the lattice of all intents and pseudo-intents of K. This lattice can be computed bottom up, in a level-wise order, and this computation can be done in parallel provided that the lattice has a certain "width" at a particular level. The crucial fact now is that the upper neighbours of an intent or pseudo-intent B can be easily computed by just iterating over all attributes m / ∈ B and computing the closure of B ∪ { m }. In the approach of Lindig mentioned above this closure is just the usual double-prime operation B → B II of the underlying formal context K. In our approach it is the closure operator whose closures are exactly the intents and pseudo-intents of K. Surprisingly, despite the simpleness of our approach, we are not aware of any prior work on computing the canonical base of a formal context in a parallel manner. Furthermore, experimental results presented in this work indicate that for suitable large data-sets the computation of the canonical base can be speed up by a factor proportional to the number of available CPUs.</p><p>The paper is structured as follows. After recalling all necessary notions of formal concept analysis in Section 2, we shall describe in Section 3 our approach of computing the canonical base in parallel. Benchmarks of this algorithm are presented in Section 4, and we shall close this work with some conclusions in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Preliminaries</head><p>This section gives a brief overview on the notions of formal concept analysis <ref type="bibr" target="#b7">[7]</ref> that are used in this document. The basic structure is a formal context K = (G, M, I) consisting of a set G of objects, a set M of attributes, and an incidence relation I ⊆ G × M. For a pair (g, m) ∈ I we also use the infix notation g I m, and say that the object g has the attribute m. Each formal context K induces the derivation operators • I : P(G) → P(M) and • I : P(M) → P(G) that are defined as follows for object sets A ⊆ G and attribute sets B ⊆ M:</p><formula xml:id="formula_0">A I := { m ∈ M | ∀g ∈ A: (g, m) ∈ I } and B I := { m ∈ M | ∀m ∈ B : (g, m) ∈ I } .</formula><p>In other words, A I is the set of all attributes that all objects from A have in common, and dually B I is the set of all objects which have all attributes from B. A formal concept of K is a pair (A, B) such that A I = B and B = A I , and the set of all formal concepts of K is denoted by B(K).</p><p>An implication over the set M is an expression of the form</p><formula xml:id="formula_1">X → Y where X, Y ⊆ M. An implication X → Y over M is valid in K if X I ⊆ Y I . A set L of implications over M is valid in K if each implication in L is valid in K. An implication X → Y follows from the set L if X → Y is valid in every context with attribute set M in which L is valid. Furthermore, a model of X → Y is a set T ⊆ M such that X ⊆ T implies Y ⊆ T . A model of L is a model of all implications in L,</formula><p>and X L is the smallest superset of X that is a model of L. The set X L can be computed as follows.</p><formula xml:id="formula_2">X L := n≥1 X Ln where X L1 := X ∪ { B | A → B ∈ L and A ⊆ X } and X Ln+1 := (X Ln ) L1 for all n ∈ N.</formula><p>The following lemma shows some well-known equivalent statements for entailment of implications from implication sets. We will not prove them here. Lemma 1. Let L ∪ { X → Y } be a set of implications over M. Then the following statements are equivalent:</p><formula xml:id="formula_3">1. X → Y follows from L. 2. If K is a formal context with attribute set M such that L is valid in K, then X → Y is also valid in K. 3. If T ⊆ M and T is a model of L, then T is a model of X → Y . 4. Y ⊆ X L . An attribute set B ⊆ M is called intent of K = (G, M, I) if B = B II . An at- tribute set P ⊆ M is called pseudo-intent of K if P = P II ,</formula><p>and furthermore for each pseudo-intent Q P the set inclusion Q II ⊆ P is satisfied. We denote the set of all pseudo-intents of K by PsInt(K). Then the canonical implicational base of K is defined as the following implication set:</p><formula xml:id="formula_4">{ P → P II | P ∈ PsInt(K) }.</formula><p>The canonical base has the property that it is a minimal base of K, i.e. it is a base of K, meaning that it is a set of valid implications of K such that every valid implication of K is entailed by it. Furthermore, its cardinality is minimal among all bases of K.</p><p>It is readily verified that a subset X ⊆ M is an intent or a pseudo-intent of K if and only if X is a closure of the closure operator K * that is defined as follows:</p><formula xml:id="formula_5">X K * := n≥1 X K * n where X K * 1 := X ∪ { P II | P ∈ PsInt(K) and P X } and X K * n+1 := (X K * n ) K * 1 for all n ∈ N.</formula><p>Of course, if L is the canonical base of K as described above, then both closure operators K * and L * coincide, where L * is defined by the following equations:</p><formula xml:id="formula_6">X L * := n≥1 X L * n where X L * 1 := X ∪ { B | A → B ∈ L and A X } and X L * n+1 := (X L * n ) L * 1 for all n ∈ N.</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Parallel Computation of the Canonical Base</head><p>The well-known NextClosure algorithm developed by Ganter <ref type="bibr" target="#b6">[6]</ref> can be used to enumerate the implications of the canonical base. The mathematical idea behind this algorithm is to compute all intents and pseudo-intents of our formal context K in a certain linear order, namely the lectic order. As an advantage the next (pseudo-)intent is uniquely determined, but we potentially have to do backtracking in order to find it. It can be seen quite easily that those sets form a complete lattice, and the NextClosure algorithm uses the closure operator K * of this lattice to enumerate the pseudo-intents of K in the lectic order. Furthermore, this algorithm is inherently sequential, i.e. it is not possible to parallelize it.</p><p>In our approach we shall not make use of the lectic order. Indeed, our algorithm will enumerate all intents and pseudo-intents of K in the subset-order, with no further restrictions. As a benefit we get a very easy and obvious way to parallelize this enumeration. Moreover, in multi-threaded implementations no communication between different threads is necessary. However, as it is the case with all other known algorithms for computing the canonical base, we also have to compute all intents in addition to all pseudo-intents of the given formal context.</p><p>The main idea is very simple and works as follows. From the definition of pseudointents we see that in order to decide whether an attribute set P ⊆ M is a pseudo-intent we only need all pseudo-intents Q P , i.e. it suffices to know all pseudo-intents with a smaller cardinality than P . This allows for the level-wise computation of the canonical base w.r.t. the subset inclusion order, i.e. we can enumerate the (pseudo-)intents w.r.t. increasing cardinality.</p><p>An algorithm that implements this idea works as follows. First we start by considering the empty set, as it is the only set with cardinality 0. Of course, the empty set must either be an intent or a pseudo-intent, and the distinction can be made by checking whether ∅ = ∅ II . Then assuming inductively that all pseudo-intents with cardinality &lt; k have been determined, we can correctly decide whether a subset P ⊆ M with |P | = k is a pseudo-intent or not.</p><p>To compute the lattice of intents and pseudo-intents of K the algorithm manages a set of candidates that contains the (pseudo-)intents on the current level. Then, whenever a pseudo-intent P has been found, the ⊆-next closure is uniquely determined by its closure P II in the context K. If an intent B has been found, then the ⊆-next closures must be of the form</p><formula xml:id="formula_7">(B ∪ { m }) K * , m / ∈ B.</formula><p>However, as we are not aware of the full implicational base of K yet, but only of an approximation L of it, the operators K * and L * do not coincide on all subsets of M. We will show that they yield the same closure for attribute sets B ⊆ M with a cardinality |B| ≤ k if L contains all implications P → P II where P is a pseudo-intent of K with cardinality |P | &lt; k. Consequently, the L * -closure of a set B ∪ { m } may not be an intent or pseudo-intent of K. Instead they are added to the candidate list, and are processed when all pseudo-intents with smaller cardinality have been determined. We will formally prove that this technique is correct. Furthermore, the computation of all pseudo-intents and intents of cardinality k can be done in parallel, since they are independent of each other.</p><p>In summary, we shortly describe the inductive structure of the algorithm as follows: Let K be a finite formal context. We use four variables: k denotes the current cardinality of candidates, C is the set of candidates, B is a set of formal concepts, and L is an implication set. Then the algorithm works as follows. In order to approximate the operator L * we furthermore introduce the following notion: If L is a set of implications, then L k denotes the subset of L that consists of all implications whose premises have a cardinality of at most k. Lemma 2. Let K = (G, M, I) be a formal context, L its canonical implicational base, and X ⊆ M an attribute set. Then the following statements are equivalent:</p><formula xml:id="formula_8">1. X is either an intent or a pseudo-intent of K. 2. X is K * -closed. 3. X is L * -closed. 4. X is (L |X|−1 ) * -closed. 5. There is a k ≥ |X| − 1 such that X is (L k ) * -closed. 6. For all k ≥ |X| − 1 it holds that X is (L k ) * -closed. Proof. 1⇔2. If X is an intent or a pseudo-intent, then it is obviously K * 1 -closed, i.e. K * -closed. Vice versa, if X is K * -closed,</formula><p>but no intent, then X contains the closure P II of every pseudo-intent P X, and hence X must be a pseudo-intent. 2⇔3. is obvious. 3⇔4. follows directly from the fact that</p><formula xml:id="formula_9">P X implies |P | &lt; |X|. 4⇔5. The only-if-direction is trivial. Consider k ≥ |X| − 1 such that X is (L k ) * -closed. Then X contains all conclusions B where A → B ∈ L is an implication with premise A X such that |A| ≤ k. Of course, A X implies |A| &lt; |X|,</formula><p>and thus X is (L |X|−1 ) * -closed as well. 4⇔6. The only-if-direction is trivial. Finally, assume that k ≥ |X| − 1 and X is (L |X|−1 ) * -closed. Obviously, there are no subsets A X with |X| ≤ |A| ≤ k, and so X must be (L k ) * -closed, too.</p><p>As an immediate consequence of Lemma 2 we infer that in order to decide the K * -closedness of an attribute set X it suffices to know all implications in the canonical base whose premise has a lower cardinality than X. Corollary 3. If L contains all implications P → P II where P is a pseudo-intent of K with |P | &lt; k, and otherwise only implications with premise cardinality k, then for all attribute sets X ⊆ M with |X| ≤ k the following statements are equivalent:</p><formula xml:id="formula_10">1. X is an intent or a pseudo-intent of K. 2. X is L * -closed.</formula><p>This corollary allows us in a certain sense to approximate the set of all K * -closures w.r.t. increasing cardinality, and thus also permits the approximation of the closure operator L * where L is the canonical base of K. In the following Lemma 4 we will characterise the structure of the lattice of all K * -closures, and also give a method to compute upper neighbours. It is true that between comparable pseudo-intents there must always be an intent. In particular, the unique upper K * -closed neighbour of a pseudo-intent must be an intent. Lemma 4. Let K be a formal context. Then the following statements are true:</p><p>1. If P ⊆ M is a pseudo-intent, then there is no intent or pseudo-intent strictly between P and P II . 2. If B ⊆ M is an intent, then the next intents or pseudo-intents are of the form</p><formula xml:id="formula_11">(B ∪ { m }) K * for attributes m ∈ B. 3. If X Y ⊆ M are neighbouring K * -closures, then Y = (X ∪ { m }) K * for all attributes m ∈ Y \ X. Algorithm 1 NextClosures (K) 1 k := 0, C := { ∅ }, B := ∅, L := ∅ 2 while k ≤ |M| do 3 for all C ∈ C with |C| = k do in parallel 4 C := C \ { C } 5 if C = C L * then 6 if C = C II then 7 L := L ∪ { C → C II } 8 B := B ∪ { (C I , C II ) } 9 C := C ∪ { C II ∪ { m } | m ∈ C II } 10 else 11 C := C ∪ { C L * } 12</formula><p>Wait for termination of all parallel processes. 13 k := k + 1 14 return (B, L) Proof. 1. Let P ⊆ M be a pseudo-intent of K. Then for every intent B between P and P II , i.e. P ⊆ B ⊆ P II , we have B = B II = P II . Thus, there cannot be an intent strictly between P and P II . Furthermore, if Q were a pseudo-intent such that P Q ⊆ P II , then P II ⊆ Q, and thus Q = P II , a contradiction. 2. Let B ⊆ M be an intent of K, and X ⊇ B an intent or pseudo-intent of K such that there is no other intent or pseudo-intent between them. Then We are now ready to formulate our algorithm NextClosures in pseudo-code, see Algorithm 1. In the remainder of this section we shall show that this algorithm always terminates for finite formal contexts K, and that it returns the canonical base as well as the set of all formal concepts of K. Beforehand, let us introduce the following notation:</p><formula xml:id="formula_12">B ⊆ B ∪ { m } ⊆ X for every m ∈ X \ B. Thus, B = B K * (B ∪ { m }) K * ⊆ X K * = X. Then (B ∪ { m }) K * is</formula><p>1. NextClosures is in state k if it has processed all candidate sets with a cardinality ≤ k, but none of cardinality &gt; k. 2. C k denotes the set of candidates in state k. 3. L k denotes the set of implications in state k. 4. B k denotes the set of formal concepts in state k. Proposition 5. Let K be a formal context, and assume that NextClosures has been started on K and is in state k. Then the following statements are true:</p><p>1. C k contains all pseudo-intents of K with cardinality k + 1, and all intents of K with cardinality k + 1 whose corresponding formal concept is not already in B k . 2. B k contains all formal concepts of K whose intent has cardinality ≤ k. 3. L k contains all implications P → P II where the premise P is a pseudo-intent of K with cardinality ≤ k. 4. Between the states k and k + 1 an attribute set with cardinality k + 1 is an intent or pseudo-intent of K if and only if it is L * -closed.</p><p>Proof. We prove the statements by induction on k. The base case handles the initial state k = −1. Of course, ∅ is always an intent or a pseudo-intent of K. Furthermore, it is the only attribute set of cardinality 0 and contained in the candidate set C. As there are no sets with cardinality ≤ −1, B −1 and L −1 trivially satisfy Statements 2 and 3. Finally, we have that L −1 = ∅, and hence every attribute set is L * −1 -closed, in particular ∅. We now assume that the induction hypothesis is true for k. For every implication set L between states k and k + 1, i.e. L k ⊆ L ⊆ L k+1 , the induction hypothesis yields that L contains all formal implications P → P II where P is a pseudo-intent of K with cardinality ≤ k, and furthermore only implications whose premises have cardinality k +1 (by definition of Algorithm 1). Additionally, we know that the candidate set C contains all pseudo-intents P of K where |P | = k +1, and all intents B of K such that |B| = k +1 and (B I , B) / ∈ B. Corollary 3 immediately yields the validity of Statements 2 and 3 for k + 1, as those K * -closures are recognized correctly in line 5. Then L k+1 contains all implications P → P II where P is a pseudo-intent of K with |P | ≤ k + 1, and hence each implication set L with L k+1 ⊆ L ⊆ L k+2 contains all those implications and furthermore only implications with a premise cardinality k + 2. By another application of Corollary 3 we conclude that also Statement 4 is satisfied for k + 1.</p><p>Finally, we show Statement 1 for k + 1. Consider any K * -closed set X where |X| = k + 2. Then Lemma 4 states that for all lower K * -neighbours Y and all m ∈ X \ Y it is true that (Y ∪ { m }) K * = X. We proceed with a case distinction.</p><p>If there is a lower K * -neighbour Y which is a pseudo-intent, then Lemma 4 yields that the (unique) next K * -neighbour is obtained as Y II , and the formal concept</p><formula xml:id="formula_13">(Y I , Y II ) is added to the set B in line 8. Of course, it is true that X = Y II .</formula><p>Otherwise all lower K * -neighbours Y are intents, and in particular this is the case for X being a pseudo-intent by Lemma 4. Then for all these Y we have</p><formula xml:id="formula_14">(Y ∪ { m }) K * = X for all m ∈ X \ Y . Furthermore, all sets Z with Y ∪ { m } Z X are not K * -closed.</formula><p>Since X \ Y is finite, the following sequence must also be finite:</p><formula xml:id="formula_15">C 0 := Y ∪ { m } and C i+1 := C L * i where L |Ci|−1 ⊆ L ⊆ L |Ci| .</formula><p>The sequence is well-defined, since implications from L |Ci| \L |Ci|−1 have no influence on the closure of C i . Furthermore, the sequence obviously ends with the set X, and contains no further K * -closed sets, and each of the sets C 0 , C 1 , . . . appears as a candidate during the run of the algorithm, cf. lines 9 and 11.</p><p>From the previous result we can infer that in the last state |M| the set B contains all formal concepts of the input context K, and that L is the canonical base of K. Both sets are returned from Algorithm 1, and hence we can conclude that NextClosures is sound and complete. The following corollary summarises our results obtained so far, and also shows termination. Corollary 6. If the algorithm NextClosures is started on a finite formal context K as input, then it terminates, and returns both the set of all formal concepts and the canonical base of K as output.</p><p>Proof. The second part of the statement is a direct consequence of Proposition 5. In the final state |M| the set L contains all formal implications P → P II where P is a pseudo-intent of K. In particular, L is the canonical implicational base. Furthermore, the set B contains all formal concepts of K.</p><p>Finally, the computation time between states k and k + 1 is finite, because there are only finitely many candidates of cardinality k + 1, and the computation of closures w.r.t. the operators L * and • II can be done in finite time. As there are exactly |M| states for a finite formal context, the algorithm must terminate.</p><p>One could ask whether there are formal contexts that do not allow for a speedup in the enumeration of all intents and pseudo-intents on parallel execution. This would happen for formal contexts whose intents and pseudo-intents are linearly ordered. However, this is impossible. Lemma 7. Let K = (G, M, I) be a non-empty clarified formal context. Then the set of its intents and pseudo-intents is not linearly ordered w.r.t. subset inclusion ⊆.</p><p>Proof. Assume that K := (G, M, I) with G := { g 1 , . . . , g n }, n &gt; 0, were a clarified formal context with intents and pseudo-intents P 1 P 2 . . . P . In particular, then also all object intents form a chain g I 1 g I 2 . . . g I n where n ≤ . Since K is attribute-clarified, it follows g I j+1 \ g I j = 1 for all j, and hence w.l.o.g. M = { m 1 , . . . , m n }, and g i I m j iff i ≥ j. Eventually, K is isomorphic to the ordinal scale K n := ({ 1, . . . , n } , { 1, . . . , n } , ≤). It is easily verified that the pseudointents of K n are either ∅, or of the form { m, n } where m &lt; n − 1, a contradiction.</p><p>Consequently, there is no formal context with a linearly ordered set of intents and pseudo-intents. Hence, a parallel enumeration of the intents and pseudo-intents will always result in a speedup compared to a sequential enumeration.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Benchmarks</head><p>The purpose of this section is to show that our parallel algorithm for computing the canonical base indeed yields a speedup, both qualitatively and quantitatively, compared to the classical algorithm based on NextClosure <ref type="bibr" target="#b6">[6]</ref>. To this end, we shall present the running times of our algorithm when applied to selected data-sets and with a varying number of available CPUs. We shall see that, up to a certain limit, the running time of our algorithms decreases proportional to the number of available CPUs. Furthermore, we shall also show that this speedup is not only qualitative, but indeed yields a real speedup compared to the original sequential algorithm for computing the canonical base.</p><p>The presented algorithm NextClosures has been integrated into Concept Explorer FX <ref type="bibr" target="#b8">[8]</ref>. The implementation is a straight-forward adaption of Algorithm 1 to the programming language Java 8, and heavily uses the new Stream API and thread-safe concurrent collection classes (like ConcurrentHashMap). As we have described before, the processing of all candidates on the current cardinality level can be done in parallel, i.e. for each of them a separate thread is started that executes the necessary operations for lines 4 to 11 in Algorithm 1. Furthermore, as the candidates on the same level cannot affect each other, no communication between the threads is needed. More specifically, we have seen that the decision whether a candidate is an intent or a pseudo-intent is independent of all other sets with the same or a higher cardinality.</p><p>The formal contexts used for the benchmarks<ref type="foot" target="#foot_0">1</ref> are listed in Figure <ref type="figure" target="#fig_0">1</ref>, and are either obtained from the FCA Data Repository [4] ( a to d , and f to p ), randomly created ( q to n ), or created from experimental results ( e ). For each of them we executed the implementation at least three times, and recorded the average computation times.</p><p>The experiments were performed on the following two systems:</p><p>Taurus (1 Node of Bull HPC-Cluster, ZIH) CPUs: 2x Intel Xeon E5-2690 with eight cores @ 2.9 GHz, RAM: 32 GB Atlas (1 Node of Megware PC-Farm, ZIH)</p><p>CPUs: 4x AMD Opteron 6274 with sixteen cores @ 2.2 GHz, RAM: 64 GB</p><p>The benchmark results are displayed in Figure <ref type="figure">2</ref>. The charts have both axes logarithmically scaled, to emphasise the correlation between the execution times and the number of available CPUs. We can see that the computation time is almost inverse linear proportional to the number of available CPUs, provided that the context is large enough. In this case there are enough candidates on each cardinality level for the computation to be done in parallel. However, we shall note that there are some cases where the computation times increase when utilising all available CPUs. We are currently not aware of an explanation for this exception -maybe it is due to some technical details of the platforms or the operation systems, e.g. some background tasks that are executed during the benchmark, or overhead caused by thread maintenance. Note that we did not have full system access during the experiments, but could only execute tasks by scheduling them in a batch system. Additionally, for some of the test contexts only benchmarks for a large number of CPUs could be performed, due to the time limitations on the test systems.</p><p>Furthermore, we have performed the same benchmark with small-sized contexts having at most 15 attributes. The computation times were far below one second. We have noticed that there is a certain number of available CPUs for which there is no </p><formula xml:id="formula_16">f f f f f f f f g g g g g g g g g g g g h h h h h h h h h h h h i i i i i i j j j j j j j j j k k k</formula><p>l l l m p q q q q q q q q q q q q r r r  further increase in speed of the algorithm. This happens when the number of candidates is smaller than the available CPUs. Finally, we compared our two implementations of NextClosure and NextClosures when only one CPU is utilised. The comparison was performed on a notebook with Intel Core i7-3720QM CPU with four cores @ 2.6 GHz and 8 GB RAM. The results are shown in Figure <ref type="figure">3</ref>. We conclude that our proposed algorithm is on average as fast as NextClosure on the test contexts. The computation time ratio is between 1  3 and 3, depending on the specific context. Low or no speedups are expected for formal contexts where NextClosure does not have to do backtracking, and hence can find the next intent or pseudo-intent immediately.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Number of CPUs</head><note type="other">Computation Time</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>In this paper we have introduced the parallel algorithm NextClosures for the computation of the canonical base. It constructs the lattice of all intents and pseudo-intents of a given formal context from bottom to top in a level-wise order w.r.t. increasing cardinality. As the elements in a certain level of this lattice can be computed independently, they can also be enumerated in parallel, thus yielding a parallel algorithm for computing the canonical base. Indeed, first benchmarks show that NextClosures allows for a speedup that is proportional to the number of available CPUs, up to a certain natural limit. Furthermore, we have compared its performance to the well-known algorithm NextClosure when utilising only one CPU. It could be observed that on average our algorithm (on one CPU) has the same performance as NextClosure, at least for the test contexts.</p><p>So far we have only introduced the core idea of the algorithm, but it should be clear that certain extensions are possible. For example, it is not hard to see that our parallel algorithm can be extended to also handle background knowledge given as a set of implications or as a constraint closure operator <ref type="bibr" target="#b0">[1]</ref>. In order to yield attribute exploration, our algorithm can also be extended to include expert interaction for exploration of the canonical base of partially known contexts, much in the same way as the classical algorithm. One benefit is the possibility to have several experts answering questions in parallel. Another advantage is the constant increase in the difficulty of the questions (i.e. premise cardi-nality), compared to the questions posed by default attribute exploration in lectic order. Those extensions have not been presented here due to a lack of space, but we shall present them in a future publication. Meanwhile, they can be found in a technical report <ref type="bibr" target="#b9">[9]</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>1 .</head><label>1</label><figDesc>Set k := 0, C := { ∅ }, B := ∅, and L := ∅. 2. In parallel: For each candidate set C ∈ C with cardinality |C| = k determine whether it is L * -closed. If not, then add its L * -closure to the candidate set C, and go to Step 5. 3. If C is an intent of K, then add the formal concept (C I , C) to B. Otherwise C must be a pseudo-intent, and thus we add the formal implication C → C II to the set L, and add the formal concept (C I , C II ) to the set B. 4. For each observed intent C II , add all its upper neighbours C II ∪ { m } where m / ∈ C II to the candidate set C. 5. Wait until all candidates of cardinality k have been processed. If k &lt; |M|, then increase the candidate cardinality k by 1, and go to Step 2. Otherwise return B and L.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>an intent or a pseudo-intent between B and X that strictly contains B, and hence X = (B ∪ { m }) K * . 3. Consider an attribute m ∈ Y \ X. Then X ∪ { m } ⊆ Y , and thus X (X ∪ { m }) K * ⊆ Y , as Y is already closed. Therefore, (X ∪ { m }) K * = Y .</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 2 .Fig. 3 .</head><label>23</label><figDesc>Fig. 2. Benchmark Results (left: Atlas, right: Taurus)</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Readers who are interested in the test contexts should send a mail to one of the authors.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1">NextClosures: Parallel Computation of the Canonical Base</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_2">NextClosures: Parallel Computation of the Base</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgements The authors thank Bernhard Ganter for helpful hints on optimal formal contexts for his NextClosure algorithm. Furthermore, the authors thank the anonymous reviewers for their constructive comments.</p><p>The benchmarks were performed on servers at the Institute of Theoretical Computer Science, and the Centre for Information Services and High Performance Computing (ZIH) at TU Dresden. We thank them for their generous allocations of computer time.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Formal Concept Analysis with Constraints by Closure Operators</title>
		<author>
			<persName><forename type="first">Radim</forename><surname>Belohlávek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vilém</forename><surname>Vychodil</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conceptual Structures: Inspiration and Application, 14th International Conference on Conceptual Structures, ICCS 2006</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">Henrik</forename><surname>Schärfe</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Pascal</forename><surname>Hitzler</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Peter</forename><surname>Øhrstrøm</surname></persName>
		</editor>
		<meeting><address><addrLine>Aalborg, Denmark</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">July 16-21, 2006. 2006</date>
			<biblScope unit="volume">4068</biblScope>
			<biblScope unit="page" from="131" to="143" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Hardness of Enumerating Pseudo-Intents in the Lectic Order</title>
		<author>
			<persName><forename type="first">Felix</forename><surname>Distel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 8th Interational Conference of Formal Concept Analysis</title>
				<meeting>the 8th Interational Conference of Formal Concept Analysis<address><addrLine>Agadir, Morocco</addrLine></address></meeting>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title/>
		<author>
			<persName><forename type="first">Ed</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>Léonard Kwuida and Barış Sertkaya</editor>
		<imprint>
			<biblScope unit="volume">5986</biblScope>
			<biblScope unit="page" from="124" to="137" />
			<date type="published" when="2010">2010</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">On the Complexity of Enumerating Pseudo-Intents</title>
		<author>
			<persName><forename type="first">Felix</forename><surname>Distel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Barış</forename><surname>Sertkaya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Discrete Applied Mathematics</title>
		<imprint>
			<biblScope unit="volume">159</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="450" to="466" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<ptr target="http://www.fcarepository.com" />
		<title level="m">FCA Data Repository</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A Parallel Algorithm to Generate Formal Concepts for Large Data</title>
		<author>
			<persName><forename type="first">Huaiguo</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Engelbert</forename><forename type="middle">Mephu</forename><surname>Nguifo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second International Conference on Formal Concept Analysis</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">W</forename><surname>Peter</surname></persName>
		</editor>
		<editor>
			<persName><surname>Eklund</surname></persName>
		</editor>
		<meeting>the Second International Conference on Formal Concept Analysis<address><addrLine>Sydney, Australia</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2004">2961. 2004</date>
			<biblScope unit="page" from="394" to="401" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Two Basic Algorithms in Concept Analysis</title>
		<author>
			<persName><forename type="first">Bernhard</forename><surname>Ganter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 8th Interational Conference of Formal Concept Analysis</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">Léonard</forename><surname>Kwuida</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Barış</forename><surname>Sertkaya</surname></persName>
		</editor>
		<meeting>the 8th Interational Conference of Formal Concept Analysis<address><addrLine>Agadir, Morocco</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">5986</biblScope>
			<biblScope unit="page" from="312" to="340" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Formal Concept Analysis: Mathematical Foundations</title>
		<author>
			<persName><forename type="first">Bernhard</forename><surname>Ganter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rudolf</forename><surname>Wille</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Concept Explorer FX. Software for Formal Concept Analysis</title>
		<author>
			<persName><forename type="first">Francesco</forename><surname>Kriegel</surname></persName>
		</author>
		<idno>2010- 2015</idno>
		<ptr target="https://github.com/francesco-kriegel/conexp-fx" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">NextClosures -Parallel Exploration of Constrained Closure Operators</title>
		<author>
			<persName><forename type="first">Francesco</forename><surname>Kriegel</surname></persName>
		</author>
		<idno>15-01</idno>
	</analytic>
	<monogr>
		<title level="m">Chair for Automata Theory</title>
				<meeting><address><addrLine>TU Dresden</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note type="report_type">LTCS-Report</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">On the Intractability of Computing the Duquenne-Guigues Base</title>
		<author>
			<persName><forename type="first">Sergei</forename><forename type="middle">O</forename><surname>Kuznetsov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Universal Computer Science</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="927" to="933" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Fast Concept Analysis</title>
		<author>
			<persName><forename type="first">Christian</forename><surname>Lindig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working with Conceptual Structures -Contributions to ICCS 2000</title>
				<editor>
			<persName><forename type="first">Gerhard</forename><surname>Stumme</surname></persName>
		</editor>
		<meeting><address><addrLine>Aachen, Germany</addrLine></address></meeting>
		<imprint>
			<publisher>Shaker Verlag</publisher>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="152" to="161" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Attribute-Incremental Construction of the Canonical Implication Basis</title>
		<author>
			<persName><forename type="first">Sergei</forename><forename type="middle">A</forename><surname>Obiedkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vincent</forename><surname>Duquenne</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Annals of Mathematics and Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="page" from="77" to="99" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Parallel Recursive Algorithm for FCA</title>
		<author>
			<persName><forename type="first">Vilém</forename><surname>Vychodil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Petr</forename><surname>Krajča</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jan</forename><surname>Outrata</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 6th International Conference on Concept Lattices and Their Applications</title>
				<editor>
			<persName><forename type="first">Radim</forename><surname>Bělohlávek</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Sergej</surname></persName>
		</editor>
		<editor>
			<persName><surname>Kuznetsov</surname></persName>
		</editor>
		<editor>
			<persName><surname>Palacký University</surname></persName>
		</editor>
		<meeting>the 6th International Conference on Concept Lattices and Their Applications<address><addrLine>Olomouc</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="71" to="82" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
