Recognizing Pseudo-Intents is coNP-complete Mikhail A. Babin, Sergei O. Kuznetsov State University Higher School of Economics, Myasnitskaya 20, 101000 Moscow, Russia {mikleb@yandex.ru, skuznetsov@hse.ru} Abstract. The problem of recognizing whether a subset of attributes is a pseudo-intent is shown to be coNP-hard, which together with the previous results means that this problem is coNP-complete. Recogniz- ing an essential intent is shown to be NP-complete and recognizing the lectically largest pseudo-intent is shown to be coNP-hard. 1 Introduction One of the long-standing complexity problems in FCA is the problem of checking whether a given set of attributes is a pseudo-intent. In [4, 5] it was proved that this problem lies in the class co-NP, however, the question whether the problem is complete in this class was still open. In [6] there was a conjecture that this problem is transhyp-hard [6], which would not mean that this problem is co-NP- complete. In this paper we prove a stronger statement, namely that the problem is coNP-hard, which, together with the result from [4, 5] means that the problem is coNP-complete. This main result has several consequences concerning essen- tial intents and lectically largest pseudo-intent. Recognizing an essential intent is NP-complete and recognizing the lectically largest pseudo-intent is coNP-hard. The rest of the paper is organized as follows: In the second section we introduce the main definitions and give a precise problem statement. In the third section we give a proof of the main result. In the fourth section we discuss the complex- ity of some related problems, namely that of recognizing essential intents and generating pseudo-intents in the order dual to the lectic one. 2 Definitions Let G and M be sets, called the set of objects and attributes, respectively. Let I be a relation I ⊆ G × M between objects and attributes: for g ∈ G, m ∈ M, gIm holds iff the object g has the attribute m. The triple K = (G, M, I) is called a (formal) context. If A ⊆ G, B ⊆ M are arbitrary subsets, then the Galois connection is given by the following derivation operators: A′ = {m ∈ M | gIm ∀g ∈ A} B ′ = {g ∈ G | gIm ∀m ∈ B} Recognizing Pseudo-Intents is coNP-complete 295 The pair (A, B), where A ⊆ G, B ⊆ M , A′ = B, and B ′ = A is called a (formal) concept (of the context K) with extent A and intent B (in this case we have also A′′ = A and B ′′ = B). The set of attributes B is implied by the set of attributes A, or the implication A → B holds, if all objects from G that have all attributes from the set A also have all attributes from the set B, i.e. A′ ⊆ B ′ . The operation (·)′′ is a closure operator [1], i.e. it is idempotent (X ′′′′ = X ′′ ), extensive (X ⊆ X ′′ ), and monotone (X ⊆ Y ⇒ X ′′ ⊆ Y ′′ ). Sets A ⊆ G, B ⊆ M are called closed if A′′ = A and B ′′ = A. Obviously, extents and intents are closed sets. Implications obey the Armstrong rules: A→B A → B, B ∪ C → D , , . A→A A∪C →B A∪C →D A minimal (in the number of implications) subset of implications, from which all other implications of a context can be deduced by means of the Armstrong rules was characterized in [3]. This subset is called the Duquenne Guigues or stem base in the literature. The premises of the implications of the stem base can be given by pseudo-intents(see e.g.[1]): a set P ⊆ M is a pseudo-intent if P 6= P ′′ and Q′′ ⊂ P for every pseudo-intent Q ⊂ P . For a closed set A ⊆ M such that P * A the intersection A ∩ P is also closed (see [1]). A set Q ⊆ M is called quasi-closed (quasi-intent) if for any R ⊆ Q one has R′′ ⊆ Q or R′′ = Q′′ . For example closed sets are quasi-closed. For a quasi-closed set Q it holds that (Q ∩ C)′′ = (Q ∩ C) for any closed set C such that Q * C. Another definition of a pseudo-intent, which we will use in this paper, is very close to that from [3]: a nonclosed set P ⊆ M is a pseudo-intent iff P is quasi-closed and Q′′ ⊆ P for any quasi-closed set Q ⊂ P (see [4, 5]). A set A ⊆ M is called an essential intent (essential-closed subset of attributes) iff there is a pseudo-intent P ⊆ M such that P ′′ = A. Let G = {g1 , . . . , gn } and M = {m1 , . . . , mn } be sets with same cardinality. Then the context K = (G, M, I6= ) is called contranominal scale, where I6= = G × M \ {(g1 , m1 ), . . . , (gn , mn )}. The contranominal scale has the following property, which we will use later: for any H ⊆ M one has H ′′ = H and H ′ = {gi | mi ∈ / H, 1 ≤ i ≤ n}. 3 Recognition of pseudo-intents Here we discuss the algorithmic complexity of the problem of pseudo-intent recognition. Problem: Pseudo-intent recognition (PI) INPUT: A context K = (G, M, I) and a set P ⊆ M . QUESTION: Is P a pseudo-intent of K? In order to prove coN P -hardness of PI we consider the most well-known N P -complete problem, namely CNF satisfiability. 296 Mikhail A. Babin, Sergei O. Kuznetsov Problem: CNF satisfiability (SAT) INPUT: A boolean CNF formula f (x1 , . . . , xn ) = C1 ∧ . . . ∧ Ck QUESTION: Is f satisfiable? Consider an arbitrary CNF instance C1 , . . . , Ck with variables x1 , . . . , xn , where Ci = (li1 ∨. . .∨lini ) (1 ≤ i ≤ k) are clauses and lij ∈ {x1 , . . . , xn }∪{¬x1 , . . . , ¬xn } (1 ≤ i ≤ k, 1 ≤ j ≤ ni ) are some variables or their negations, called literals. From this instance we construct a context K = (G, M, I). Define M = {p, C1 , . . . , Ck , x1 , ¬x1 , . . . , xn , ¬xn , e} G = {gx1 , g¬x1 , . . . , gxn , g¬xn , gCX , gC , gl1 , . . . , gln } x ¬x ∪ {glij | 1 ≤ i ≤ n, 1 ≤ j ≤ n} ∪ {gli j | 1 ≤ i ≤ n, 1 ≤ j ≤ n} For 1 ≤ i ≤ n define the set Li = {x1 , ¬x1 , . . . , xn , ¬xn } \ {xi , ¬xi }. In addition x ¬x for 1 ≤ i ≤ n and 1 ≤ j ≤ n define the sets Li j = Li \{xj } and Li j = Li \{¬xj }. Now we are ready to define I. The relation I is given by two parts. The first part is I ∩ {gx1 , g¬x1 , . . . , gxn , g¬xn } × M = C ∪ I6= C = {(gxi , Cj ) | xi ∈ / Cj , 1 ≤ i ≤ n, 1 ≤ j ≤ k} ∪ {(g¬xi , Cj ) | ¬xi ∈ / Cj , 1 ≤ i ≤ n, 1 ≤ j ≤ k} I6= = {gx1 , g¬x1 , . . . , gxn , g¬xn } × {x1 , ¬x1 , . . . , xn , ¬xn } , \ {(gx1 , x1 ), (g¬x1 , ¬x1 ), . . . , (gxn , xn ), (g¬xn , ¬xn )} hence Ci′ ∩ {gx1 , g¬x1 , . . . , gxn , g¬xn } is the set of objects which correspond to literals not included in Ci (1 ≤ i ≤ k), and I6= is the relation of the contra- nominal scale. The rest of I is given by the object intents ′ gCX = M \ {p, e} ′ gC = {p} ∪ {C1 , . . . , Ck } gl′i = {p} ∪ Li , 1 ≤ i ≤ n x ′ x glij = {p} ∪ Li j , 1 ≤ i ≤ n, 1 ≤ j ≤ n ¬x ′ ¬xj gli j = {p} ∪ Li , 1 ≤ i ≤ n, 1 ≤ j ≤ n Note that there are some objects (e.g. gl1 and glx11 ) with the same intents, but this does not matter. For any A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } that satisfies A ∩ {xi , ¬xi } = 6 ∅ for 1 ≤ i ≤ n, we define truth assignment φA :  true,  if xi ∈ / A and ¬xi ∈ A; φA (xi ) = f alse, if ¬xi ∈ / A and xi ∈ A;  f alse, otherwise (xi ∈ A and ¬xi ∈ A);  Recognizing Pseudo-Intents is coNP-complete 297 p C1 C2 · · · Ck x1 ¬x1 · · · xn ¬xn e gx1 g¬x1 .. . C I6= gxn g¬xn gCX × ······ × × ············ × gC × × ······ × gl1 × L1 glx11 × Lx1 1 gl¬x 1 1 × L¬x 1 1 .. .. .. . . . glx1n × Lx1 n gl¬x 1 n × L¬x 1 n .. .. .. . . . gln × Ln glxn1 × Lxn1 gl¬x n 1 × L¬x n 1 .. .. .. . . . glxnn × Lxnn gl¬x n n × L¬x n n Table 1. Context K. 298 Mikhail A. Babin, Sergei O. Kuznetsov In the case xi ∈ / A and ¬xi ∈ / A for some 1 ≤ i ≤ n, φA is undefined. Note that for A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } the truth assignment φA is (correctly) defined iff A * Li for every 1 ≤ i ≤ n. Symmetrically for a truth assignment φ define the set Aφ = {¬xi | φ(xi ) = true} ∪ {xi | φ(xi ) = f alse}. Before proving coN P -hardness of PI we prove some auxiliary statements. The following lemma is crucial for the reduction from SAT to the complement of PI. Lemma 1 If a subset A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } is closed and A * gl′i for any 1 ≤ i ≤ n then φA is defined and φA satisfies f i.e f (φA ) = true. Conversely, if a truth assignment φ satisfies f , then Aφ is closed and Aφ * gl′i for every 1 ≤ i ≤ n. Proof. Let A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } and A is not a subset of any gl′i (1 ≤ i ≤ n), then A * Li for any 1 ≤ i ≤ n and hence (by definition of φA ) φA is defined. Since I6= is the relation of contranominal scale and any intent can be expressed as the intersection of object intents, we have A′ = {gxi | xi ∈ / A} ∪ {g¬xi | ¬xi ∈ / A}∪B, where B ⊆ G−{gx1 , g¬x1 , . . . , gxn , g¬xn }. Since A * Li for any 1 ≤ i ≤ n x ¬x we also have A * Li j and A * Li j for every 1 ≤ i ≤ n and 1 ≤ j ≤ n. Thus B = {gCX }. Suppose A′′ = A. Then A ∩ {C1 , . . . , Ck } = ∅ and hence for every 1 ≤ i ≤ k there is some g ∈ A′ that Ci ∈ / g ′ . Since Ci ∈ gCX ′ and A′ = {gxi | xi ∈ / A} ∪ {g¬xi | ¬xi ∈ / A} ∪ {gCX } the latter means that g ∈ {gxi | xi ∈ / A} ∪ {g¬xi | ¬xi ∈ / A}. Then, by definition of the relation C, there is a literal xj ∈ / A or ¬xj ∈ / A that belongs to Ci . Thus φA satisfies Ci for every 1 ≤ i ≤ k. Now let φ be a truth assignment and f (φ) = true. Obviously, Aφ * gl′i for every 1 ≤ i ≤ n (by definition of Aφ ). Then A′φ = {gxi | xi ∈ / Aφ } ∪ {g¬xi | ¬xi ∈ / Aφ } ∪ {gCX }. Note that A′′φ ∩ {x1 , ¬x1 , . . . , xn , ¬xn } = Aφ ∩ ′ {x1 , ¬x1 , . . . , xn , ¬xn } and Aφ ⊆ gCX . Hence Aφ is closed iff Aφ ∩{C1 , . . . , Ck } = ∅. Assume that Ci ∈ Aφ ∩ {C1 , . . . , Ck } for some 1 ≤ i ≤ k. This means that Ci ∈ gx′ j and Ci ∈ g¬x ′ r for every xj ∈/ Aφ and ¬xr ∈ / Aφ . But then by definition of the relation C the clause Ci is not satisfied by φ. 2 Proposition 2 For any 1 ≤ i ≤ n if A ⊆ gl′i then A is closed. xj ′ ¬xj ′ Proof. Let A ⊆ gl′i and p ∈ A. Then A′′ = xj ∈A / gli ∩ T T / gl i ¬xj ∈A = A. In / A we can express A′′ as A′′ = (A ∪ {p})′′ ∩ gCX the case p ∈ ′ = A. 2 Now we are ready to prove coN P -hardness of PI. Theorem 3 PI is coN P -hard. Proof. We reduce CNF to the complement of PI. Given a CNF instance f = C1 ∧ . . . ∧ Ck , we construct a context K like that described above (see Table 1). We take P = M \ {e} as a set for deciding whether it a pseudo-intent. Hence the corresponding PI instance is (K, P ) and we prove that f is satisfiable if and Recognizing Pseudo-Intents is coNP-complete 299 only if P is not a pseudo-intent of K. Without loss of generality we will assume that for every 1 ≤ i ≤ n the clause xi ∨ ¬xi is included in f (it does not affect satisfiability). (⇒) Let f be satisfiable and let φ be the truth assignment that satisfies f (φ) = true. Consider the set Q = {p} ∪ Aφ . As we will see later Q is a pseudo- intent, Q ⊂ P and Q′′ = M * P , and hence P is not a pseudo-intent. First let us check that Q′′ = M . Since p ∈ Q we should test only that Q * g ′ , x ¬x where g ∈ {gC , gl1 , . . . , gln } ∪ {glij | 1 ≤ i ≤ n, 1 ≤ j ≤ n} ∪ {gli j | 1 ≤ i ≤ ′ n, 1 ≤ j ≤ n}. Clearly Q * gC because Aφ is not empty. By Lemma 1 for x ′ ¬x ′ any 1 ≤ i ≤ n, Aφ * gl′i , therefore Q * gli . Hence Q * glij and Q * gli j (1 ≤ i ≤ n, 1 ≤ j ≤ n). In order to prove that Q is a pseudo-intent we show that any proper subset of Q is closed. Consider an arbitrary set A ⊂ Q. If p ∈ A then (since A 6= Q) there is a literal l ∈ {x1 , ¬x1 , . . . , xn , ¬xn } such that l ∈ Q and l ∈ / A. Thus by proposition 2 the subset A is closed. Now let p ∈ / A then if A = Q \ {p} = Aφ by lemma 1 the subset A is closed. If A 6= Q \ {p} then A ⊂ Aφ and by proposition 2 the subset A is closed. (⇐) Now let a pseudo-intent Q be a proper subset of P (i.e. Q ⊂ P ) and Q′′ * P . Then Q is not a subset of any object intent of K. Together with the fact of quasi-closedness of Q this implies that Q∩g ′ is closed for any g ∈ G. Note ′ ′ ′ that p ∈ Q since otherwise Q ⊆ gCX . Consider Q ∩ gC . Since Q ∩ gC is closed ′ ′ ′ ′ and p ∈ Q ∩ gC , there are only two possibilities: Q ∩ gC = p or Q ∩ gC = gC . ′ ′ ′ Assume Q ∩ gC = gC . Then Q = gC ∪ B, where B ⊂ {x1 , ¬x1 , . . . , xn , ¬xn } and ′ ′ B 6= ∅ (because Q 6= P and Q 6= gC ). Consider Q ∩ gCX = {C1 , . . . , Ck } ∪ B. This set must be closed by quasi-closedness of Q. Note that {C1 , . . . , Ck } ∪ B * gl′i , for any 1 ≤ i ≤ n and {C1 , . . . , Ck } ∪ B * gC ′ (since B 6= ∅). Thus (Q ∩ gCX ) ⊆ {gx1 , g¬x1 , . . . , gxn , g¬xn }. Since (Q ∩ gCX )′ 6= ∅ there is a literal ′ ′ ′ ′ l ∈ {x1 , ¬x1 , . . . , xn , ¬xn } such that gl ∈ (Q ∩ gCX )′ . Then, by definition of gl′ ′ and the fact that some clause Ci contains the literal l we get that Ci ∈ / Q ∩ gCX . ′ ′ Thus Q ∩ gC = p and Q \ {p} = Q ∩ gCX ⊆ {x1 , ¬x1 , . . . , xn , ¬xn }. Moreover, Q * gl′i for every 1 ≤ i ≤ n, hence φ = φQ\{p} is (correctly) defined. Since Q \ {p} is closed by lemma 1, the truth assignment φ satisfies f . 2 In [4] it was shown that P I ∈ coNP hence we obtain Corollary 1. PI is coN P -complete. 4 Recognizing essential intents and lectically largest pseudo-intents An important problem related to recognizing pseudo-intents is deciding whether a given set is the lectically largest pseudo-intent. Let M = {m1 , . . . , mn } be a finite set with linear order on it (m1 < · · · < mn ). For sets A ⊆ M and B ⊆ M we say that A lectically smaller than B (A < B, B is lectically larger than A) if ∃mi ∈ B \ A : A ∩ {mj ∈ M | j < i} = B ∩ {mj ∈ M | j < i}. It is not hard to see that the lectic order is a linear order on the subsets of M . 300 Mikhail A. Babin, Sergei O. Kuznetsov Problem: The lectically largest pseudo-intent (LLPI) INPUT: A context K = (G, M, I) with linear order on M and a set P ⊆ M . QUESTION: Is P the lectically largest pseudo-intent of K? Proposition 4 LLPI is coN P -hard. Proof. We reduce SAT to the complement of LLPI as in the proof of Theorem 3 . The linear order on M is: p < C1 < . . . < Ck < x1 < ¬x1 < . . . < xn < ¬xn < e. Since P = M \ {e} and M is closed, P is the lectically largest pseudo-intent iff P is a pseudo-intent. 2 Thus it is impossible to find the lectically largest pseudo-intent in polynomial time unless P = N P . In [8] it was shown that pseudo-intents cannot be enumerated with polyno- mial delay in the lectic order (unless P = N P ). Proposition 4 shows that this also cannot be done in the dual order, i.e., the following corollary holds. Corollary. Pseudo-intents cannot be generated with polynomial delay in the order dual to the lectic one unless P = N P . Another problem related to the problem of recognizing pseudo-intents is that of recognizing essential intents. Problem: Essential intents recognition (EI) INPUT: A context K = (G, M, I) and a set A ⊆ M . QUESTION: Is A an essential intent of K? Proposition 5 EI is NP-complete. Proof. 1. NP-Hardness. We reduce SAT to EI, in the same way as in the reduction from SAT, to the complement of PI. Let us construct the context K2 = (G, M \ {e}, I), where G, M and I are the sets of objects, attributes and the relation of context K from the proof of Theorem 3 (see Table 1). Obviously, M \ {e} is an essential intent of K2 iff M \ {e} is not a pseudo-intent of K. 2. Membership in NP. The set A is an essential intent of the context K = (G, M, I) iff there is a pseudo-intent P ⊆ M such that P ′′ = A. Since a pseudo- intent is an inclusion-minimal quasi-closed set with the same closure (e.g. see [4]), a set A is an essential intent iff there is quasi-closed set Q ⊆ M such that Q′′ = A. Quasi-closedness can be tested in polynomial time (see [4]). Hence a nondeterministic guess for checking essential-intent A can be a quasi-closed set Q such that Q′′ = A. 2. Conclusion A long-standing complexity problem about the complexity of recognizing a pseudo- intent was solved. This problem was shown to be coNP-complete. This main Recognizing Pseudo-Intents is coNP-complete 301 result has several important consequences concerning essential intents and the lectically largest pseudo-intent. Recognizing an essential intent was shown to be NP-complete and recognizing the lectically largest pseudo-intent was shown to be coNP-hard. The latter fact means that pseudo-intents cannot be generated with polynomial delay in the order dual to the lectic one unless P = N P . Whether pseudo-intents cannot be generated with polynomial delay (unless P = N P ) in arbitrary order still remains an important open problem. References 1. B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Foundations; Springer, Berlin (1999). 2. M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness; Freeman, San Francisco (1979). 3. J. L. Guigues and V. Duquenne, Familles minimales d’implications informatives re- sultant d’un tableau de données binaries, Mathématiques, Informatique et Sciences Humaines; 95:5-18, (1986). 4. S. O. Kuznetsov and S. A. Obiedkov, Counting pseudo-intents and #P- completeness. In R. Missaoui and J. Schmid, Eds, ICFCA 2006,Lecture Notes in Computer Science, vol. 3874, 306-308, Springer-Verlag, (2006). 5. S. O. Kuznetsov and S. A. Obiedkov, Some decision and counting problems of the duquenne-guigues basis of implications, Discrete Applied Mathematics, 156(11):1994-2003, (2008). 6. B. Sertkaya, Towards the complexity of recognizing pseudo-intents, In F. Dau and S. Rudolph, Eds, ICCS 2009, Lecture Notes in Computer Science, vol. 5662, 284– 292, Springer-Verlag, (2009) 7. B. Sertkaya, Some computational problems related to pseudo-intents, In S. Ferre and S. Rudolph, Eds, ICFCA 2009, Lecture Notes in Computer Science, vol. 5662, 284–292, Springer-Verlag, (2009) 8. F. Distel, Hardness of enumerating pseudo-intents in the lectic order. In L. Kwuida and B. Sertkaya, Eds, ICFCA 2010, Lecture Notes in Artificial Intelligence, vol. 5986, 124–137, Springer-Verlag, (2010)