Recognizing Pseudo-Intents is coNP-complete

                     Mikhail A. Babin, Sergei O. Kuznetsov

                   State University Higher School of Economics,
                     Myasnitskaya 20, 101000 Moscow, Russia
                    {mikleb@yandex.ru, skuznetsov@hse.ru}


      Abstract. The problem of recognizing whether a subset of attributes
      is a pseudo-intent is shown to be coNP-hard, which together with the
      previous results means that this problem is coNP-complete. Recogniz-
      ing an essential intent is shown to be NP-complete and recognizing the
      lectically largest pseudo-intent is shown to be coNP-hard.


1    Introduction

One of the long-standing complexity problems in FCA is the problem of checking
whether a given set of attributes is a pseudo-intent. In [4, 5] it was proved that
this problem lies in the class co-NP, however, the question whether the problem
is complete in this class was still open. In [6] there was a conjecture that this
problem is transhyp-hard [6], which would not mean that this problem is co-NP-
complete. In this paper we prove a stronger statement, namely that the problem
is coNP-hard, which, together with the result from [4, 5] means that the problem
is coNP-complete. This main result has several consequences concerning essen-
tial intents and lectically largest pseudo-intent. Recognizing an essential intent is
NP-complete and recognizing the lectically largest pseudo-intent is coNP-hard.
The rest of the paper is organized as follows: In the second section we introduce
the main definitions and give a precise problem statement. In the third section
we give a proof of the main result. In the fourth section we discuss the complex-
ity of some related problems, namely that of recognizing essential intents and
generating pseudo-intents in the order dual to the lectic one.


2    Definitions

Let G and M be sets, called the set of objects and attributes, respectively. Let I
be a relation I ⊆ G × M between objects and attributes: for g ∈ G, m ∈ M, gIm
holds iff the object g has the attribute m. The triple K = (G, M, I) is called
a (formal) context. If A ⊆ G, B ⊆ M are arbitrary subsets, then the Galois
connection is given by the following derivation operators:

                          A′ = {m ∈ M | gIm ∀g ∈ A}

                           B ′ = {g ∈ G | gIm ∀m ∈ B}
                                 Recognizing Pseudo-Intents is coNP-complete          295

    The pair (A, B), where A ⊆ G, B ⊆ M , A′ = B, and B ′ = A is called a
(formal) concept (of the context K) with extent A and intent B (in this case we
have also A′′ = A and B ′′ = B). The set of attributes B is implied by the set of
attributes A, or the implication A → B holds, if all objects from G that have all
attributes from the set A also have all attributes from the set B, i.e. A′ ⊆ B ′ .
    The operation (·)′′ is a closure operator [1], i.e. it is idempotent (X ′′′′ = X ′′ ),
extensive (X ⊆ X ′′ ), and monotone (X ⊆ Y ⇒ X ′′ ⊆ Y ′′ ). Sets A ⊆ G, B ⊆ M
are called closed if A′′ = A and B ′′ = A. Obviously, extents and intents are
closed sets.
    Implications obey the Armstrong rules:
                               A→B              A → B, B ∪ C → D
                          ,          ,                           .
                  A→A         A∪C →B               A∪C →D
A minimal (in the number of implications) subset of implications, from which
all other implications of a context can be deduced by means of the Armstrong
rules was characterized in [3]. This subset is called the Duquenne Guigues or
stem base in the literature. The premises of the implications of the stem base
can be given by pseudo-intents(see e.g.[1]): a set P ⊆ M is a pseudo-intent if
P 6= P ′′ and Q′′ ⊂ P for every pseudo-intent Q ⊂ P . For a closed set A ⊆ M
such that P * A the intersection A ∩ P is also closed (see [1]). A set Q ⊆ M is
called quasi-closed (quasi-intent) if for any R ⊆ Q one has R′′ ⊆ Q or R′′ = Q′′ .
For example closed sets are quasi-closed. For a quasi-closed set Q it holds that
(Q ∩ C)′′ = (Q ∩ C) for any closed set C such that Q * C. Another definition of
a pseudo-intent, which we will use in this paper, is very close to that from [3]:
a nonclosed set P ⊆ M is a pseudo-intent iff P is quasi-closed and Q′′ ⊆ P for
any quasi-closed set Q ⊂ P (see [4, 5]). A set A ⊆ M is called an essential intent
(essential-closed subset of attributes) iff there is a pseudo-intent P ⊆ M such
that P ′′ = A.
    Let G = {g1 , . . . , gn } and M = {m1 , . . . , mn } be sets with same cardinality.
Then the context K = (G, M, I6= ) is called contranominal scale, where I6= =
G × M \ {(g1 , m1 ), . . . , (gn , mn )}. The contranominal scale has the following
property, which we will use later: for any H ⊆ M one has H ′′ = H and H ′ =
{gi | mi ∈
         / H, 1 ≤ i ≤ n}.


3    Recognition of pseudo-intents
Here we discuss the algorithmic complexity of the problem of pseudo-intent
recognition.
Problem: Pseudo-intent recognition (PI)
INPUT: A context K = (G, M, I) and a set P ⊆ M .
QUESTION: Is P a pseudo-intent of K?
   In order to prove coN P -hardness of PI we consider the most well-known
N P -complete problem, namely CNF satisfiability.
296      Mikhail A. Babin, Sergei O. Kuznetsov


Problem: CNF satisfiability (SAT)
INPUT: A boolean CNF formula f (x1 , . . . , xn ) = C1 ∧ . . . ∧ Ck
QUESTION: Is f satisfiable?

Consider an arbitrary CNF instance C1 , . . . , Ck with variables x1 , . . . , xn , where
Ci = (li1 ∨. . .∨lini ) (1 ≤ i ≤ k) are clauses and lij ∈ {x1 , . . . , xn }∪{¬x1 , . . . , ¬xn }
(1 ≤ i ≤ k, 1 ≤ j ≤ ni ) are some variables or their negations, called literals.
From this instance we construct a context K = (G, M, I). Define

                        M = {p, C1 , . . . , Ck , x1 , ¬x1 , . . . , xn , ¬xn , e}

         G = {gx1 , g¬x1 , . . . , gxn , g¬xn , gCX , gC , gl1 , . . . , gln }
                  x                                         ¬x
            ∪ {glij | 1 ≤ i ≤ n, 1 ≤ j ≤ n} ∪ {gli j | 1 ≤ i ≤ n, 1 ≤ j ≤ n}
For 1 ≤ i ≤ n define the set Li = {x1 , ¬x1 , . . . , xn , ¬xn } \ {xi , ¬xi }. In addition
                                                    x                    ¬x
for 1 ≤ i ≤ n and 1 ≤ j ≤ n define the sets Li j = Li \{xj } and Li j = Li \{¬xj }.
    Now we are ready to define I. The relation I is given by two parts. The first
part is
                  I ∩ {gx1 , g¬x1 , . . . , gxn , g¬xn } × M = C ∪ I6=
                      C = {(gxi , Cj ) | xi ∈
                                            / Cj , 1 ≤ i ≤ n, 1 ≤ j ≤ k}
                       ∪ {(g¬xi , Cj ) | ¬xi ∈
                                             / Cj , 1 ≤ i ≤ n, 1 ≤ j ≤ k}
              I6= = {gx1 , g¬x1 , . . . , gxn , g¬xn } × {x1 , ¬x1 , . . . , xn , ¬xn }
                                                                                       ,
                  \ {(gx1 , x1 ), (g¬x1 , ¬x1 ), . . . , (gxn , xn ), (g¬xn , ¬xn )}
hence Ci′ ∩ {gx1 , g¬x1 , . . . , gxn , g¬xn } is the set of objects which correspond to
literals not included in Ci (1 ≤ i ≤ k), and I6= is the relation of the contra-
nominal scale. The rest of I is given by the object intents
                                          ′
                                         gCX = M \ {p, e}
                                     ′
                                    gC = {p} ∪ {C1 , . . . , Ck }
                                    gl′i = {p} ∪ Li , 1 ≤ i ≤ n
                            x ′               x
                           glij = {p} ∪ Li j , 1 ≤ i ≤ n, 1 ≤ j ≤ n
                           ¬x ′               ¬xj
                         gli j = {p} ∪ Li           , 1 ≤ i ≤ n, 1 ≤ j ≤ n
Note that there are some objects (e.g. gl1 and glx11 ) with the same intents, but
this does not matter.
    For any A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } that satisfies A ∩ {xi , ¬xi } =
                                                                              6 ∅ for 1 ≤
i ≤ n, we define truth assignment φA :
                        
                        true,
                                       if xi ∈
                                              / A and ¬xi ∈ A;
              φA (xi ) = f alse, if ¬xi ∈       / A and xi ∈ A;
                        
                         f alse, otherwise (xi ∈ A and ¬xi ∈ A);
                        
                Recognizing Pseudo-Intents is coNP-complete   297


            p C1 C2 · · · Ck x1 ¬x1 · · · xn ¬xn e
 gx1
g¬x1
..
 .                  C               I6=
gxn
g¬xn
gCX              × ······ ×   × ············ ×
 gC         × × ······ ×
 gl1        ×                       L1
glx11       ×                      Lx1 1
gl¬x  1
        1
            ×                      L¬x
                                    1
                                          1

..          ..                       ..
 .           .                        .
 glx1n      ×                      Lx1 n
gl¬x  1
        n
            ×                      L¬x
                                    1
                                          n

   ..        ..                       ..
    .         .                        .
 gln        ×                       Ln
glxn1       ×                      Lxn1
gl¬x  n
        1
            ×                      L¬x
                                    n
                                          1

..            ..                       ..
 .             .                        .
 glxnn      ×                      Lxnn
gl¬x  n
        n
            ×                      L¬x
                                    n
                                          n


                 Table 1. Context K.
298       Mikhail A. Babin, Sergei O. Kuznetsov

In the case xi ∈
               / A and ¬xi ∈       / A for some 1 ≤ i ≤ n, φA is undefined. Note that
for A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } the truth assignment φA is (correctly) defined iff
A * Li for every 1 ≤ i ≤ n.
    Symmetrically for a truth assignment φ define the set Aφ = {¬xi | φ(xi ) =
true} ∪ {xi | φ(xi ) = f alse}.
    Before proving coN P -hardness of PI we prove some auxiliary statements.
The following lemma is crucial for the reduction from SAT to the complement
of PI.
Lemma 1 If a subset A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } is closed and A * gl′i for any
1 ≤ i ≤ n then φA is defined and φA satisfies f i.e f (φA ) = true. Conversely,
if a truth assignment φ satisfies f , then Aφ is closed and Aφ * gl′i for every
1 ≤ i ≤ n.
Proof. Let A ⊆ {x1 , ¬x1 , . . . , xn , ¬xn } and A is not a subset of any gl′i (1 ≤ i ≤
n), then A * Li for any 1 ≤ i ≤ n and hence (by definition of φA ) φA is defined.
Since I6= is the relation of contranominal scale and any intent can be expressed
as the intersection of object intents, we have A′ = {gxi | xi ∈         / A} ∪ {g¬xi | ¬xi ∈    /
A}∪B, where B ⊆ G−{gx1 , g¬x1 , . . . , gxn , g¬xn }. Since A * Li for any 1 ≤ i ≤ n
                            x              ¬x
we also have A * Li j and A * Li j for every 1 ≤ i ≤ n and 1 ≤ j ≤ n. Thus
B = {gCX }.
    Suppose A′′ = A. Then A ∩ {C1 , . . . , Ck } = ∅ and hence for every 1 ≤ i ≤ k
there is some g ∈ A′ that Ci ∈           / g ′ . Since Ci ∈ gCX ′
                                                                    and A′ = {gxi | xi ∈        /
A} ∪ {g¬xi | ¬xi ∈       / A} ∪ {gCX } the latter means that g ∈ {gxi | xi ∈    / A} ∪ {g¬xi |
¬xi ∈ / A}. Then, by definition of the relation C, there is a literal xj ∈               / A or
¬xj ∈ / A that belongs to Ci . Thus φA satisfies Ci for every 1 ≤ i ≤ k.
    Now let φ be a truth assignment and f (φ) = true. Obviously, Aφ * gl′i
for every 1 ≤ i ≤ n (by definition of Aφ ). Then A′φ = {gxi | xi ∈                    / Aφ } ∪
{g¬xi | ¬xi ∈       / Aφ } ∪ {gCX }. Note that A′′φ ∩ {x1 , ¬x1 , . . . , xn , ¬xn } = Aφ ∩
                                           ′
{x1 , ¬x1 , . . . , xn , ¬xn } and Aφ ⊆ gCX     . Hence Aφ is closed iff Aφ ∩{C1 , . . . , Ck } =
∅. Assume that Ci ∈ Aφ ∩ {C1 , . . . , Ck } for some 1 ≤ i ≤ k. This means that
Ci ∈ gx′ j and Ci ∈ g¬x     ′
                              r
                                for every xj ∈/ Aφ and ¬xr ∈  / Aφ . But then by definition
of the relation C the clause Ci is not satisfied by φ.                                          2

Proposition 2 For any 1 ≤ i ≤ n if A ⊆ gl′i then A is closed.
                                                    xj ′           ¬xj ′
Proof. Let A ⊆ gl′i and p ∈ A. Then A′′ = xj ∈A / gli ∩
                                            T             T
                                                                / gl i
                                                            ¬xj ∈A       = A. In
           / A we can express A′′ as A′′ = (A ∪ {p})′′ ∩ gCX
the case p ∈                                              ′
                                                             = A.             2
      Now we are ready to prove coN P -hardness of PI.
Theorem 3 PI is coN P -hard.
Proof. We reduce CNF to the complement of PI. Given a CNF instance f =
C1 ∧ . . . ∧ Ck , we construct a context K like that described above (see Table 1).
We take P = M \ {e} as a set for deciding whether it a pseudo-intent. Hence
the corresponding PI instance is (K, P ) and we prove that f is satisfiable if and
                                      Recognizing Pseudo-Intents is coNP-complete                 299

only if P is not a pseudo-intent of K. Without loss of generality we will assume
that for every 1 ≤ i ≤ n the clause xi ∨ ¬xi is included in f (it does not affect
satisfiability).
    (⇒) Let f be satisfiable and let φ be the truth assignment that satisfies
f (φ) = true. Consider the set Q = {p} ∪ Aφ . As we will see later Q is a pseudo-
intent, Q ⊂ P and Q′′ = M * P , and hence P is not a pseudo-intent. First
let us check that Q′′ = M . Since p ∈ Q we should test only that Q * g ′ ,
                                          x                                       ¬x
where g ∈ {gC , gl1 , . . . , gln } ∪ {glij | 1 ≤ i ≤ n, 1 ≤ j ≤ n} ∪ {gli j | 1 ≤ i ≤
                                        ′
n, 1 ≤ j ≤ n}. Clearly Q * gC because Aφ is not empty. By Lemma 1 for
                                                                            x ′                  ¬x ′
any 1 ≤ i ≤ n, Aφ * gl′i , therefore Q * gli . Hence Q * glij and Q * gli j
(1 ≤ i ≤ n, 1 ≤ j ≤ n). In order to prove that Q is a pseudo-intent we show
that any proper subset of Q is closed. Consider an arbitrary set A ⊂ Q. If p ∈ A
then (since A 6= Q) there is a literal l ∈ {x1 , ¬x1 , . . . , xn , ¬xn } such that l ∈ Q
and l ∈ / A. Thus by proposition 2 the subset A is closed. Now let p ∈                     / A then
if A = Q \ {p} = Aφ by lemma 1 the subset A is closed. If A 6= Q \ {p} then
A ⊂ Aφ and by proposition 2 the subset A is closed.
    (⇐) Now let a pseudo-intent Q be a proper subset of P (i.e. Q ⊂ P ) and
Q′′ * P . Then Q is not a subset of any object intent of K. Together with the
fact of quasi-closedness of Q this implies that Q∩g ′ is closed for any g ∈ G. Note
                                            ′                      ′                    ′
that p ∈ Q since otherwise Q ⊆ gCX              . Consider Q ∩ gC      . Since Q ∩ gC       is closed
                   ′                                                   ′                    ′       ′
and p ∈ Q ∩ gC , there are only two possibilities: Q ∩ gC = p or Q ∩ gC                        = gC   .
                 ′        ′                 ′
Assume Q ∩ gC = gC . Then Q = gC ∪ B, where B ⊂ {x1 , ¬x1 , . . . , xn , ¬xn } and
                                              ′                      ′
B 6= ∅ (because Q 6= P and Q 6= gC              ). Consider Q ∩ gCX        = {C1 , . . . , Ck } ∪ B.
This set must be closed by quasi-closedness of Q. Note that {C1 , . . . , Ck } ∪
B * gl′i , for any 1 ≤ i ≤ n and {C1 , . . . , Ck } ∪ B * gC            ′
                                                                          (since B 6= ∅). Thus
(Q ∩ gCX ) ⊆ {gx1 , g¬x1 , . . . , gxn , g¬xn }. Since (Q ∩ gCX )′ 6= ∅ there is a literal
        ′    ′                                                   ′
                                                           ′
l ∈ {x1 , ¬x1 , . . . , xn , ¬xn } such that gl ∈ (Q ∩ gCX    )′ . Then, by definition of gl′
                                                                                                  ′
and the fact that some clause Ci contains the literal l we get that Ci ∈                / Q ∩ gCX     .
               ′                                   ′
Thus Q ∩ gC = p and Q \ {p} = Q ∩ gCX ⊆ {x1 , ¬x1 , . . . , xn , ¬xn }. Moreover,
Q * gl′i for every 1 ≤ i ≤ n, hence φ = φQ\{p} is (correctly) defined. Since
Q \ {p} is closed by lemma 1, the truth assignment φ satisfies f .                                   2
    In [4] it was shown that P I ∈ coNP hence we obtain
Corollary 1. PI is coN P -complete.


4     Recognizing essential intents and lectically largest
      pseudo-intents

An important problem related to recognizing pseudo-intents is deciding whether
a given set is the lectically largest pseudo-intent.
    Let M = {m1 , . . . , mn } be a finite set with linear order on it (m1 < · · · <
mn ). For sets A ⊆ M and B ⊆ M we say that A lectically smaller than B
(A < B, B is lectically larger than A) if ∃mi ∈ B \ A : A ∩ {mj ∈ M | j < i} =
B ∩ {mj ∈ M | j < i}. It is not hard to see that the lectic order is a linear order
on the subsets of M .
300    Mikhail A. Babin, Sergei O. Kuznetsov

Problem: The lectically largest pseudo-intent (LLPI)
INPUT: A context K = (G, M, I) with linear order on M and a set P ⊆ M .
QUESTION: Is P the lectically largest pseudo-intent of K?


Proposition 4 LLPI is coN P -hard.

Proof. We reduce SAT to the complement of LLPI as in the proof of Theorem 3 .
The linear order on M is: p < C1 < . . . < Ck < x1 < ¬x1 < . . . < xn < ¬xn < e.
Since P = M \ {e} and M is closed, P is the lectically largest pseudo-intent iff
P is a pseudo-intent.                                                         2
Thus it is impossible to find the lectically largest pseudo-intent in polynomial
time unless P = N P .
    In [8] it was shown that pseudo-intents cannot be enumerated with polyno-
mial delay in the lectic order (unless P = N P ). Proposition 4 shows that this
also cannot be done in the dual order, i.e., the following corollary holds.
Corollary. Pseudo-intents cannot be generated with polynomial delay in the
order dual to the lectic one unless P = N P .
Another problem related to the problem of recognizing pseudo-intents is that of
recognizing essential intents.
Problem: Essential intents recognition (EI)
INPUT: A context K = (G, M, I) and a set A ⊆ M .
QUESTION: Is A an essential intent of K?

Proposition 5 EI is NP-complete.

Proof. 1. NP-Hardness. We reduce SAT to EI, in the same way as in the
reduction from SAT, to the complement of PI. Let us construct the context
K2 = (G, M \ {e}, I), where G, M and I are the sets of objects, attributes and
the relation of context K from the proof of Theorem 3 (see Table 1). Obviously,
M \ {e} is an essential intent of K2 iff M \ {e} is not a pseudo-intent of K.
    2. Membership in NP. The set A is an essential intent of the context K =
(G, M, I) iff there is a pseudo-intent P ⊆ M such that P ′′ = A. Since a pseudo-
intent is an inclusion-minimal quasi-closed set with the same closure (e.g. see
[4]), a set A is an essential intent iff there is quasi-closed set Q ⊆ M such that
Q′′ = A. Quasi-closedness can be tested in polynomial time (see [4]). Hence a
nondeterministic guess for checking essential-intent A can be a quasi-closed set
Q such that Q′′ = A.                                                            2.


Conclusion

A long-standing complexity problem about the complexity of recognizing a pseudo-
intent was solved. This problem was shown to be coNP-complete. This main
                                Recognizing Pseudo-Intents is coNP-complete        301

result has several important consequences concerning essential intents and the
lectically largest pseudo-intent. Recognizing an essential intent was shown to be
NP-complete and recognizing the lectically largest pseudo-intent was shown to be
coNP-hard. The latter fact means that pseudo-intents cannot be generated with
polynomial delay in the order dual to the lectic one unless P = N P . Whether
pseudo-intents cannot be generated with polynomial delay (unless P = N P ) in
arbitrary order still remains an important open problem.


References
 1. B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Foundations;
    Springer, Berlin (1999).
 2. M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory
    of NP-Completeness; Freeman, San Francisco (1979).
 3. J. L. Guigues and V. Duquenne, Familles minimales d’implications informatives re-
    sultant d’un tableau de données binaries, Mathématiques, Informatique et Sciences
    Humaines; 95:5-18, (1986).
 4. S. O. Kuznetsov and S. A. Obiedkov, Counting pseudo-intents and #P-
    completeness. In R. Missaoui and J. Schmid, Eds, ICFCA 2006,Lecture Notes in
    Computer Science, vol. 3874, 306-308, Springer-Verlag, (2006).
 5. S. O. Kuznetsov and S. A. Obiedkov, Some decision and counting problems
    of the duquenne-guigues basis of implications, Discrete Applied Mathematics,
    156(11):1994-2003, (2008).
 6. B. Sertkaya, Towards the complexity of recognizing pseudo-intents, In F. Dau and
    S. Rudolph, Eds, ICCS 2009, Lecture Notes in Computer Science, vol. 5662, 284–
    292, Springer-Verlag, (2009)
 7. B. Sertkaya, Some computational problems related to pseudo-intents, In S. Ferre
    and S. Rudolph, Eds, ICFCA 2009, Lecture Notes in Computer Science, vol. 5662,
    284–292, Springer-Verlag, (2009)
 8. F. Distel, Hardness of enumerating pseudo-intents in the lectic order. In L. Kwuida
    and B. Sertkaya, Eds, ICFCA 2010, Lecture Notes in Artificial Intelligence, vol.
    5986, 124–137, Springer-Verlag, (2010)