=Paper= {{Paper |id=Vol-2123/paper20 |storemode=property |title=Pattern Setups and Their Completions |pdfUrl=https://ceur-ws.org/Vol-2123/paper20.pdf |volume=Vol-2123 |authors=Aimene Belfodil, Sergei O. Kuznetsov,Mehdi Kaytoue |dblpUrl=https://dblp.org/rec/conf/cla/BelfodilKK18 }} ==Pattern Setups and Their Completions== https://ceur-ws.org/Vol-2123/paper20.pdf
           Pattern Setups and Their Completions

          Aimene Belfodil1,4 , Sergei O. Kuznetsov2,3 , and Mehdi Kaytoue1,5
      1
        Univ Lyon, INSA Lyon, CNRS, LIRIS UMR 5205, F-69621, LYON, France
      2
        National Research University Higher School of Economics, Moscow, Russia
  3
    St. Petersburg Department of Steklov Mathematical Institute of Russian Academy
                                     of Sciences, Russia
      4
         Mobile Devices Ingénierie, 100 Avenue Stalingrad, 94800, Villejuif, France
            5
              Infologic, 99 avenue de Lyon, 26500 Bourg-Lès-Valence, France
                firstname.surname@insa-lyon.fr, skuznetsov@hse.ru



          Abstract. Pattern mining consists in discovering interesting patterns
          in data. For that, algorithms rely on smart techniques for enumerating
          the pattern search space and, generally, focus on compressed collections
          of patterns (e.g. closed patterns), to avoid redundancy. Formal Concept
          Analysis offers a generic framework, called pattern structures, to formal-
          ize many types of patterns, such as closed itemsets, intervals, graph and
          sequence sets. Additionally, it provides generic algorithms to enumerate
          all closed patterns and only them. The only condition is that the pattern
          space is a meet-semilattice, which, unfortunately does not always hold
          (e.g., for sequential and graph patterns). In this paper, we discuss pattern
          setups, a tool that models pattern search spaces relying only on posets.
          Subsequently, we revisit some techniques transforming pattern setups to
          a pattern structure using set of patterns, namely completions, and we
          state a necessary and sufficient condition for a pattern setup completion
          using antichains to be a pattern structure.


  1    Introduction
  Pattern mining is the task of finding interesting patterns in a predefined search
  space. A generic tool for defining such a pattern search space is pattern struc-
  tures [10,14] based on Formal Concept Analysis (FCA) [9]. Indeed, itemsets,
  interval [12], convex [3], partition [2] pattern spaces among others can be mod-
  eled within the pattern structure framework. However, since pattern structures
  rely on meet-semilattices, some pattern spaces that are only posets cannot be
  “directly” defined using such a tool.
      Fig. 1 depicts a dataset of 4 objects described by attribute "value" and la-
  beled positive or negative. Consider the task of finding “good” rules d → + in this
  dataset with d a description given by attribute value. Rather than considering
  the usual meet-semilattice of intervals [12]; descriptions d are restrained to inter-
  vals of the form (v] and [v) or singleton {v} ⊆ R (see Fig. 1 - right). Descriptions
  form a poset (D, ⊇) but not a meet-semilattice. For instance, the set {{3}, {5}}
  has not a meet, since lower bounds of {{3}, {5}} has two maximal elements w.r.t.
  ⊇ (i.e. [3) and (5]). Hence, the triple (G, (D, ⊇), δ) with δ : g 7→ {value(g)} does

c paper author(s), 2018. Proceedings volume published and copyrighted by its editors.
  Paper published in Dmitry I. Ignatov, Lhouari Nourine (Eds.): CLA 2018, pp.
  243–253, Department of Computer Science, Palacký University Olomouc, 2018.
  Copying permitted only for private and academic purposes.
244       Aimene Belfodil, Sergei O. Kuznetsov, and Mehdi Kaytoue


      object value label                                Description language:
      g1     1     −            g1 g2 g3         g4
                                                             value = v | v ∈ R
      g2     3     +
      g3     5     +             1   3   5       9           value ≤ v
      g4     9     −                                         value ≥ v

Fig. 1. Dataset (left), its representation in R with black dots representing positive
objects (center) and the description language (right).

not form a pattern structure [10] since {δ(g2 ), δ(g3 )} = {{3}, {5}} has not a
meet. It does form actually a pattern setup [16] which is based on a poset.
    Description spaces like the one in Fig. 1 are numerous. For instance, sequen-
tial patterns [1] ordered by “is subsequence of” do not form a meet-semilattice
[19] [7]. Sequential meet-semilattice in FCA [6] [7] refers usually to set of se-
quences rather than to the poset of sequences. Same holds for the graph
meet-semilattice from [13]. In general, the base pattern setup is transformed to
a pattern structure using sets of descriptions thus providing a richer space (lan-
guage). Such transformations are naturally called completions. For example, in
Fig. 1, restriction {value ≥ 3, value ≤ 5} is equivalent to 3 ≤ value ≤ 5 and
does not belong to the base description language.
    Understanding properties of pattern setups independently from their com-
pletions is fundamental for answering many practical questions. For instance,
consider the question “What are the best descriptions covering all positive in-
stances?”. If better stands for more relevant than as in relevance theory
[11], the answer will be the two best incomparable rules value ≥ 3 → + and
value ≤ 5 → + rather than only one in the completion 3 ≤ value ≤ 5 → +.
    In this paper, after recalling basic facts on pattern structures in Section 2,
we discuss the properties of a pattern setup in Section 3. Subsequently, Section 4
revisits pattern setup completions and states a necessary and sufficient condition
on a pattern setup to augment it to a pattern structure using antichains of
patterns [13,6,7]. We coin pattern setups verifying such a condition pattern hyper-
structures. The basic order-theoretic concepts we use in this paper are well-
presented in [18] and [9]. Table 1 resumes the used notations.

                                Table 1. Notations

   ℘(E)   Powerset of E    {S | S ⊆ E}
   f [S]  Image of S by f {f (s) | s ∈ S} ⊆ F with S ⊆ E and f : E → F
   (D, v) Poset            partially ordered set. Below S ⊆ D is a subset
   ↓S     Down closure     {d ∈ D | (∃s ∈ S) d v s}. For s ∈ S : ↓ {s} , ↓ s
   ↑S     Up closure       {d ∈ D | (∃s ∈ S) s v d}. For s ∈ S : ↑ {s} , ↑ s
   S`     Lower bounds     {d ∈ D | (∀s ∈ S) d v s}. For s ∈ S : {s}` , s`
   S u
          Upper bounds     {d ∈ D | (∀s ∈ S) s v d}. For s ∈ S : {s}u , su
   min(S) Minimal elements {s ∈ S | d ∈ S, d v s ⇒ d = s}
   max(S) Maximal elements {s ∈ S | d ∈ S, s v d ⇒ d = s}
   d                                                                       d
   FS     Meet             greatest element of S ` if exists (d ∈ S ` ⇔ dFv S)
                                                  u                  u
      S   Join             smallest element of S if exists (d ∈ S ⇔ S v d)
                                     Pattern Setups and Their Completions      245


2   Basic Definitions
In general, pattern search spaces are formalized by partially ordered sets. In this
paper, we call description space (language) or pattern space (language)
any poset D := (D, v). Elements of D are called descriptions or patterns. For
any c, d ∈ D, c v d is read “c subsumes d” or “c is less restrictive than d”.
    Pattern structures is an extension of the basic FCA model [10]. Objects in
a pattern structure have descriptions in a meet-semilattice. Pattern setups [16]
generalize pattern structures by demanding only a partial order on descriptions.
Definition 1. A pattern setup is a triple P = (G, D, δ) where G is a set of
objects, D is a description space and δ : G → D is a mapping that takes each
object g ∈ G to a description δ(g) ∈ D. We say that an object g ∈ G realizes a
description d ∈ D or d covers g if d v δ(g). A pattern setup P is said to be a
pattern structure if every subset of δ[G] = {δ(g) | g ∈ G} has a meet in D.
   Note that a necessary and sufficient condition on D to have a pattern struc-
ture (G, D, v) on any finite set of objects G and any mapping δ : G → D is that
D is a meet-semilattice with a top element (D has its greatest element).
   In pattern structures, two derivation operators map posets (℘(G), ⊆) and
(D, v) to each other. They are usually both denoted by (·) . For more clarity
we use ext and int notation.
Definition 2. The extent operator of a pattern setup, denoted by ext, takes
each description d ∈ D to the set of objects in G realizing it. In the case the
pattern setup is a pattern structure, the intent operator, denoted by int, takes
a subset of dobjects A ⊆ G to the largest common description in D covering
them, with     representing the meet in D, we have:
                                                                    l
     ext : D → ℘(G), d 7→ {g ∈ G | d v δ(g)} int : ℘(G) → D, A 7→      δ[A]

The size |ext(d)| is called the support of d and is denoted by support(d).
    In a pattern structure P, the pair of operators (ext, int) forms a Galois
connection between posets (℘(G), ⊆) and (D, v). Thus, ext ◦ int T   and int ◦ ext
form closure operators on the two posets and (ext[D], ⊆) is a -structure
(i.e. a complete lattice). Moreover, another complete lattice isomorphic to poset
(ext[D], ⊆) is used. It is called pattern concept lattice and is denoted by
B(P) = (B(P), ≤). Elements of B(P) are called pattern concepts and are
of the form (A, dA ) with A = ext(dA ) and dA = int(A). Pattern concepts are
ordered as follows: (A, dA ) ≤ (B, dB ) ⇔ A ⊆ B ⇔ dB v dA .

3   Pattern Setups
Whereas basic FCA considers only binary data (descriptions are sets), pattern
structures can handle more complex languages. Nevertheless, it fails at consid-
ering some types of patterns, which do not make a semi-lattice. Before diving
into details, consider the following example.
246     Aimene Belfodil, Sergei O. Kuznetsov, and Mehdi Kaytoue


                     {g1 , g2 , g3 }   {g1 , g2 , g4 }
G δ(·)
                                                         Fig. 2. The table (left) represents
g1 c → a → b                                {g2 , g4 }   the pattern setup considered in ex-
g2 c → b → b → a
                                                         ample 1. The diagram (right) rep-
g3 a                     {g1 }         {g2 }    {g4 }
                                                         resents the poset (ext[D], ⊆) (not
g4 b → b → c
                                        ∅                making a closure system).



Example 1. Let be the dataset with the set of objects G = {g1 , g2 , g3 , g4 } in Fig. 2
(left). The description space (D, v) contains all non empty sequences that can
be built using items in {a, b, c}. It is ordered by the relationship “is substring
of” (i.e. is subsequence of -without gaps-) denoted by v. Such an order does
not form a meet-semilattice. Indeed, consider sequences δ(g1 ) = c → a → b and
δ(g2 ) = c → b → b → a; clearly, their common lower bounds {δ(g1 ), δ(g2 )}` =
{a, b, c} form an antichain and hence {δ(g1 ), δ(g2 )} has no meet. It follows that
(G, D, δ) is a pattern setup but not a pattern structure.
Important Remark. Unless otherwise mentioned, in this paper P = (G, D, δ)
denotes a pattern setup where G is a non-empty finite set. Hence, theorems
and propositions in this paper are guaranteed to be valid only if G is finite.
    It is clear that the extent operator (see Definition 2) can be used in the general
case of pattern setups since it requires only the order. However, the intent of a
set may not exist. We define here the cover operator.
Definition 3. We call cover operator, denoted by cov, the operator that takes
each subset A ⊆ G to the set of common descriptions in D covering them:
         cov : ℘(G) → ℘(D), A 7→ δ[A]` = {d ∈ D | (∀g ∈ A) d v δ(g)}
    Note that ext and cov are order reversing mappings in a pattern setup:
(∀A, B ⊆ G)A ⊆ B ⇒ cov(B) ⊆ cov(A) and (∀c, d ∈ D)c v d ⇒ ext(d) ⊆ ext(c).
    Some sets of objects A ⊆ G can have no common descriptions covering them.
In other words, cov(A) = ∅. Definition 4 develops a categorization of sets A ⊆ G.
Definition 4. Let A be a subset of G. A is said to be an extent iff ∃d ∈ D |
A = ext(d). A is said to be coverable iff ∃d ∈ D : A ⊆ ext(d), otherwise it
is said to be non coverable. The set of all possible extents is denoted by Pext
and given by Pext = ext[D]. The set of coverable sets is given by ↓ Pext .
Example 2. Consider Fig. 2. Set A = {g2 , g4 } has cover cov(A) = {b, b → b, c}.
Moreover, A is an extent since ext(b → b) = A. Set B = {g3 } is not an extent
since the only covering description is “a” for which ext(a) = {g1 , g2 , g3 } (B is
coverable). Set C = {g3 , g4 } is non coverable, since g3 and g4 do not have
common symbols.
Remark 1. As in pattern structures [10], pattern implication and object
implication can be defined thanks to extent and cover operators, respectively.
For c, d ∈ D, the pattern implication c → d holds if ext(c) ⊆ ext(d). Con-
versely, for A, B ⊆ G, the object implication A → B holds if cov(A) ⊆ cov(B).
                                       Pattern Setups and Their Completions         247


3.1   On Maximal Descriptions in a Cover
In pattern structures, intent operator associates the largest common description
to a set of objects. However, cover operator takes a set of objects to the set of all
descriptions covering them. Such a set can be huge and needs to be “compressed”.
Since we have no guarantees of the existence of the largest common description,
a reasonable suggestion would be to consider the maximal ones.
Definition 5. The set of maximal descriptions covering a subset A ⊆ G,
denoted by cov I (A), is given by:

                     cov I (A) = max(cov(A)) = max(δ[A]` )

Remark 2. Maximal descriptions are sometimes called support-closed descrip-
tions [5]. A description d is said to be support-closed if ∀c ∈ D such that for
d v c and c 6= d we have support(c) < support(d) (i.e. ext(c) ( ext(d)).

3.2   On Upper Approximations of a Set
In pattern structures a description d ∈ D covering the set A will certainly
cover ext(int(A)). This observation is important when it comes to search for
positive or negative hypotheses in a labeled dataset [13] or in other words classi-
fication association rules [15]. Indeed, if G + represents the whole set of positive
objects in the dataset G and if we want exactly one rule which covers all positive
instances, the best one (in terms of relevance [11]) will be the rule int(G + ) → +
with confidence |G + |/|ext(int(G + ))|. For pattern structures ext(int(A)) is the
closure of A, in Rough Set Theory [17], ext(int(A)) can be seen as the upper
approximation of A in G. To have a similar notion for pattern setups, we define
what we call upper-approximation extents.
Definition 6. The set of upper-approximation extents of a subset A ⊆ G,
denoted by A, is given by: A = min({E ∈ ext[D] | A ⊆ E}) = min(↑ A∩ext[D]).
Example 3. Extent A = {g2 , g4 } in Fig. 2 has a unique upper-approximation (i.e.
A = {{g2 , g4 }}). Coverable subset B = {g1 , g2 } has multiple upper approxima-
tions B = {{g1 , g2 , g3 }, {g1 , g2 , g4 }}. Non coverable subset C = {g3 , g4 } has no
upper approximations (i.e. C = ∅). Proposition 1 formalizes these observations.
Proposition 1. For any A ⊆ G, we have:

 A ∈ Pext ⇔ A = {A} (1) A ∈↓ Pext ⇔ A 6= ∅ (2) A = min(ext[cov(A)]) (3)

Proof. We demonstrate each property independently:
 1. A ∈ ext[D] ⇔ A = {A}: For (⇐), A ⊆ ext[D], thus A ∈ ext[D]. For (⇒), we
    have A ∈↑ A ∩ ext[D] thus A = min(↑ A ∩ ext[D]) = {A}.
 2. A ∈↓ ext[D] ⇔ A 6= ∅: For (⇒), we have A ∈↓ ext[D], that is ∃B ∈ ext[D]
    s.t. A ⊆ B (i.e B ∈↑ A). Thus ↑ A∩ext[D] is not empty and so does A (since
    ↑ A ∩ ext[D] is finite). For (⇐), we have A 6= ∅, that is ↑ A ∩ ext[D] 6= ∅ thus
    ∃B ∈ ext[D] such that A ⊆ B thus A ∈↓ ext[D] =↓ Pext .
248     Aimene Belfodil, Sergei O. Kuznetsov, and Mehdi Kaytoue


 3. We show ext[cov(A)] =↑ A ∩ ext[D]. Let B ⊆ G. We have B ∈ ext[cov(A)]
    ⇔ ∃d ∈ cov(A) s.t. B = ext(d) ⇔ ∃d ∈ D ∀g ∈ A : d v δ(g) ⇔ ∃d ∈ D
    A ⊆ ext(d) = B ⇔ B ∈ ext[D] ∩ ↑ A. This concludes the proof.       t
                                                                       u

   For a setTA ⊆ G s.t. A 6= ∅, any description covering A contains at least
elements of A, the intersection of upper approximations of A.
                                        T       T
Proposition 2. I : ℘(G) → ℘(G), A 7→ A = {E ∈ ext[D] | A ⊆ E} is a
closure operator on (℘(G), ⊆). Moreover, we have I(A) = A for any A ⊆ G.

Proof. In fact, this proposition is a direct application of the following Lemma.
                                                    V
Lemma 1. Let (P, ≤) be a complete lattice with V its meet and let E ⊆ P be a
subset. The mapping φE : P → P, p 7→ φE (p) = {e ∈ E | p ≤ e} is a closure
operator on (P, ≤) and φE [P ] is a meet-structure in poset (P, ≤).
We prove Lemma 1. (1) φE is extensive. Trivially p ∈ {e ∈ E | p ≤ e}` for
p ∈ P . Since the meet is the greatest element, we conclude: p ≤ φE (p). (2) φE
is monotone. Let p1 , p2 ∈ P s.t. p1 ≤ p2 , we have {e ∈ E | p2 ≤ e} ⊆ {e ∈
E | p1 ≤ e} thus {e ∈ E | p1 ≤ e}` ⊆ {e ∈ E | p2 ≤ e}` . We conclude that
φE (p1 ) ≤ φE (p2 ). (3) φE is idempotent. Let us show {e ∈ E | φE (p) ≤ e} =
{e ∈ E | p ≤ e}. Inclusion ⊆ is verified since p ≤ φE (p). Inclusion ⊇ comes from
the definition since φE (p) is a lower bound of {e ∈ E | p ≤ e}, then for any
e ∈ E such that p ≤ e we have φE (p) ≤ e. The idempotence is straightforward.
    Proposition 2 is a corollary of Lemma 1 since (℘(G), ⊆) is a complete lattice
in which the meet is the set intersection. Indeed, I , φext[D] is a closure operator
on (℘(G), ⊆). I(A) = A comes directly from the previous idempotence proof. t      u

Proposition 3. For any g ∈ G we have {g} = {ext(δ(g))}. That is, any single-
ton set {g} ⊆ G is coverable and has a unique upper-approximation.

Proof. Let g ∈ G, we have δ(g) ∈ cov({g}), thus ext(δ(g)) ∈ ext[cov({g})]. Let
us show that ext(δ(g)) is a lower bound of ext[cov({g})]. We have by definition:
cov({g}) = {d ∈ D | d v δ(g)}. Thus, ∀d ∈ cov({g}) : d v δ(g). Since ext is an
order reversing operator, we obtain: ∀A ∈ ext[cov({g})] : ext(δ(g)) ⊆ A. Thus
ext(δ(g)) is the smallest element of ext[cov({g})]. That is {g} = {ext(δ(g))}. t
                                                                               u

3.3   Linking Upper Approximations and Maximal Descriptions
Now that we have both upper approximations and maximal covering descrip-
tions, a judicious question would be:
Question 1. What is the relationship between cov I (A) and A for A ⊆ G?
Before diving into more details, consider the example below.

Example 4. Consider Fig. 2. Extent A = {g2 , g4 } has cov(A) = {b, b → b, c}.
Hence, cov I (A) = {b → b, c}. Therefore, ext[cov I (A)] = {{g2 , g4 }, {g1 , g2 , g4 }}.
Besides, A = {{g2 , g4 }}. Thus, counter-intuitively, A is not equal to ext[cov I (A)].
                                     Pattern Setups and Their Completions        249


    Furthermore, when D is an infinite poset, the set cov I (A) might not “hold”
all the information contained in cov(A). That is to say: “knowing only maximal
covering descriptions does not allow us to deduce the set of covering ones”.
Speaking formally, cov I (A) does not necessarily verify cov(A) =↓ cov I (A).


4   Pattern Hyper-Structures and Completions
There is a standard construction how an ordered set of descriptions can be
turned in a semilattice of descriptions. For example, see [13], where a semilattice
on sets of graphs with labeled vertices and edges is constructed from the order
given by subgraph isomorphism relation. Let D = (D, v) be a poset and let
A(D) be the set of its antichains. It is possible to define a partial order by
letting S1 , S2 ∈ A(D) [4]: S1 ≤ S2 iff ∀s1 ∈ S1 ∃s2 ∈ S2 : s1 v s2 (i.e. S1 ⊆↓ S2 ).
Please note that this relation does not define an order on ℘(D). It does define
only a pre-order since anti-symmetry does not hold (see [8]).
    In fact, when D is finite, (A(D), ≤) is a distributive lattice where the meet
and the join are given by S1 ∧ S2 = max(↓ S1 ∩ ↓ S2 ) and S1 ∨ S2 = max(S1 ∪
S2 ), respectively. Thus, one can build a pattern structure with (A(D), ≤) which
embeds the pattern setup P = (G, D, δ) for any finite set of objects G. We call
such a pattern structure the antichain completion of P.

Definition 7. Let P = (G, D, δ) be a pattern setup, the antichain comple-
tion of P is the pattern setup denoted by PO and given by:

                      PO = (G, (A(D), ≤), σ : g 7→ {δ(g)})

Consider the following question for the more general case where D is infinite:
Question 2. What is a necessary and sufficient condition on P that makes PO
a pattern structure?
We have seen that the finiteness of D is a sufficient condition (i.e. (A(D), ≤) is
a distributive lattice), but not a necessary one. Definition 1 requires for PO that
for any A ⊆ G we have σ[A] = {{δ(g)} | g ∈ A} has a meet in (A(D), ≤). This
condition is verified iff pattern setup P satisfies the following condition:

            ∀A ⊆ G : cov(A) = δ[A]` = ↓ max(δ[A]` ) =↓ cov I (A)                 (1)

and then cov I (A) is the meet of σ[A] in (A(D), ≤) (see Theorem 2).
Definition 8. A pattern setup (G, D, δ) is said to be a pattern hyper-structure
if condition (1) holds.
   Note that the term hyper in pattern hyper-structure comes from the
notion of hyper-lattices briefly introduced in [19]. Note also that graphs ordered
by subgraph isomorphism relation [13] induces a pattern hyper-structure but not
a pattern structure. Same remark holds for sequential patterns [6,7] under the
assumption of the existence of the largest sequence 1 subsumed by all sequences.
250     Aimene Belfodil, Sergei O. Kuznetsov, and Mehdi Kaytoue


Remark 3. When D is infinite, poset (A(D), ≤) remains to be a join-semilattice
but not necessarily has finite meets (as it is when D is finite). It was shown in
[4] that a necessary and sufficient condition on (A(D), ≤) to have finite
meets is as follows:

                 ∀S1 , S2 ∈ A(D) ∃S ∈ A(D) :↓ S1 ∩ ↓ S2 =↓ S                   (2)

In this case, S represents the meet (infimum) of {S1 , S2 } in A(D); moreover,
(A(D), ≤) becomes a distributive lattice. Requiring (A(D), ≤) to have a meet is
a sufficient condition, but still not necessary for the antichain completion
to be a pattern structure. In fact, condition (2) implies condition (1) (case G
finite). Moreover, an example from [4] shows that condition (2) does not always
hold even if (D, v) is a meet-semilattice. Example 5 shows that condition (1)
does not always hold.

Example 5. Consider the pattern setup (G, (D, ⊇), δ) with G = {g1 , g2 }, D =
{[a, b] ⊆ R | a, b ∈ R}\[1, 3], δ(g1 ) = {1} and δ(g2 ) = {3}. The considered
pattern setup does not verify condition (1) since δ[G]` is clearly not empty while
max(δ[G]` ]) = ∅. (G, (D, ⊇), δ) is thus not a pattern hyper-structure.

Theorem 1. For any pattern hyper-structure (G, D, δ), we have:
                                               
                  ∀A ⊆ G : A = min ext cov I (A)

Proof. Let A ∈ ℘(G). If cov(A) = ∅, then the property trivially holds since
cov I (A) = cov(A) = ∅. Let A ∈↓ ext[D] (i.e. cov(A) 6= ∅). We need to show the
following equivalent proposition min(ext[cov(A)]) = min(ext[max(cov(A))]).
    We start by showing min(ext[cov(A)]) ⊆ min(ext[max(cov(A))]). Let B ∈
min(ext[cov(A)]). Since B ∈ ext[cov(A)], ∃d ∈ cov(A) s.t. B = ext(d). Since
cov(A) =↓ max(cov(A)) ((G, D, δ) is a pattern hyper-structure), there exists c ∈
max(cov(A)) s.t. d v c. Since extent is an order revering mapping, C = ext(c) ⊆
ext(d) = B. Supposing that C ∈ ext[max(cov(A))] s.t. C ( B contradicts the
fact that B ∈ min(ext[cov(A)]) since C ( B and C ∈ ext[cov(A)]. It follows that
C = B ∈ ext[max(cov(A))]. Again, supposing that B 6∈ min(ext[max(cov(A)])
leads to a contradiction (∃D ∈ ext[max(cov(A)] ⊆ ext[cov(A)] s.t. D ( B while
B ∈ min(ext[cov(A)])). We conclude that B ∈ min(ext[max(cov(A))]).
    It remains to show that min(ext[cov(A)]) ⊇ min(ext[max(cov(A))]). Sup-
pose the converse: ∃E ∈ min(ext[max(cov(A))]) such that E 6∈ min(ext[cov(A)]),
we have E ∈ ext[max(cov(A))] ⊆ ext[cov(A)]. Since E 6∈ min(ext[cov(A)]) and
ext[cov(A)] is finite, we obtain that ∃F ∈ min(ext[cov(A)]) such that F ( E.
The first inclusion implies F ∈ min(ext[max(cov(A))]). This is a contradiction,
since at the same time F ( E and E ∈ min(ext[max(cov(A))]).
    We conclude that A = min(ext[cov(A)]) = min(ext[cov I (A)]).               t
                                                                               u

   Theorem 1 answers question 1. It says that in a pattern hyper-structure
rather than considering all covering descriptions to compute A, it is sufficient to
consider only maximal covering ones. Theorem 2 answers question 2:
                                     Pattern Setups and Their Completions      251


Theorem 2. Let P = (G, D, δ) be a pattern setup, the antichain completion
of P is a pattern structure iff P is a pattern hyper-structure. Operators
extO and intO denote extent and intent of PO and are given by
                              \
       (∀S ∈ A(D)) extO (S) =    ext[S] (∀A ⊆ G) intO (A) = cov I (A)
                          T
Moreover, we have PO                                       O
                   ext = { S | S ⊆ Pext }. Note that G ∈ Pext .

Proof. Let us show that if P is a pattern hyper-structure then PO is a pat-
tern structure. PO is a pattern structure iff every subset of σ[G] has a meet in
(A(D), ≤). For A ⊆ G we have σ[A]` = {S ∈ A(D) | (∀g ∈ A) S ⊆↓ δ(g)} =
{S ∈ A(D) | S ⊆ δ[A]` } where δ[A]` and σ[A]` denote respectively the lower
bounds T of δ[A] w.r.t. v and the lower bounds of of σ[A] w.r.t. ≤ (recall that
δ[A]` = g∈A ↓ δ(g)). In this proof ↓ refers to the down-closure related to v.
  – (⇒) Let A ⊆ G : δ[A]` =↓ max(δ[A]` ). Thus σ[A]` = {S ∈ A(D) | S ⊆↓
    max(δ[A]` )} = {S ∈ A(D) | S ≤ max(δ[A]` )}. Since max(δ[A]` ) ∈ A(D),
    so max(δ[A]` ) is the meet of σ[A] in A(D).
  – (⇐) PO is pattern structure is equivalent to say: ∀A ⊆ G, σ[A] has a meet
    M ∈ A(D). That is, for A ⊆ G: ∀S ∈ A(D) : S ⊆ δ[A]` ⇔ S ⊆↓ M .
    Particularly, for S = {d} with d ∈ D, we deduce that: ∀d ∈ δ[A]` : d ∈↓ M .
    Thus, δ[A]` ⊆↓ M . Moreover, since M ⊆ δ[A]` (M ∈ σ[A]` ) and ↓ is a
    closure operator on (℘(D), ⊆) we have by monotony ↓ M ⊆ δ[A]` ⊆↓ M . We
    conclude that δ[A]` =↓ M (note that ↓ δ[A]` = δ[A]` ). It follows that M =
    max(δ[A]` ) that is δ[A]` =↓ max(δ[A]` ). This concludes the equivalence.
    Let us now define intO and extO . The previous proof shown that for A ⊆ G
we have intO (A) = max(δ[A]` ) = cov I (A) (The meet of σ[A] is max(δ[A]` )).
For extO operator, let S ∈ A(D). We have: extO (S)T= {g ∈ G | ST≤ σ(g)} =
{g ∈ G | S ⊆↓ δ(g)} = {g ∈ GT| (∀d ∈ S) d v δ(g)} = d∈S ext(d) = ext[S].
    Let usTshow that PO                                             O
                        ext = { S | S ⊆ Pext }. By definition of ext , the property
 O
Pext ⊆ { S | S ⊆ Pext } holds. For the inverse inclusion, it is sufficient to show
that Pext ⊆ PO                O
               ext (since (Pext , ⊆) is closed under intersection). Let A ∈ Pext .
∃d ∈ D s.t. A = ext(d). Since {d} ∈TA(D), and extO ({d}) = ext(d) = A. We
conclude that A ∈ PO            O
                      ext and Pext = { S | S ⊆ Pext }.                            t
                                                                                  u
   There is a completion that transforms any pattern setup to a pattern struc-
ture. This completion relies on the Dedekind-MacNeille completion.
Definition
 T         9. The family of subsets of D given by DM (D) = {A` | A ⊆ D} is
a -structure and is called the Dedekind-MacNeille Completion of D.
Theorem 3. Let P = (G, D, δ) be a pattern setup. The Direct Completion
of P is the pattern structure denoted by PH and given by:
                     PH = (G, (DM (D), ⊆), γ : g 7→ ↓ δ(g))
Operators extH and intH denote extent and intent of PH and are given by
                            \
  (∀S ∈ DM (D)) extH (S) =    ext[S] (∀A ⊆ G) intH (A) = cov(A) = δ[A]`
                           T
Moreover, we have PHext = { S | S ⊆ Pext }.
252     Aimene Belfodil, Sergei O. Kuznetsov, and Mehdi Kaytoue


                               ({g1 , g2 , g3 , g4 }, ∅)

            ({g1 , g2 , g3 }, {a})                ({g1 , g2 , g4 }, {b, c})
                                                                              Fig. 3.
            ({g1 , g2 }, {a, b, c})               ({g2 , g4 }, {b → b, c}     Concept
                                                                              lattice
({g1 }, {c → a → b}) ({g2 }, {c → b → b → a}) ({g4 }, {b → b → c})            B(PO ).

                                      (∅, {1})


Proof. According to definition 9, (DM (D), ⊆) is a complete lattice closed under
intersection. Thus, the pattern setup PHTis a pattern structure.
                                H
                   T have int (A) = g∈A ↓ δ(g)
    By definition, we                                      = δ[A]` = cov(A) since
               H                                      H
the meet in P is . For the extent operator ext . Let S ∈ DM         T (D), we have
extH (S) = {g ∈ G | S ⊆↓ δ(g)} = {g ∈ G | (∀d ∈ S) d v δ(g)} = ext[S].
                               T
    Let us show
              T  that PH                                          H
                        ext = { S | S ⊆ Pext }. Thanks to ext definition, prop-
       O
erty Pext ⊆ { S | S ⊆ Pext } holds. For the inverse inclusion, it is sufficient to
show that Pext ⊆ PH             H
                   ext (since (Pext , ⊆) is closed under intersection). Let A ∈ Pext ,
∃d ∈ D such that A = ext(d). We have extH ({d}` ) = extH (↓ d) = {g ∈ G |↓
                                                                              H
         T = {g ∈ G | d v δ(g)} = ext(d) = A. We conclude that A ∈ Pext and
d ⊆↓ δ(g)}
  H
Pext = { S | S ⊆ Pext }, which completes the proof.                                t
                                                                                   u
Example 6. Fig. 3 depicts the concept lattice B(PO ) associated to the antichain
completion of the pattern hyper-structure P considered in Fig. 2 (i.e., the de-
scription space is augmented with the top element 1). One can see that there
are two new (underlined) extents {g1 , g2 } and {g1 , g2 , g3 , g4 } in PO ext \Pext . For
instance, consider the intent of {g1 , g2 } in the completion, each pattern d has
extent ext(d) ) {g1 , g2 }. Extent {g1 , g2 , g3 , g4 } is non coverable in P and thus
intO ({g1 , g2 , g3 , g4 }) = max(cov({g1 , g2 , g3 , g4 })) = max(∅) = ∅.


5     Conclusion
In this paper, we have developed a better understanding of pattern setups, a
framework that models pattern spaces relying only on a poset. Next, we studied
the usual transformation of pattern setups to pattern structures using antichains.
We have shown that such a completion does not always produce a pattern struc-
ture unless the pattern setup is a pattern hyper-structure. Finally, we have shown
that a natural completion of a pattern setup to a pattern structure exists thanks
to the Dedekind-MacNeille completion. This work paves the way to answer an
important question: How to enumerate extents of a pattern setup without “visit-
ing” the whole set of extents of its associated completion to a pattern structure?

Acknowledgments
This work has been partially supported the Association Nationale Recherche
Technologie (ANRT) French program. Sections 1, 2 and 3.1 were written by
                                       Pattern Setups and Their Completions          253


Sergei O. Kuznetsov supported by the Russian Science Foundation under grant
17-11-01276 and performed at St. Petersburg Department of Steklov Mathemat-
ical Institute of Russian Academy of Sciences, Russia.


References
 1. Agrawal, R., Srikant, R.: Mining sequential patterns. In: Data Engineering. pp.
    3–14. IEEE (1995)
 2. Baixeries, J., Kaytoue, M., Napoli, A.: Characterizing functional dependencies in
    formal concept analysis with pattern structures. Ann. Math. Artif. Intell. 72(1-2),
    129–149 (2014)
 3. Belfodil, A., Kuznetsov, S.O., Robardet, C., Kaytoue, M.: Mining convex polygon
    patterns with formal concept analysis. In: IJCAI. pp. 1425–1432 (2017)
 4. Boldi, P., Vigna, S.: On the lattice of antichains of finite intervals. CoRR
    abs/1510.03675 (2016), https://arxiv.org/abs/1510.03675
 5. Boley, M., Horváth, T., Poigné, A., Wrobel, S.: Listing closed sets of strongly ac-
    cessible set systems with applications to data mining. Theor. Comput. Sci. 411(3),
    691–700 (2010)
 6. Buzmakov, A., Egho, E., Jay, N., Kuznetsov, S.O., Napoli, A., Raı̈ssi, C.: On
    mining complex sequential data by means of FCA and pattern structures. Int. J.
    General Systems 45(2), 135–159 (2016)
 7. Codocedo, V., Bosc, G., Kaytoue, M., Boulicaut, J., Napoli, A.: A proposition for
    sequence mining using pattern structures. In: ICFCA. LNCS (10308). 106–121.
    (2017)
 8. Crampton, J., Loizou, G.: The completion of a poset in a lattice of antichains.
    International Mathematical Journal 1(3), 223–238 (2001)
 9. Ganter, B., Wille, R.: Formal Concept Analysis. Springer (1999)
10. Ganter, B., Kuznetsov, S.O.: Pattern structures and their projections. In: ICCS.
    LNCS (2120). 129–142. (2001)
11. Garriga, G.C., Kralj, P., Lavrac, N.: Closed sets for labeled data. Journal of Ma-
    chine Learning Research 9, 559–580 (2008)
12. Kaytoue, M., Kuznetsov, S.O., Napoli, A.: Revisiting Numerical Pattern Mining
    with Formal Concept Analysis. In: IJCAI. pp. 1342–1347 (2011)
13. Kuznetsov, S.O.: Learning of simple conceptual graphs from positive and negative
    examples. In: PKDD. LNCS (1704). 384–391., vol. 1704, pp. 384–391 (1999)
14. Kuznetsov, S.O.: Pattern structures for analyzing complex data. In: RSFDGrC.
    LNCS (5908). 33–44. (2009)
15. Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rule mining.
    In: KDD. pp. 80–86 (1998)
16. Lumpe, L., Schmidt, S.E.: Pattern structures and their morphisms. In: CLA. CEUR
    Workshop Proceedings, vol. 1466, pp. 171–179 (2015)
17. Pawlak, Z.: Rough sets. International Journal of Parallel Programming 11(5), 341–
    356 (1982)
18. Roman, S.: Lattices and Ordered Sets. Springer New York (2008)
19. Zaki, M.J.: SPADE: an efficient algorithm for mining frequent sequences. Machine
    Learning 42(1/2), 31–60 (2001)