Generalized decision directed acyclic graphs and their connection with Formal Concept Analysis

Šimon Horvát1,∗, Ľubomír Antoni1, Ondrej Krídlo1, Alexander Szabari1 and Stanislav Krajči1

1 Institute of Computer Science, Faculty of Science, Pavol Jozef Šafárik University in Košice, 041 80 Košice, Slovakia

Abstract
We propose a generalization of decision jungles from a binary decision directed acyclic graph to a generalized decision directed acyclic graph. We describe the classification method based on a generalized decision directed acyclic graph. Moreover, we explore the properties of our proposed method and illustrate it with an example. We present our experiments on several datasets and provide a comparison of our results with related studies. A comparison of the generalized decision directed acyclic graph with the concept lattice of approval for a loan concludes our paper.

Keywords
Classification task, Decision directed acyclic graphs, Concept lattice, Generalization

Published in Pablo Cordero, Ondrej Krídlo (Eds.): The 16th International Conference on Concept Lattices and Their Applications, CLA 2022, Tallinn, Estonia, June 20–22, 2022, Proceedings, pp. 57–68.
∗ Corresponding author.
simon.horvat@upjs.sk (Š. Horvát); lubomir.antoni@upjs.sk (Ľ. Antoni); ondrej.kridlo@upjs.sk (O. Krídlo); alexander.szabari@upjs.sk (A. Szabari); stanislav.krajci@upjs.sk (S. Krajči)
ORCID: 0000-0002-3191-8469 (Š. Horvát); 0000-0002-7526-8146 (Ľ. Antoni); 0000-0001-8166-6901 (O. Krídlo); 0000-0001-5372-1096 (A. Szabari); 0000-0001-5612-3534 (S. Krajči)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

Decision jungles (Shotton et al. [30]) provide a compact ensemble method for classification tasks which extends the popular classification methods of decision trees [27, 28, 29, 4] and random forests [5, 6]. In particular, decision jungles, represented by an ensemble of decision directed acyclic graphs, require less memory than binary decision trees.

Decision directed acyclic graphs for classification tasks were explored in several recent studies [8, 17, 16, 21, 22, 23, 24, 26], including their connections with the operation of convolution. Laptev and Buhmann [21] proposed a novel classification method for images based on so-called transformation-invariant convolutional jungles. Ioannou et al. [17] explored the connections between decision jungles and convolutional neural networks and proposed a hybrid conditional network based on trees and convolutional neural networks. Ignatov and Ignatov [16] proposed an algorithm of decision streams given by a tree structure which is generated by merging nodes from various branches based on the similarity of their two-sample test statistics.

Moreover, the connections between tree structures and Formal Concept Analysis were recently investigated in [20, 15, 2, 12, 19]. Kuznetsov [20] described a model of learning from positive and negative examples in terms of Formal Concept Analysis. Guillas et al. [15] presented the first structural links between dichotomic lattices and decision trees. Bělohlávek et al. [2] proposed a novel method of induction of decision trees in which the concept lattice is seen as a collection of overlapping decision trees.
Links between random forests and pattern structures in Formal Concept Analysis were investigated by Krause et al. [19], as well. Dudyrev and Kuznetsov [12] proposed a combination of concept lattices and decision trees.

In this paper, we propose a generalization of decision jungles from binary decision directed acyclic graphs to generalized decision directed acyclic graphs. In Section 2, we present the basic notions of classification tasks with categorical attributes. Our novel method for the construction of a classifier based on a generalized directed acyclic graph is proposed in Section 3. We present the experiments with our novel method and a comparison of our results with other related studies in Section 4. Connections of our approach with Formal Concept Analysis are illustrated in Section 5.

2. Classification tasks with nominal attributes

In this section, we formally describe the basic notions of classification tasks with nominal (categorical) input attributes.

Let A be a set and let b ∉ A. Consider a set L_a for each a ∈ A and a set L_b for b. Then ∏_{a∈A} L_a denotes the set of all functions f with domain A such that f(a) ∈ L_a for all a ∈ A.

Table 1
Example of a training set for loan approval

Loan No.  Income   Savings  Self-Employed  Married  Approved
 1        high     high     no             no       1
 2        high     high     yes            no       1
 3        high     high     yes            yes      1
 4        high     high     yes            no       1
 5        low      high     no             yes      1
 6        low      high     no             yes      1
 7        low      high     no             yes      1
 8        high     low      no             yes      1
 9        high     low      yes            yes      1
10        high     low      yes            no       0
11        low      high     no             no       0
12        low      low      yes            yes      0
13        low      low      yes            no       0
14        low      low      yes            yes      0
15        low      high     yes            no       0
16        average  low      no             no       0

For classification tasks with nominal input attributes, the set A represents the set of input attributes and b is the target attribute. The system of sets (L_a : a ∈ A) represents the unordered sets of values (also called classes or categories) of the input attributes. The set L_b is an unordered set of values (classes, categories) of the target attribute. The number of values in L_b is usually low in classification tasks.

In Table 1, an example for A = {Income, Savings, Self-Employed, Married} is shown. The target attribute b ∉ A is loan approval, and the sets L_Income = {low, average, high}, L_Savings = {low, high}, L_Self-Employed = {yes, no}, L_Married = {yes, no}, and L_LoanApproved = {1, 0} are considered.

For k ∈ ℕ, a set S given by

S = {(x_1, y_1), …, (x_k, y_k)} ⊆ ∏_{a∈A} L_a × L_b

is called the training set with k objects (also called the labeled objects). In our example from Table 1, we have k = 16. In some cases, it can hold that x_i = x_j for some i, j ∈ {1, 2, …, k}, i ≠ j, but y_i ≠ y_j. This means that two people have the same values of the input attributes but different values of the target attribute. This is not the case in our example.

For m ∈ ℕ such that k < m, a set T given by

T = {x_{k+1}, …, x_m} ⊆ ∏_{a∈A} L_a

is called the testing set with m − k objects (also called the unlabeled objects). In Table 2, we illustrate a possible testing set for our example of loan approval.

Table 2
Example of a testing set for loan approval

Income  Savings  Self-Employed  Married
high    low      no             no
low     high     yes            yes

A function c : T → L_b is called a classifier. For a possible fuzzy modification, we can consider a function c : T → [0, 1]^{L_b}.

In the following sections, we present the construction of a novel classifier based on generalized decision directed acyclic graphs and describe the possible connections of this classifier with a concept lattice in Formal Concept Analysis.
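Before moving on, a minimal Python sketch (our own illustration; the variable and function names are not part of the formalism) restates these notions: each object x is a function from A to attribute values, represented here by a dictionary, the training set S is a list of pairs (x, y), and a classifier is any function mapping an unlabeled object to a value of L_b.

    # Notions of Section 2 restated in code (illustrative names only).
    from typing import Dict, List, Tuple

    A = ["Income", "Savings", "Self-Employed", "Married"]   # the set of input attributes
    L_b = [1, 0]                                            # values of the target attribute b

    # Training set S ⊆ ∏_{a∈A} L_a × L_b: each x is a function A → L_a, stored as a dict.
    S: List[Tuple[Dict[str, str], int]] = [
        ({"Income": "high", "Savings": "high", "Self-Employed": "no", "Married": "no"}, 1),
        ({"Income": "high", "Savings": "high", "Self-Employed": "yes", "Married": "no"}, 1),
        # ... the remaining fourteen rows of Table 1 ...
        ({"Income": "average", "Savings": "low", "Self-Employed": "no", "Married": "no"}, 0),
    ]

    # Testing set T ⊆ ∏_{a∈A} L_a: the unlabeled objects of Table 2.
    T: List[Dict[str, str]] = [
        {"Income": "high", "Savings": "low", "Self-Employed": "no", "Married": "no"},
        {"Income": "low", "Savings": "high", "Self-Employed": "yes", "Married": "yes"},
    ]

    # A classifier c : T → L_b; here only a trivial majority-class baseline.
    def majority_class_classifier(x: Dict[str, str]) -> int:
        counts = {y: sum(1 for _, label in S if label == y) for y in L_b}
        return max(counts, key=counts.get)

    print([majority_class_classifier(x) for x in T])

A fuzzy classifier c : T → [0, 1]^{L_b} would instead return one degree for each value of L_b.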
3. Generalized decision directed acyclic graph for classification tasks

Shotton et al. [30] and Pohlen [26] provide an ensemble of classifiers that is based on rooted binary decision directed acyclic graphs [24]. A rooted binary decision directed acyclic graph is a directed graph with one root node (in-degree 0), multiple split nodes (in-degree at least 1, out-degree 2), and multiple leaf nodes (in-degree at least 1, out-degree 0). To construct a classifier based on a decision directed acyclic graph, Shotton et al. [30] optimize, for the root and for each internal node, the split attribute, the numerical threshold of the split, and the left and right children of the split node. They optimize these parameters by the LSearch or ClusterSearch algorithms with respect to the numerical attributes. Pohlen [26] noted that an attribute map can be used to process categorical data in this setting as well; however, it can lead to a growing depth of the binary decision directed acyclic graph.

These approaches inspire us to propose a generalized version of the decision directed acyclic graph that processes categorical data without attribute transformation. Moreover, we aim to eliminate the growing depth of a binary decision directed acyclic graph by a generalized decision directed acyclic graph. Finally, we compare the performance and memory efficiency of our proposed solution with those from [30, 26].

Table 3
Comparison of two types of directed acyclic graphs

                 Rooted binary directed        Rooted generalized directed
                 acyclic graph [30, 26]        acyclic graph (our approach)
Type of node     in-degree    out-degree       in-degree    out-degree
root             0            2                0            ≥ 2
internal node    ≥ 1          2                ≥ 1          ≥ 2
leaf node        ≥ 1          0                ≥ 1          0

First, the definition of a rooted generalized directed acyclic graph follows.

Definition 1. A rooted generalized directed acyclic graph is a directed graph G = (V, E) such that:
(1) A unique root node r ∈ V has zero incoming edges and at least two outgoing edges;
(2) There exist no directed cycles in G;
(3) There exists a directed path from the root node r ∈ V to each other node v ∈ V;
(4) Each node v ∈ V ∖ {r} has either zero outgoing edges (then it is called a leaf) or at least two outgoing edges (then it is called an internal node), and at least one incoming edge.

The comparison of a rooted generalized directed acyclic graph with the rooted binary directed acyclic graph applied in [26, 30] is shown in Table 3. In the following part, we define a generalized decision directed acyclic graph.

Definition 2. Let A be a set of n categorical attributes and let b ∉ A be a target attribute. A generalized decision directed acyclic graph is a rooted generalized directed acyclic graph G = (V, E) such that:
(1) All directed paths from the root node r to a node v have the same length;
(2) A triple (d_v, b_v, U_v) is assigned to the root node and to each internal node v ∈ V, whereby
  (a) d_v ∈ {1, 2, …, n} is an index of an attribute,
  (b) b_v ∈ {1, 2, …, |L_b|} is a class index of the target attribute b,
  (c) U_v = {u ∈ V : (v, u) ∈ E} is the set of children of v such that |U_v| ≤ |L_{a_{d_v}}| and g_v : U_v → L_{a_{d_v}} is an injective mapping;
(3) A class index b_w ∈ {1, 2, …, |L_b|} of the target attribute b is assigned to each leaf node w ∈ V.

In the following subsection, we describe the algorithm for generating a generalized decision directed acyclic graph from a given training set S as described in Section 2.
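Before turning to the learning algorithm, the following minimal Python sketch illustrates the structures of Definition 2. The class and field names are our own, and the traversal used for prediction (following the edges chosen by g_v, and falling back to the node's class index when a value has no outgoing edge) is our assumption of how such a graph can be used for classification; it is not the authors' implementation.

    # Data structures of Definition 2 (illustrative sketch only).
    from dataclasses import dataclass, field
    from typing import Dict, Optional


    @dataclass
    class Node:
        # Class index b_v of the target attribute (assigned to every node).
        target_class_index: int
        # Attribute index d_v; None for leaf nodes, which carry no split.
        attribute_index: Optional[int] = None
        # Children U_v together with the injective mapping g_v : U_v → L_{a_{d_v}},
        # stored here in the reverse direction as attribute value → child node.
        children: Dict[str, "Node"] = field(default_factory=dict)

        def is_leaf(self) -> bool:
            return not self.children


    def classify(root: Node, x: Dict[int, str]) -> int:
        """Follow the edges selected by g_v; x maps attribute indices to values."""
        node = root
        while not node.is_leaf():
            value = x[node.attribute_index]
            if value not in node.children:
                # No outgoing edge for this value: fall back to the node's class index.
                return node.target_class_index
            node = node.children[value]
        return node.target_class_index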
3.1. Algorithm for learning a generalized decision directed acyclic graph

Suppose that we have a training set S and we aim to learn a generalized decision directed acyclic graph from S. By Definition 2, we aim to find a triple (d_v, b_v, U_v) for the root node and for each internal node v ∈ V. Moreover, we aim to find a class index b_w for each leaf node w ∈ V. In Algorithm 1, we describe our proposed procedure for learning a generalized decision directed acyclic graph from a training set S. Due to the limited number of pages, we explain only the substantial parts of our algorithm in this paper. Additional important notions, the formal descriptions of the mappings, and their properties will be included in the extended version of our paper.

Without loss of generality, suppose that A is a set of attributes, b is a target attribute, and P_i is the set of all parent nodes at some fixed level i ∈ I (see Figure 1). Let U_p be the set of all child nodes of p for every p ∈ P_i. Let S_u denote the set of all labeled objects from the training set S that reach a node u ∈ U_p.

The procedure AttributeIndexOfNode assigns the attribute index d_u to u (last but one line of Figure 1). This assignment is based on the entropy function h with a discrete probability distribution 𝒯_{S_u}. Note that for u ∈ U_p and z ∈ L_{a_{d_u}}, the set

S_u^z = {(x, y) ∈ S_u : x_{d_u} = z}

is the subset of the training set S corresponding to a given child node u and an attribute value z (see Algorithm 1).

The procedure TargetClassIndexOfNode assigns the class index b_u of the target attribute b to u (last line of Figure 1). The class index of the target attribute with the maximal number of objects in S_u is selected; if two or more classes attain the same maximal number, we take the class with the least index.

Finally, the parent nodes P_{i+1} for level i + 1 are obtained from the system (U_p : p ∈ R_i) by the procedure MergeChildNodes, which merges child nodes with the same attribute and class indices at level i (Figure 2). Note that R_i denotes the set of parent nodes that are internal at level i. We present the full procedure for learning a generalized decision directed acyclic graph in Algorithm 1.

Algorithm 1: Procedure for learning a generalized decision directed acyclic graph
  Data: Set of attributes A, target attribute b, training set S
  Result: Generalized decision directed acyclic graph G matching S
  i ← 1;                                       /* level of the graph */
  P_i ← {r};                                   /* parent node at level 1 */
  d_r ← AttributeIndexOfNode(r);
  b_r ← TargetClassIndexOfNode(r);
  while P_i ≠ ∅ do
      R_i ← ∅;                                 /* internal nodes */
      for p ∈ P_i do
          if h(𝒯_{S_p}) ≠ 0 then
              s ← |{z ∈ L_{a_{d_p}} : S_p^z ≠ ∅}|;      /* number of child nodes */
              U_p ← {u_{p,j} : j ∈ {1, …, s}};          /* new child nodes */
              for u ∈ U_p do
                  d_u ← AttributeIndexOfNode(u);
                  b_u ← TargetClassIndexOfNode(u);
              end
              if |{d_u : u ∈ U_p}| = s then
                  R_i ← R_i ∪ {p};
              end
          end
      end
      P_{i+1} ← MergeChildNodes({U_p : p ∈ R_i});
      i ← i + 1;
  end

Figure 1: Procedures AttributeIndexOfNode and TargetClassIndexOfNode at level i (parent nodes p_1, …, p_4 with their child nodes u, each child labeled with its assigned attribute index a_{d_u} and class value y_{b_u})

Figure 2: Procedure MergeChildNodes at level i (child nodes with the same attribute and class indices are merged and form the parent nodes P_{i+1})

4. Experiments

We conducted several experiments to evaluate the performance and memory efficiency of our proposed method in comparison with other recent studies, mainly with binary decision directed acyclic graphs and decision forests [30, 26]. To compare the memory consumption of our method with other studies, we use the estimation of model sizes described in [30, 26]. The complete table of estimated memory consumption for nodes and leaves of the various structures and approaches (including our approach) is shown in Table 4.

We tested the performance and the memory efficiency of our method on four datasets (Connect4, Letter Recognition, Shuttle, and USPS) from the UCI Machine Learning Repository [10] to compare our results with [30, 26].
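For illustration, a model-size estimate can be computed from the per-node byte counts listed in Table 4. The short sketch below is our own helper (the function name, its arguments, and the node counts in the usage example are made up), not the exact accounting used in [30, 26].

    # Rough model-size estimate following the per-node byte counts of Table 4.
    BYTES_PER_INTERNAL_NODE = {
        "tree": 16,              # d_v (4) + threshold θ_v (8) + left child l_v (4)
        "binary_dag": 20,        # d_v (4) + θ_v (8) + l_v (4) + r_v (4)
        "generalized_dag": 16,   # d_v (4) + u_1 ∈ U_v (4) + neighbor (4) + g_v(u_1) (4)
    }

    def estimated_model_size(structure: str, internal_nodes: int, leaves: int,
                             target_classes: int) -> int:
        """Estimated size of one classifier in bytes; each leaf stores 4·|L_b| bytes."""
        return internal_nodes * BYTES_PER_INTERNAL_NODE[structure] + leaves * 4 * target_classes

    # Hypothetical example: an ensemble of 15 generalized decision DAGs, each with
    # 200 internal nodes and 300 leaves, on a binary target attribute.
    size_bytes = 15 * estimated_model_size("generalized_dag", 200, 300, 2)
    print(f"{size_bytes / 2**20:.3f} MB")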
The final results of the random forest, the ensemble of 15 decision directed acyclic graphs, and our ensemble of 15 generalized decision directed acyclic graphs on memory consumption and test accuracy are shown in Table 5. For accuracy, we present the arithmetic mean of the test accuracies obtained by the ensembles of 15 classifiers under random 5-fold cross-validation.

Table 4
Estimation of memory consumption for each node

Type of node                              Variable     Size in bytes   Total size in bytes
Internal node v of tree [5]               d_v          4
                                          θ_v          8
                                          l_v          4               16
Internal node v of decision               d_v          4
directed acyclic graph [30, 26]           θ_v          8
                                          l_v          4
                                          r_v          4               20
Internal node v of our generalized        d_v          4
decision directed acyclic graph           u_1 ∈ U_v    4
                                          neighbor     4
                                          g_v(u_1)     4               16
Leaf v (all structures)                                                4·|L_b|

Table 5
Comparison of methods on different datasets

                                     Random          Ensemble of 15 decision     Ensemble of 15 generalized
                                     forest [26]     directed acyclic            decision directed acyclic
Dataset              Indicator       (15 trees)      graphs [26, 30]             graphs (our paper)
Connect4             Accuracy        81.5%           82.0%                       82.1%
                     Memory size     4.75 MB         1.51 MB                     0.22 MB
Letter Recognition   Accuracy        95.6%           95.7%                       96.3%
                     Memory size     4.89 MB         1.50 MB                     0.21 MB
Shuttle              Accuracy        99.9%           99.9%                       99.9%
                     Memory size     0.03 MB         0.11 MB                     0.005 MB
USPS                 Accuracy        96.0%           96.0%                       90.0%
                     Memory size     0.36 MB         0.30 MB                     0.035 MB

5. Connections with Formal Concept Analysis

The generalized decision directed acyclic graph for our loan approval example (see Table 1) is illustrated in Figure 3. Note that the two numbers inside each node give the counts of labeled objects in the two classes of the target attribute (categories 1 and 0). The sets written next to the nodes in Figure 3 correspond to the indices of people (first column in Table 1) with category 1 of loan approval.

Regarding the research results in the area of Formal Concept Analysis [14, 13], the positive examples from the training set S (i.e., the labeled objects with the value 1 of the target attribute b in Table 1) can be considered as a (binary) formal context ⟨O, A^× ∪ {⟨b, 1⟩}, R⟩ such that 1 ∈ L_b, R(o, ⟨b, 1⟩) = 1, and

A^× = ⋃_{a∈A} ({a} × L_a).

Note that a conceptual scale [14] or one of the several fuzzy extensions of Formal Concept Analysis [1, 3, 7, 9, 11, 18, 25] can also be applied here.

Figure 3: Generalized decision directed acyclic graph of approval for a loan (the root splits on income; subsequent levels split on savings, self-employed, and married)
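A minimal Python sketch of this construction for the loan data follows (our own illustration; the dictionary-based representation and the names are assumptions): it scales the nominal attributes of the positive examples into the binary formal context ⟨O, A^× ∪ {⟨b, 1⟩}, R⟩ and derives the intent of a set of objects.

    # Binary formal context of the positive examples from Table 1 (nominal scaling).
    from typing import Dict, List, Set, Tuple

    O: Dict[int, Dict[str, str]] = {   # objects 1-9 (loan approved = 1), input attributes only
        1: {"income": "high", "savings": "high", "self-employed": "no", "married": "no"},
        2: {"income": "high", "savings": "high", "self-employed": "yes", "married": "no"},
        3: {"income": "high", "savings": "high", "self-employed": "yes", "married": "yes"},
        4: {"income": "high", "savings": "high", "self-employed": "yes", "married": "no"},
        5: {"income": "low", "savings": "high", "self-employed": "no", "married": "yes"},
        6: {"income": "low", "savings": "high", "self-employed": "no", "married": "yes"},
        7: {"income": "low", "savings": "high", "self-employed": "no", "married": "yes"},
        8: {"income": "high", "savings": "low", "self-employed": "no", "married": "yes"},
        9: {"income": "high", "savings": "low", "self-employed": "yes", "married": "yes"},
    }

    L: Dict[str, List[str]] = {        # value sets L_a of the input attributes
        "income": ["low", "average", "high"],
        "savings": ["low", "high"],
        "self-employed": ["yes", "no"],
        "married": ["yes", "no"],
    }

    # A^× = ⋃_{a∈A} ({a} × L_a), extended with the target attribute ⟨b, 1⟩.
    M: List[Tuple[str, str]] = [(a, v) for a, values in L.items() for v in values]
    M.append(("loan approved", "1"))

    # Incidence relation R: an object has ⟨a, v⟩ iff its value of a equals v,
    # and every positive example has ⟨loan approved, 1⟩.
    R: Set[Tuple[int, Tuple[str, str]]] = {
        (o, m) for o, row in O.items() for m in M
        if m == ("loan approved", "1") or row.get(m[0]) == m[1]
    }

    def intent(objects: Set[int]) -> Set[Tuple[str, str]]:
        """Attributes shared by all given objects (the derivation operator)."""
        return {m for m in M if all((o, m) in R for o in objects)}

    print(intent({1, 2, 3, 4}))
    # {('income', 'high'), ('savings', 'high'), ('loan approved', '1')}

The derived intent matches the formal concept discussed below, up to the always-present attribute ⟨loan approved, 1⟩.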
In our example from Table 1, the set of objects O contains the people with loan approval equal to 1 (rows 1–9), and the set of attributes includes the pairs ⟨income, high⟩, ⟨income, low⟩, ⟨savings, high⟩, ⟨savings, low⟩, ⟨self-employed, yes⟩, ⟨self-employed, no⟩, ⟨married, yes⟩, ⟨married, no⟩, and ⟨loan approved, 1⟩. The corresponding concept lattice of loan approval is shown in Figure 4.

The circles of the concept lattice filled with black color (Figure 4) correspond to the nodes of the generalized decision directed acyclic graph (Figure 3) in which the majority of people are approved for a loan. For example, the extent {1, 2, 3, 4} of the formal concept highlighted by a black circle in the right part of the concept lattice corresponds to the path of high income and high savings in the generalized decision directed acyclic graph; the intent of this formal concept is {⟨income, high⟩, ⟨savings, high⟩}. However, the concept lattice offers the possibility to explore other groups of people which are not obtained by the generalized decision directed acyclic graph, since the entropy function returns only the best splitting attribute for each node.

The second concept lattice of Table 1 can be constructed by taking the formal context ⟨O, A^× ∪ {⟨b, 0⟩}, R⟩ such that O is the set of labeled objects with the value 0 of the target attribute. Then, we obtain the formal concepts that can correspond to the paths, in the generalized decision directed acyclic graph, of people who are not approved for a loan.

Figure 4: Concept lattice of approval for a loan (own figure)

6. Conclusions and future work

In this paper, we proposed a classification method of generalized decision directed acyclic graphs, which generalizes decision jungles from rooted binary directed acyclic graphs to rooted generalized directed acyclic graphs. In the extended version of our paper, we will present the properties of generalized decision directed acyclic graphs in general. Moreover, we will describe the functions used in our algorithm in more detail and emphasize their importance for our experimental results. The more detailed connections between generalized decision directed acyclic graphs and Formal Concept Analysis will be explored in the extended version of the paper, as well.

Our future research aims to propose algorithms for generalized decision directed acyclic graphs which can be applied to regression tasks with a numerical target attribute. Moreover, another possibility is to apply genetic algorithms in the search for optimal parameters, optimize the learning process, and improve the performance of the algorithms on the testing datasets.

Acknowledgments

This work was partially supported by the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic under contract VEGA 1/0645/22 Proposal of novel methods in the field of Formal Concept Analysis and their application. This article was partially supported by the project KEGA 012UPJŠ-4/2021. This work was supported by the Slovak Research and Development Agency under contract No. APVV-21-0468.

References

[1] Bělohlávek, R.: Fuzzy concepts and conceptual structures: induced similarities. In: JCIS ’98 Proceedings, International Conference on Computer Science and Informatics, pp. 179–182. Association for Intelligent Machinery (1998)
[2] Bělohlávek, R., De Baets, B., Outrata, J., Vychodil, V.: Inducing decision trees via concept lattices. Int. J. Gen. Syst. 38(4), 455–467 (2009)
[3] Ben Yahia, S., Jaoua, A.: Discovering knowledge from fuzzy concept lattice. In: Kandel, A., Last, M., Bunke, H. (eds.) Data Mining and Computational Intelligence, pp. 169–190. Physica-Verlag (2001)
[4] Breiman, L., Friedman, J. H., Olshen, R. A., Stone, C. J.: Classification and regression trees. Chapman and Hall/CRC (1984)
[5] Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
[6] Breiman, L.: Random Forests. Mach. Learn. 45, 5–32 (2001)
[7] Burusco, A., Fuentes-González, R.: The study of L-fuzzy concept lattice. Mathw. Soft Comput. 3, 209–218 (1994)
[8] Busa-Fekete, R., Benbouzid, D., Kegl, B.: Fast classification using sparse decision DAGs. In: Proceedings of the 29th International Conference on Machine Learning 2012, 8 pages. Omnipress (2012)
[9] Cordero, P., Enciso, M., Mora, Á., Ojeda-Aciego, M., Rossi, C.: A Formal Concept Analysis Approach to Cooperative Conversational Recommendation. Int. J. Comput. Intell. Syst. 13(1), 1243–1252 (2020)
[10] Dua, D., Graff, C.: UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science (2019), http://archive.ics.uci.edu/ml. Last accessed 27 March 2022
[11] Dubois, D., Medina, J., Prade, H., Ramírez-Poussa, E.: Disjunctive attribute dependencies in formal concept analysis under the epistemic view of formal contexts. Inf. Sci. 561, 31–51 (2021)
[12] Dudyrev, E., Kuznetsov, S. O.: Decision Concept Lattice vs. Decision Trees and Random Forests. In: Braud, A., Buzmakov, A., Hanika, T., Le Ber, F. (eds.) ICFCA 2021, LNCS (LNAI), vol. 12733, pp. 252–260. Springer, Heidelberg (2021)
[13] Ferré, S., Huchard, M., Kaytoue, M., Kuznetsov, S. O., Napoli, A.: Formal concept analysis: from knowledge discovery to knowledge processing. In: A Guided Tour of Artificial Intelligence Research, Vol. II: AI Algorithms, pp. 411–445. Springer (2020)
[14] Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. Springer Verlag (1999)
[15] Guillas, S., Bertet, K., Ogier, J. M., Girard, N.: Some links between decision tree and dichotomic lattice. In: Bělohlávek, R., Kuznetsov, S. O. (eds.) CLA 2008, Proceedings of the Sixth International Conference on Concept Lattices and Their Applications, pp. 193–205. Palacký University, Olomouc (2008)
[16] Ignatov, D., Ignatov, A.: Decision stream: Cultivating deep decision trees. In: IEEE 29th International Conference on Tools with Artificial Intelligence 2017, pp. 905–912. IEEE (2017)
[17] Ioannou, Y., Robertson, D., Zikic, D., Kontschieder, P., Shotton, J., Brown, M., Criminisi, A.: Decision forests, convolutional networks and the models in-between (2016), https://arxiv.org/pdf/1603.01250.pdf. Last accessed 27 March 2022
[18] Krajči, S.: A generalized concept lattice. Logic J. IGPL 13, 543–550 (2005)
[19] Krause, T., Lumpe, L., Schmidt, S.: A link between pattern structures and random forests. In: Valverde-Albacete, F. J., Trnečka, M. (eds.) Proceedings of the 15th International Conference on Concept Lattices and Their Applications, CLA 2020, pp. 131–143. CEUR-WS.org (2020)
[20] Kuznetsov, S. O.: Machine learning and formal concept analysis. In: Eklund, P. (ed.) ICFCA 2004: Concept Lattices, pp. 287–312. Springer, Berlin, Heidelberg (2004)
[21] Laptev, D., Buhmann, J. M.: Transformation-invariant convolutional jungles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3043–3051. IEEE (2015)
[22] Oliver, J. J.: Decision graphs – an extension of decision trees. In: Proceedings of the Fourth International Workshop on Artificial Intelligence and Statistics, pp. 343–350. Fort Lauderdale, Florida (1993)
[23] Peterson, A. H., Martinez, T. R.: Reducing decision trees ensemble size using parallel decision DAGs. Int. J. Artif. Intell. Tools 18(4), 613–620 (2009)
[24] Platt, J. C., Cristianini, N., Shawe-Taylor, J.: Large margin DAGs for multiclass classification. In: NIPS’99: Proceedings of the 12th International Conference on Neural Information Processing Systems, pp. 547–553. MIT Press, Cambridge (1999)
[25] Pócs, J., Pócsová, J.: Basic theorem as representation of heterogeneous concept lattices. Front. Comput. Sci. 9(4), 636–642 (2015)
[26] Pohlen, T.: Decision Jungles. Seminar Report, Fakultät für Mathematik, Informatik und Naturwissenschaften, Lehr- und Forschungsgebiet Informatik (2014)
[27] Quinlan, J. R.: Induction of Decision Trees. Mach. Learn. 1(1), 81–106 (1986)
[28] Quinlan, J. R.: Unknown attribute values in induction. In: Proceedings of the Sixth International Machine Learning Workshop, pp. 164–168. Morgan Kaufmann (1989)
[29] Quinlan, J. R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1992)
[30] Shotton, J., Sharp, T., Kohli, P., Nowozin, S., Winn, J., Criminisi, A.: Decision jungles: Compact and rich models for classification. In: NIPS’13: Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 234–242. Curran Associates Inc. (2013)