<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Evaluation of Decision Table Decomposition Using Dynamic Programming Classifiers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michal Mankowski</string-name>
          <email>m.mankowski@stud.elka.pw.edu.pl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tadeusz Luba</string-name>
          <email>luba@tele.elka.pw.edu.pl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cezary Jankowski</string-name>
          <email>c.jankowski@stud.elka.pw.edu.pl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Warsaw University of Technology, Institute of Radioelectronics</institution>
          ,
          <addr-line>Warsaw</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Warsaw University of Technology, Institute of Telecommunications</institution>
          ,
          <addr-line>Warsaw</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <fpage>34</fpage>
      <lpage>43</lpage>
      <abstract>
        <p>Decision table decomposition is a method that decomposes a given decision table into an equivalent set of decision tables. Decomposition can enhance the quality of knowledge discovered from databases by simplifying the data mining task. This paper describes the decision table decomposition method and evaluates it for data classification. Additionally, a novel method of obtaining attribute sets for decomposition is introduced. Experimental results demonstrate that decomposition can reduce memory requirements while preserving classification accuracy.</p>
      </abstract>
      <kwd-group>
        <kwd>Data Mining</kwd>
        <kwd>Decomposition</kwd>
        <kwd>Classification</kwd>
        <kwd>Dynamic Programming</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The increasing amount of data requires the development of new data analysis methods. A common
approach in data mining is to make predictions based on decision tables.
Decomposition of a decision table into smaller subtables can be obtained by the divide-and-conquer
strategy. This idea comes from logic synthesis and functional decomposition [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>The fundamentals of decision systems and logic synthesis are different, but there are
many similarities between them. A decision system is usually described by a
decision table, and the combinational logic of a digital system by a truth table. Input variables
of digital systems correspond to conditional attributes. Therefore, many concepts from
logic synthesis can be extended to data mining. Functional decomposition can be
used to build a hierarchical decision system.</p>
      <p>
        Functional decomposition was first used in the logic synthesis of digital systems.
There, decomposition involves breaking a large logic function, which is
difficult to implement, into several smaller ones, which are easier to implement. A
similar problem in machine learning relies on disassembling a decision table into
subtables in such a way that the original decision table can be recreated through a
series of operations corresponding to hierarchical decision making. Most
importantly, we can induce noticeably simpler decision rules and trees for the
resulting components that finally make the same decision as the original decision table.
[
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2–4</xref>
        ]
      </p>
      <p>
        For the evaluation of decomposition, decision tree and decision rule classifiers based on
extensions of dynamic programming [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] were used. Decision trees and rules were
sequentially optimized for different cost functions (for example, the number of
misclassifications and the depth of decision trees). For decision trees and rules, this
approach allows describing a set of trees or rules by a directed acyclic graph (DAG).
      </p>
    </sec>
    <sec id="sec-2">
      <title>Basic Concepts</title>
      <sec id="sec-2-1">
        <title>Preliminary Notions</title>
        <p>An information system is a pair A = (U, A), where U is a non-empty, finite set of objects
called the universe, and A is a non-empty, finite set of attributes, i.e., each element a ∈ A
is a function from U into Va, where Va is the domain of a, called the value set of a. Then,
the function ρ maps the product of U and A into the set of all values. The value of an
attribute for a specific object is denoted by ρ(ut, ai), with ut ∈ U, ai ∈ A.</p>
        <p>One or more distinguished attributes from the set A of an information system may indicate
a decision based on the rest of the attributes. Such an information system is called a decision system.
Formally, a decision system is an information system denoted by A = (U, A ∪ D), where
A ∩ D = ∅. Attributes in set A are referred to as conditional attributes, while attributes
in set D are referred to as decision attributes. When the function ρ
maps U × (A ∪ D) into the set of all attribute values, such a system is called a decision
table.</p>
        <p>Let A = (U, A) be an information system. For each subset B ⊆ A we define
the B-indiscernibility relation INDA(B):</p>
        <p>INDA(B) = {(up, uq) ∈ U² : ∀ai ∈ B, ρ(up, ai) = ρ(uq, ai)}   (1)
The attribute values of ai, i.e. ρpi = ρ(up, ai) and ρqi = ρ(uq, ai), are compatible (ρpi ∼
ρqi) if and only if ρpi = ρqi or ρpi = ∗ or ρqi = ∗, where "∗" represents the attribute
value "do not care". Otherwise, ρpi and ρqi are not compatible (ρpi ≁ ρqi).</p>
        <p>The consequence of this definition is compatibility relation COMA(B) associated
with every B ⊆ A :</p>
        <p>
          COMA(B) = {(up, uq) ∈ U² : ∀ai ∈ B, ρ(up, ai) ∼ ρ(uq, ai)}   (2)
COMA(B) classifies objects by grouping them into compatibility classes, i.e.
U/COMA(B), where B ⊆ A. The collection of subsets U/COMA(B) is called an r-partition
on U and is denoted by ΠA(B). An r-partition on a set U may be viewed as a collection of
non-disjoint subsets of U whose union is equal to U. All symbols and operations
of partition algebra [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] are applicable to r-partitions. The r-partition generated by a set
B is the product of the r-partitions generated by the attributes ai ∈ B:
ΠA(B) = ∩i ΠA({ai})   (3)
If B = {ai1, ..., aik}, the product can be expressed as: Π(B) = Π(ai1) · ... · Π(aik).
We will often write · instead of ∩.
        </p>
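        <p>As an illustration, the compatibility relation (2) and the grouping of objects into non-disjoint blocks can be sketched in a few lines of Python (the toy table and attribute values below are hypothetical, not from the paper; each block is simply the set of objects compatible with one fixed object, a straightforward way to enumerate candidate classes):</p>
        <p>
```python
# Hypothetical toy table: rows are objects, columns are conditional
# attributes; "*" stands for the "do not care" value.
U = [
    ("0", "1", "*"),
    ("0", "1", "0"),
    ("1", "*", "0"),
    ("1", "0", "1"),
]

def compatible(row_p, row_q, attrs):
    """Relation (2): values agree on every chosen attribute, up to '*'."""
    return all(row_p[i] == row_q[i] or row_p[i] == "*" or row_q[i] == "*"
               for i in attrs)

def r_partition(rows, attrs):
    """For each object, collect the set of objects compatible with it.

    The resulting blocks may overlap, matching the non-disjoint
    character of an r-partition.
    """
    blocks = []
    for p, row in enumerate(rows):
        block = frozenset(q for q, other in enumerate(rows)
                          if compatible(row, other, attrs))
        if block not in blocks:
            blocks.append(block)
    return blocks
```
        </p>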
      </sec>
      <sec id="sec-2-2">
        <title>Hierarchical Decomposition</title>
        <p>To compress data and accelerate computations, hierarchical decomposition can be
applied. The goal is to break down a decision table into two smaller subtables.</p>
        <p>Let F be a functional dependency D = F(A) for a consistent decision system A =
(U, A∪D), where A is a set of conditional attributes and D is a set of decision attributes.
Let B1, B2 be subsets of A such that A = B1 ∪ B2 and B1 ∩ B2 = ∅. A simple
hierarchical decomposition relative to B1, B2 exists for F(A) if and only if:
F(A) = H(B1, G(B2)) = H(B1, δ)
(4)
where G and H represent the following functional dependencies: G(B2) = δ and
H(B1, δ) = D, where δ is an intermediate attribute. The outputs of functions F(A) and
H are exactly the same. In other words, we try to find a function H depending on the
variables of the set B1 as well as on the output δ of a function G depending on the set B2.</p>
        <p>In Theorem 1, the r-partition ΠG represents component G, and the product of
r-partitions Π(B1) and ΠG corresponds to H. The decision tables of the resulting
components can be easily obtained from these r-partitions.</p>
        <p>According to Theorem 1, the main problem is to find a partition ΠG. To solve
this problem, it is appropriate to consider a subset B2 of the original attributes and the m-block
partition Π(B2) = {K1, K2, ..., Km} generated by that subset. Two blocks
Ki, Kj of partition Π(B2) are compatible if and only if the partition Π′ obtained from
Π(B2) by joining the blocks Ki and Kj into a single block Kij (without changing the
other ones) satisfies equation (5), i.e. iff Π(B1) · Π′ ≤ Π(D). Otherwise, Ki, Kj are
incompatible.</p>
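        <p>For ordinary (disjoint) partitions over object indices, this compatibility test can be sketched as follows (a hypothetical helper assuming the decomposition condition Π(B1) · ΠG ≤ Π(D); partitions are represented as sets of frozensets of object indices, and "do not care" values are ignored for simplicity):</p>
        <p>
```python
def product(p1, p2):
    """Block-wise intersection of two partitions."""
    return {b1 & b2 for b1 in p1 for b2 in p2 if b1 & b2}

def refines(p1, p2):
    """p1 <= p2: every block of p1 lies inside some block of p2."""
    return all(any(b1 <= b2 for b2 in p2) for b1 in p1)

def mergeable(Ki, Kj, pi_B2, pi_B1, pi_D):
    """Ki, Kj are compatible iff, after merging them into one block,
    the product of Pi(B1) and the merged partition still refines Pi(D)."""
    merged = (pi_B2 - {Ki, Kj}) | {Ki | Kj}
    return refines(product(pi_B1, merged), pi_D)
```
        </p>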
        <p>For the decision table from Table 1 and sets of attributes B1 = {a0, a5}, B2 =
{a1, a2, a3, a4}, the following set of incompatible pairs can be found:
E = {(K1, K8), (K2, K4), (K2, K8), (K3, K7), (K4, K5)}. A subset of n partition
blocks, Π(B2) = {Ki1, Ki2, ..., Kin}, where Kij ∈ Π(B2), is a compatible class
of Π(B2) partition blocks iff all blocks of that subset are pairwise compatible. A
compatibility class is referred to as a maximal compatibility class (MCC) iff it is not
contained in any other compatibility class of the partition concerned.</p>
        <p>The decomposition process may be interpreted in terms of an incompatibility graph
(Fig. 1). The edges represent the incompatible pairs of partition Π(B2): (K1, K8),
(K2, K4), (K2, K8), (K3, K7), (K4, K5). It is clearly visible that a proper coloring
of the graph specifies the compatible classes: {K1, K2, K3, K5}, {K4, K6, K7, K8}
and, as a consequence, the partition ΠG = {0, 1, 2, 4, 5, 7, 9; 3, 6, 8}.</p>
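        <p>The coloring step can be sketched with a greedy algorithm over the incompatible pairs from the example (blocks with no conflicts, such as K6, may land in either class, so the grouping can differ slightly from the one given in the text):</p>
        <p>
```python
# Incompatible pairs from the example; blocks K1..K8 are graph vertices.
edges = [(1, 8), (2, 4), (2, 8), (3, 7), (4, 5)]
vertices = range(1, 9)

# Build the adjacency structure of the incompatibility graph.
adj = {v: set() for v in vertices}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

# Greedy proper coloring: give each vertex the smallest color
# not already used by one of its neighbors.
color = {}
for v in vertices:
    used = {color[n] for n in adj[v] if n in color}
    color[v] = next(c for c in range(len(vertices)) if c not in used)

# Blocks sharing a color are pairwise compatible.
classes = {}
for v, c in color.items():
    classes.setdefault(c, set()).add(v)
```
With this vertex order the two classes come out as {1, 2, 3, 5, 6} and {4, 7, 8}; any proper two-coloring yields a valid pair of compatible classes.
        </p>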
        <p>
          Another approach to building an incompatibility graph is to create a labeled
partition matrix [
          <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
          ] (Table 2). It should be noted that the columns represent all possible
combinations of the attribute values in B2. Each column thus denotes the behavior of the
decision table when the attributes in the B1 set are constant. Therefore, each column can be
treated as an object from the decision table. To build the incompatibility graph, it is necessary to
apply equation (2) to each pair of columns. When the compatibility relation is met,
the pair is compatible; otherwise, it is incompatible.
Simple hierarchical decomposition requires dividing the set of conditional attributes A
into two disjoint subsets B1 and B2. The proposed idea of obtaining these sets is based on an
attribute relationship called attribute dependency from Rough Set theory [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>Let C and B be sets of attributes. B depends entirely on a set of attributes
C, denoted C ⇒ B, if all values of attributes from B are uniquely determined by the
values of attributes from C. B depends on C in degree k, 0 ≤ k ≤ 1, where
k = γ(C, B) = |POSC(B)| / |U|   (6)
and
POSC(B) = ∪X∈U/B C∗(X)   (7)
called the positive region of the partition U/B with respect to C, is the set of all
elements of U that can be uniquely classified to blocks of the partition U/B by means
of C.</p>
        <p>The proposed method allows us to measure the dependency between all possible pairs of
conditional attributes and the decision attribute. The related dependency of one conditional
attribute can be generated from a given information system A = (U, A ∪ {d}), where
A = {a0, ..., ak} is the set of conditional attributes and d is the decision attribute:
r(x) = (Σi=0..k γ({x, ai}, {d})) / |A|,   x ∈ A   (8)</p>
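        <p>A small sketch of the dependency degree γ(C, {d}) and the related dependency r(x), assuming a toy table of dictionaries (illustrative data, not from the paper):</p>
        <p>
```python
from collections import defaultdict

def gamma(rows, C, D):
    """|POS_C(D)| / |U|: fraction of objects whose C-values determine D."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in C)].add(tuple(row[a] for a in D))
    consistent = sum(1 for row in rows
                     if len(groups[tuple(row[a] for a in C)]) == 1)
    return consistent / len(rows)

def related_dependency(rows, x, attrs, d):
    """r(x): mean of gamma({x, ai}, {d}) over all conditional attributes."""
    return sum(gamma(rows, [x, ai], [d]) for ai in attrs) / len(attrs)

# Hypothetical table: a0 alone determines d for half the objects only.
rows = [{"a0": 0, "a1": 0, "d": 0}, {"a0": 0, "a1": 1, "d": 1},
        {"a0": 1, "a1": 0, "d": 1}, {"a0": 1, "a1": 0, "d": 1}]
```
        </p>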
        <p>The above function of related dependency is used for the comparison of attributes. This
function is calculated for each attribute, and the results are then sorted by the
value of function r. The most dependent attributes are put in set B1, which corresponds
to the final decision table H.
Example 1. For Table 1, the first step of the algorithm is to build a matrix of attribute
dependency between each pair of conditional attributes and the decision attribute. Then the mean
of the partial results is calculated, which is represented by the related dependency r(x) in
Table 3. These results can be sorted by value and divided into two equinumerous sets.
If the number of attributes is odd, then |B1| = |B2| + 1. An example of sorting
and assigning attributes is presented in Table 4. The calculation of the related
dependency r(x) allows formulating an accurate method for determining sets B1 and B2, i.e.
B1 = {a1, a3, a5} and B2 = {a0, a2, a4}. Therefore the decomposition is as follows:
F(A) = H({a1, a3, a5} , G({a0, a2, a4})).</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Classification Schema</title>
      <p>
        Hierarchical Decision Making.
Due to the decomposition of decision tables, a hierarchical decision
system is needed to evaluate this method for the purpose of classification [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This method is
based on disassembling the decision table into subtables. The most important
advantage is the possibility of inducing a simpler classification model, for example shorter
decision rules or a smaller decision tree, for the resulting components that finally make the
same decision as the original decision table. Following the process of
decomposition, we propose to make decisions hierarchically. For the attributes in B2, a
prediction model calculating the intermediate decision was built. This intermediate
decision was then used together with the attributes from B1 to build the final
classification model. On the basis of both, i.e., these attributes and the intermediate decision
δ, the final decision was taken (Fig. 2).
For decision prediction, the approach based on an extension of dynamic programming
was used. These methods were developed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. They allow sequential optimization
of decision trees and rules relative to different cost functions, in particular between the
number of misclassifications and the depth of decision trees or the length of decision
rules. The proposed algorithm constructs a directed acyclic graph (DAG) that represents
the structure of subtables of the initial table. For a decision table A, separable subtables of A,
described by systems of equalities of the kind ai = b, where ai is an attribute and b is one of
its values, are considered as subproblems. Classification and optimization of decision
trees and rules are discussed in detail in [
        <xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>
        ].
      </p>
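      <p>The hierarchical scheme, where component G predicts the intermediate decision δ from B2 and component H combines B1 with δ, can be sketched with simple majority-vote lookup tables standing in for the dynamic-programming trees and rules (a hypothetical simplification; in particular, training G directly against the decision d sidesteps the choice of the partition ΠG):</p>
      <p>
```python
from collections import Counter, defaultdict

def fit_table(rows, in_cols, out_col):
    """Majority decision for every combination of the input columns."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[tuple(row[c] for c in in_cols)][row[out_col]] += 1
    return {key: cnt.most_common(1)[0][0] for key, cnt in votes.items()}

def fit_hierarchical(rows, B1, B2, d):
    # Stage 1: G maps the B2 attributes to the intermediate decision
    # delta (trained directly against d here, a deliberate shortcut).
    G = fit_table(rows, B2, d)
    staged = [dict(row, delta=G[tuple(row[c] for c in B2)]) for row in rows]
    # Stage 2: H maps B1 plus delta to the final decision.
    H = fit_table(staged, list(B1) + ["delta"], d)
    return G, H

def predict(G, H, B1, B2, row):
    delta = G.get(tuple(row[c] for c in B2))
    return H.get(tuple([row[c] for c in B1] + [delta]))
```
        </p>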
      <p>
        In the applied approach to the optimization of decision trees, a directed acyclic graph
(DAG) represents a set of CART-like decision trees [
        <xref ref-type="bibr" rid="ref11">11</xref>
          ]. A set of Pareto optimal points
for the bi-criteria optimization problem is constructed. Two types of decision tree pruning
have been compared. The first is multi-pruning: using a validation subtable
(part of the training subtable), a decision tree with the minimum number of
misclassifications is found for each Pareto optimal point. The second, an improvement of multi-pruning, is
to use only the best split over a small number of attributes in each node of the DAG,
instead of the best split over all attributes. This pruning is called restricted
multi-pruning.
      </p>
      <p>
        A system of decision rules as a prediction model was also considered. As in the case
of decision trees, we used a dynamic programming algorithm to create and optimize
decision rules [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>
        To evaluate the proposed decomposition algorithm and the hierarchical decision making
idea, the Dagger software system created at King Abdullah University of Science and
Technology was used. The proposed algorithm has been tested on categorical datasets from
the UCI ML Repository [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The data sets were preprocessed. Duplicate rows were removed.
There were some inconsistencies, i.e., instances with the same values of
conditional attributes but different decisions. Each such set was replaced
with a single row carrying the most common decision. Results were obtained using
twofold cross-validation repeated 100 times, each time with a different randomly
selected testing subset. From the training part, 70% of the rows were used to generate decision
trees and the remaining part was reserved for validation.
      </p>
      <p>data set / rows / attributes / compression (SD/S):
flags 196 27 0.801
house 281 17 0.395
kr-vs-kp 3198 37 0.209
breast cancer 268 10 0.754
cars 1730 7 0.261
spect-test 169 23 0.751
dermatology 366 35 0.352</p>
      <p>The advantage of decomposition is due to the fact that two components (i.e. tables
G and H) require less memory than the original decision table. Let us express the size
of a table as S = n · Σi bi, where n is the number of objects and bi = ⌈log2 |Vai|⌉
is the number of bits required to represent attribute ai. Then, after decomposition,
we may compare the size of the two components with that of the original table (prior to
decomposition). Results of compression are presented in Table 5.</p>
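      <p>The size measure and the compression ratio SD/S can be illustrated as follows (all table dimensions below are hypothetical, not taken from Table 5):</p>
      <p>
```python
import math

def table_size(n_rows, value_set_sizes):
    """S = n * sum_i b_i, with b_i = ceil(log2 |V_ai|) bits per attribute."""
    return n_rows * sum(math.ceil(math.log2(v)) for v in value_set_sizes)

# Hypothetical original table: 100 objects, six 4-valued conditional
# attributes plus a binary decision.
S = table_size(100, [4] * 6 + [2])

# After decomposition: G covers three attributes plus the binary
# intermediate decision delta; H covers the other three plus delta
# and the decision. Assume G needs only 16 distinct rows.
S_G = table_size(16, [4] * 3 + [2])
S_H = table_size(100, [4] * 3 + [2, 2])
ratio = (S_G + S_H) / S   # SD/S below 1 means the split saves memory
```
        </p>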
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>
        For most measurements, accuracy remains at a comparable level or is slightly
better, with the biggest improvement occurring when dynamic programming rules were used.
Effective data aggregation algorithms have been sought for a long time due to the
increasing complexity of databases used in practice. Recently, it has been suggested
that decomposition algorithms, previously used mainly in the logic synthesis
of digital systems, may be applied for that purpose [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This approach is indeed very
relevant as decision systems and logic circuits are very similar. Bearing this in mind,
this paper demonstrates that a typical algorithm for the decomposition of binary data
tables (representing Boolean functions) may be applied to the decomposition of data
represented by multi-valued attributes used in decision systems.
      </p>
      <p>The paper indicates the advantages and possibilities of decomposition algorithms
for the purpose of classification. Results of experiments performed with the proposed
decomposition algorithm and the Dagger system have been presented. A new attribute
selection criterion describing partitions for decomposition has been introduced and used in
the experiments. The proposed method is particularly efficient in data compression. It
allows building a simple classification model and saving memory while keeping the
accuracy. To achieve better accuracy, data set decomposition requires further
research, particularly on attribute selection criteria. Also, there is a need to extend
the decomposition to deal with continuous attributes and noise in data.</p>
      <p>Acknowledgments. The authors would like to thank professor Mikhail Moshkov and
his team for their support while writing this paper. This research has been supported by
King Abdullah University of Science and Technology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Perkowski</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luba</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grygiel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burkey</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burns</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iliev</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolsteren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lisanke</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malvi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , et al.:
          <article-title>Unified approach to functional decompositions of switching functions</article-title>
          .
          <source>PSU Electr. Engn. Dept. Report</source>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Łuba</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lasocki</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rybnik</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>An implementation of decomposition algorithm and its application in information systems analysis and logic synthesis</article-title>
          . In Ziarko, W., ed.: Rough Sets, Fuzzy Sets and Knowledge Discovery
          . Workshops in Computing. Springer London (
          <year>1994</year>
          )
          <fpage>458</fpage>
          -
          <lpage>465</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Luba</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lasocki</surname>
          </string-name>
          , R.:
          <article-title>On unknown attribute values in functional dependencies</article-title>
          .
          <source>In: Proceedings of the International Workshop on Rough Sets and Soft Computing</source>
          , San Jose, CA. (
          <year>1994</year>
          )
          <fpage>490</fpage>
          -
          <lpage>497</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Rokach</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maimon</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Data mining using decomposition methods</article-title>
          . In Maimon,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Rokach</surname>
          </string-name>
          , L., eds.
          <source>: Data Mining and Knowledge Discovery Handbook</source>
          . Springer US (
          <year>2010</year>
          )
          <fpage>981</fpage>
          -
          <lpage>998</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Alkhalid</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chikalov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moshkov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zielosko</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Optimization and analysis of decision trees and rules: dynamic programming approach</article-title>
          .
          <source>International Journal of General Systems</source>
          <volume>42</volume>
          (
          <issue>6</issue>
          ) (
          <year>2013</year>
          )
          <fpage>614</fpage>
          -
          <lpage>634</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Borowik</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Data Mining Approach for Decision and Classification Systems Using Logic Synthesis Algorithms</article-title>
          . Volume 6. Springer International Publishing (
          <year>2014</year>
          )
          <fpage>3</fpage>
          -
          <lpage>23</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Zupan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prof</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Bratko,
          <string-name>
            <surname>D.I.:</surname>
          </string-name>
          <article-title>Machine learning based on function decomposition</article-title>
          .
          <source>Technical report</source>
          , University of Ljubljana (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Zupan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bohanec</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Experimental evaluation of three partition selection criteria for decision table decomposition</article-title>
          .
          <source>Informatica (Slovenia)</source>
          <volume>22</volume>
          (
          <issue>2</issue>
          ) (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pawlak</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Rough sets</article-title>
          .
          <source>International Journal of Computer &amp; Information Sciences</source>
          <volume>11</volume>
          (
          <issue>5</issue>
          ) (
          <year>1982</year>
          )
          <fpage>341</fpage>
          -
          <lpage>356</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Amin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chikalov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moshkov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zielosko</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Dynamic programming approach to optimization of approximate decision rules</article-title>
          .
          <source>Information Sciences</source>
          <volume>221</volume>
          (
          <issue>0</issue>
          ) (
          <year>2013</year>
          )
          <fpage>403</fpage>
          -
          <lpage>418</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olshen</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          :
          <article-title>Classification and Regression Trees</article-title>
          . CRC press (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lichman</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>UCI machine learning repository</article-title>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Maimon</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rokach</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Decomposition methodology for knowledge discovery and data mining</article-title>
          . In Maimon,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Rokach</surname>
          </string-name>
          , L., eds.
          <source>: Data Mining and Knowledge Discovery Handbook</source>
          . Springer US (
          <year>2005</year>
          )
          <fpage>981</fpage>
          -
          <lpage>1003</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>