<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The research of fuzzy decision trees building based on entropy and the theory of fuzzy sets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>S B Begenova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>T V Avdeenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Novosibirsk State Technical University</institution>
          ,
          <addr-line>Karla Marks ave 20, Novosibirsk, Russia, 630073</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>296</fpage>
      <lpage>303</lpage>
      <abstract>
        <p>Decision trees are widely used in the field of machine learning and artificial intelligence. Such popularity is due to the fact that with the help of decision trees graphic models, text rules can be built and they are easily understood by the final user. Because of the inaccuracy of observations, uncertainties, the data, collected in the environment, often take an unclear form. Therefore, fuzzy decision trees are becoming popular in the field of machine learning. This article presents a method that includes the features of the two above-mentioned approaches: a graphical representation of the rules system in the form of a tree and a fuzzy representation of the data. The approach uses such advantages as high comprehensibility of decision trees and the ability to cope with inaccurate and uncertain information in fuzzy representation. The received learning method is suitable for classifying problems with both numerical and symbolic features. In the article, solution illustrations and numerical results are given.Also the comparison of fuzzy logic approaches for building fuzzy rules and classification trees are given.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Nowadays, in the era of big data, the extraction of knowledge is a bottleneck in the field of knowledge
engineering. Computer programs that extract knowledge from data successfully try to solve this
problem.Among these programs, systems for building decision trees for decision-making and
classification tasks are very popular. The knowledge acquired in the form of decision trees and
inference procedures is highly valued for the clarity and visibility of the data. Such assessment, at one
time, aroused interest of scientists, which led to a number of methodological and empirical
achievements. However, initially decision trees were popularized by Quinlan and his ID3 algorithm
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>One of the extensions of the classical construction of decision trees is an approach based on fuzzy
logic. Fuzzy approach is becoming increasingly popular in solving problems of uncertainty, noise and
inaccurate data. It is successfully applied to problems in many industrial spheres. Most studies on the
application of this representative framework to existing methodologies are focused mainly on new
areas, such as neural networks and genetic algorithms. Nowadays, the fuzzy approach that integrates
the concepts of fuzzy sets and entropy is becoming popular.</p>
      <p>This article presents a method that includes the features of the two above-mentioned approaches: a
graphical representation of the rules system in the form of a tree and the fuzzy representation of the
data. Section 2 describes the principle of the decision trees, their advantages and disadvantages,
algorithms for their construction. Section 3 shows the principle of constructing fuzzy decision trees,
introduces the concepts of fuzzy logic. Section 4 describes the results of the study and the last section
gives a conclusion.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Decisiontrees</title>
      <p>
        A decision tree (DT) is a common formalization for mapping the transitions of attribute values to
classes in the form of a map, which consists of attribute nodes or so-called tests that can have two or
more subtrees, leaves, or decision nodes that are labeled with a class indicating the solution. The main
advantage of this approach is the visualization of the solution. One of the most commonly used
algorithms for constructing decision trees is the ID3 method, formalized by Quinlan in 1986 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Decision trees create efficient models for machine learning [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. Let us give the following
characteristics of decision trees:
 they are easily interpretable and visible;
 the model can be expressed both graphically and with text rules;
 they are competitive in comparison with more expensive approaches;
 decisiontrees are scalable;
 they can process discrete and continuous data;
 decision trees can be applied to different sizes of data sets, including large sample sets.
      </p>
      <p>In the process of tree constructing, the pattern is represented by a set of features that are expressed
in some descriptive language. Samples whose characteristics are known are called examples. The
purpose of constructing a tree is to solve the problem of classification or regression.</p>
      <p>ID3 and CART are the two most important discriminating learning algorithms that work by
recursive partitioning. Their basic ideas are approximately the same: splitting the incoming sample
into subsets and representing the partitions as a tree. An important property of these algorithms is that
they simultaneously try to minimize the size of the tree with the optimization of some quality measure.
Subsequently, they use the same logical inference.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Fuzzy decisiontrees</title>
      <p>
        To construct a fuzzy decision tree, the following procedure is proposed [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]:
1. Define the fuzzy data base, i.e., the fuzzy granulation for the domains of the continuous
features.
2. Replace the continuous attributes of the training set using the linguistic labels of the fuzzy sets
with highest compatibility with the input values [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ].
3. Calculate the entropy and information gain of each feature to split the training set and define the
test nodes of the tree until all features are used or all training examples are classified.
Figure 1 shows an example of the fuzzification of continuous data.
      </p>
      <p>The first block of Figure 1 illustrates a dataset with n examples, three attributes (At1, At2,At3) and a
class attribute. The fuzzified version of this dataset is presented in the second block. This fuzzified set
of examples is used to induce the final DT, illustrated in the last block of Figure 1.</p>
      <p>
        The entropy and information gain formulas remain the same for the classical version of the ID3
algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Let us introduce the following notation:
      </p>
      <p>– set of data samples; – set of attributes; – a
singleton set with a solution attribute or class attribute. Let this attribute have m different values, thensi
is the number of samples of set U in class di.</p>
      <p>Information gain I relative to subset Sj is equal to:
,
,
;
where – the number of samples in a subset of S.</p>
      <p>Entropy E(ci) is:
accordingly, the criterion for selecting an attribute is the increase in information:
.</p>
      <p>The difference between common algorithm ID3 and the fuzzy version of algorithm ID3 is that the
attributes of objects have degrees of belonging to a particular node, and it is quite possible that an
attribute with certain probabilities belongs to several nodes.</p>
      <p>Figure 2 shows two decision trees that were built using the above-mentioned algorithms.</p>
      <p>
        As an example, a classic data set was taken, Fisher's iris [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which has 4 attributes: the length and
width of the cup, the length and width of the petal and the three resultant classes - setosa, versicolor
and virginica.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Theresearch results</title>
      <p>As a research object, just as in the previous example, a Fisher's iris data set was used.To construct a
fuzzy decision tree at the first stage, it is necessary to perform a fuzzification procedure.</p>
      <p>While performing the fuzzification procedure, the definition set of fuzzy attributes is divided into
fuzzy subsets. The value of the fuzzy attribute is put in correspondence with the term, and this
correspondence is found using the membership function. The division of the definition set into fuzzy
subsets can be made evenly, that is, the definition set is divided into equal intervals. However, in most
real data sets obtained from the environment, it is preferable to perform the partitioning taking into
account the features of the original sample. For example, it may happen that most of the sampling
objects lie in the first third of the definition set and, in this case, uniform partitioning will not give the
desired effect.The results of fuzzification for attributes SepalLength, SepalWidth, PetalLength and
PetalWidthare shown in Figures 3, 4, 5 and 6 respectively.</p>
      <p>Numeric attributes have been assigned 3 terms (low, medium, high). Let us present the results of
the classification obtained with the help of fuzzy decision trees. In the context of this studies, let us
introduce the following notations:</p>
      <p>Correct - the number of correctly classed sample objects.</p>
      <p>Incorrect - the number of incorrectly classified sample objects.</p>
      <p>WithoutClass - the number of objects without a class.</p>
      <p>PercentCorrect is the percentage of correctly-categorized sample objects that is calculated as
follows:</p>
      <p>To study the hypothesis that with a decrease of the sample size, the accuracy of the classification of
fuzzy decision trees is better than classifying with classical ones, the dependence of the classification
accuracy on the number of instances in the data set was constructed.</p>
      <p>In this study, the trees were constructed for 3 randomly selected N samples and the table shows the
averaged values obtained (the sum of the values of the attributes / 3).</p>
      <p>According to the data presented in Table 1, it can be seen that when the sample is reduced from 150
to 90, the accuracy of classification using fuzzy decision trees is three percent higher than results of
classifying with classical decision trees, and when the sample is reduced to 60, the accuracy is higher
by 0.82 percent.</p>
      <p>The number of instances in the
data set (N)
Table 2 shows the dependence of the accuracy of data classification on the number of terms.</p>
      <p>According to the data in the table, it is clear that the optimal number of terms for the test set of data
is 5. Such quantity gave a higher percentage of correctly classified data compared to 3 terms.</p>
      <p>
        As a part of fuzzy decision trees research, we compare two methods of classification based on
fuzzy logic. The first one is algorithm of direct generating fuzzy linguistic rules, proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The
second method of fuzzy decision trees which is proposed in this article was used. Fig.7 gives visual
illustration of comparison between the two methods for sequentially growing number of terms.
      </p>
      <p>Here we can observe that the classification accuracy for sequentially growing number of terms
(from 3 cases to 7) remains quite high. Method of fuzzy decision trees is better for medium and high
sizes of terms while the method of direct generating fuzzy rules is better for small size of training
sample. Further research will be in development of algorithm based on combination of both
approaches.</p>
      <p>On fig. 8 we can observe that the classification accuracy for sequentially reducing size of training
sample (from 105 cases to 45) remains quite high. Method of fuzzy decision trees is better for medium
size of training sample while the method of direct generating fuzzy rules is better for small size of
training sample. Further research will be in development of algorithm based on combination of both
approaches.</p>
      <p>Fig. 9 and Fig. 10 illustrate the comparison between the two methods with sequentially reducing
size of training sample using T-class and S- class membership functions respectively.</p>
      <p>T – class membership function also known as triangular is specified by three parameters {a, b, c}
as follows:
 ( ,  ,  ,  ) =
{
 − 
 − 
 − 
 − 
0,  ≤ 
,  ≤  ≤ 
,  ≤  ≤ 
0,  ≤ 
1 +  − ( − )</p>
      <p>0,  ≥</p>
      <p>For T-class membership functions, we observe that fuzzy decision trees method is better for large
size of training sample while the method of direct generating fuzzy rules is better for small and
medium size of training sample. In case of S-class membership functions, we observe the same
situation.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Decision trees are successfully used to solve regression and classification problems. They are popular
in the field of machine learning, because decision trees build graphic models, along with text rules that
are easily interpreted by the users. On the other hand, fuzzy systems can solve classification problems
with input inaccurate and noisy data.</p>
      <p>
        The combination of fuzzy trees and fuzzy logic makes it possible to construct intuitive graphic
models for qualitative and quantitative data [
        <xref ref-type="bibr" rid="ref2 ref7 ref8">2, 7, 8</xref>
        ]. Usage of this type of decision tree gives us
several solutions with different probabilities of belonging to a particular class.
      </p>
      <p>In addition, in the course of the conducted studies, the advantage of classification using fuzzy
decision trees with respect to classical ones was revealed, by comparing the percentage of correctly
classed objects. Also, a direct correlation between the accuracy of the classification and the value of
the information gain was revealed (the increment is a criterion for stopping the further construction of
the tree).</p>
      <p>
        The comparison between algorithm of direct generating fuzzy linguistic rules and method of fuzzy
decision trees didn’t reveal the one and only right one. Both methods show high classification
accuracy under certain conditions. The proposed approach can be applied to built fuzzy neural
networks [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Acknowledgments
The work is supported by a grant from the Ministry of Education and Science of the Russian
Federation within the framework of the project part of the state task, project No. 2.2327.2017 / 4.6
“Integration of knowledge representation models based on intellectual analysis of large data to support
decision making in the field of software engineering.”
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Quinlan</surname>
            <given-names>J R</given-names>
          </string-name>
          <year>1986</year>
          <article-title>Induction of decision trees</article-title>
          <source>Machine learning</source>
          <volume>1</volume>
          <fpage>81</fpage>
          -
          <lpage>106</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Cintra</surname>
            <given-names>M E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meira</surname>
            <given-names>C A A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monard</surname>
            <given-names>M C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Camargo H A and Rodrigues L H 2011</surname>
          </string-name>
          <article-title>The use of fuzzy decision trees for coee rust warning in Brazilian Int</article-title>
          .
          <source>Conf. Int. Sys. Design &amp; Applications</source>
          <volume>1</volume>
          <fpage>1347</fpage>
          -
          <lpage>1352</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Avdeenko</surname>
            <given-names>T V</given-names>
          </string-name>
          and
          <string-name>
            <surname>Makarova E S 2017</surname>
          </string-name>
          <article-title>Acquisition of knowledge in the form of fuzzy rules for cases classification</article-title>
          <source>Lecture Notes in Computer Science</source>
          <volume>10387</volume>
          <fpage>536</fpage>
          -
          <lpage>544</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Cintra</surname>
            <given-names>M E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monard</surname>
            <given-names>M C</given-names>
          </string-name>
          and
          <string-name>
            <surname>Camargo H A 2012 Fuzzy DT</surname>
          </string-name>
          <article-title>- a fuzzy decision tree algorithm based on C4.5 CBSF -</article-title>
          Brazilian
          <source>Congress on Fuzzy Systems 199-211</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Janikow C Z 1998 Fuzzy Decision</surname>
          </string-name>
          <article-title>Trees: Issues and</article-title>
          <source>Methods IEEE Transactions of Man, Systems, Cybernetics</source>
          <volume>28</volume>
          (
          <issue>1</issue>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Faifer</surname>
            <given-names>M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Janikow C Z 2000</surname>
          </string-name>
          <article-title>Bottom-up Partitioning in Fuzzy Decision Trees</article-title>
          <source>Proceedings of the 19th International Conference of the North American Fuzzy Information Society 326-330</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Tokumaru</surname>
            <given-names>M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Muranaka N 2010</surname>
          </string-name>
          <article-title>Impression analysis using fuzzy c4.5 decision tree Int. con. on Kansei engineering</article-title>
          and emotion research
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Janikow</surname>
            <given-names>C Z</given-names>
          </string-name>
          <year>2004</year>
          <article-title>Fid 4.1: an overview Proc</article-title>
          .
          <source>of the North American Fuzzy Information Processing Society 877-881</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Machine</given-names>
            <surname>Learning Repository</surname>
          </string-name>
          (Access mode: http://archive.ics.uci.edu/ml/datasets/Iris) (
          <volume>30</volume>
          .
          <fpage>05</fpage>
          .
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Begenova S B and Avdeenko</surname>
            <given-names>T V</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Building of fuzzy decision trees using ID3 algorithm</article-title>
          <source>Journal of Physics: Conference Series 1015</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Cintra</surname>
            <given-names>M E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meira</surname>
            <given-names>C A A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monard</surname>
            <given-names>M C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Camargo H A and Rodrigues L H 2011</surname>
          </string-name>
          <article-title>The use of fuzzy decision trees for coffee rust warning in Brazilian crop Int</article-title>
          .
          <source>Conf. Int. Sys. Design &amp; Applications</source>
          <volume>1</volume>
          <fpage>1347</fpage>
          -
          <lpage>1352</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Olaru</surname>
            <given-names>C</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wehenkel L 2003 Fuzzy</surname>
            <given-names>Sets</given-names>
          </string-name>
          <source>and Systems</source>
          <volume>138</volume>
          <fpage>221</fpage>
          -
          <lpage>254</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Soldatova</surname>
            <given-names>O P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lezin</surname>
            <given-names>I A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lezina</surname>
            <given-names>I V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>A V</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kirsh D V 2015</surname>
          </string-name>
          <article-title>Application of fuzzy neural networks to determine the type of crystal lattices observed on nanoscale images</article-title>
          <source>Computer Optics</source>
          <volume>39</volume>
          (
          <issue>5</issue>
          )
          <fpage>787</fpage>
          -
          <lpage>795</lpage>
          DOI: 10.18287/
          <fpage>0134</fpage>
          -2452-2015-39-5-
          <fpage>787</fpage>
          -794
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>