The problem of convergence of classifiers construction
procedure in the schemes of logical and algorithmic
classification trees
Igor Povkhana, Oksana Mulesaa, Olena Melnyka, Yuriy Bilaka, Volodymyr Polishchuka
a
    Uzhhorod National University, Zankoveckoy str., 89B, Uzhhorod, 88000, Ukraine


Abstract
The paper considers the problem of convergence of the classifier synthesis procedure in the schemes of logical and algorithmic classification trees. It suggests an upper bound on the complexity of the algorithm tree scheme in the problem of approximating a real data array by a set of generalized features, with a fixed stopping criterion for the branching procedure at the stage of classification tree construction. This approach provides the required accuracy of the model, allows its complexity to be evaluated, reduces the number of branches, and achieves the required performance characteristics. The paper presents convergence estimates for the procedure of constructing recognition schemes in logical and algorithmic classification trees under conditions of weak and strong separation of the classes of the primary initial sample.

Keywords
logical tree, algorithmic tree, classifier, pattern recognition, feature, initial sample.

1. Introduction

   Problems united by pattern recognition are diverse and widespread in both the economic and social spheres of human activity, which leads to the necessity of building and investigating mathematical models of the relevant systems [1-5]. A universal approach to their solution is still missing, although many general theories and approaches help solve different types (classes) of problems. However, their practical application is highly sensitive to the specific parameters of the problem itself or to the subject area [6]. Various theoretical results hold only for special cases and subproblems, and the bottleneck of successful real recognition systems is that they require a considerable amount of computation and rely on powerful hardware. Classification trees (decision trees) remedy a significant part of the above shortcomings, since they work effectively with problems of arbitrary data scale (where the information is given in natural form) [7-10].
   Today there are various approaches to constructing classification systems in the form of logical and algorithmic classification trees (LCT/ACT) [11-13]. Interest in recognition methods based on LCT is caused by several valuable properties. On the one hand, the complexity of the class of recognition functions (RF) in the form of LCT models, under certain conditions, does not exceed the complexity of the class of linear recognition functions (the simplest class known). On the other hand, RF in the form of classification trees make it possible to distinguish in the process of classification both causal links (which can then be taken into account unambiguously) and factors of chance or uncertainty. Additionally, such RF capture both functional and stochastic links between the properties and the behavior of the entire system, and allow classifying the interactions between environmental objects, different animal species, and people (except for objects whose information is transmitted genetically, i.e., is hereditary). Such classification proceeds according to the so-called logical decision tree [14].

CMIS-2022: The Fifth International Workshop on Computer Modeling and Intelligent Systems, Zaporizhzhia, Ukraine, May 12, 2022
EMAIL: igor.povkhan@uzhnu.edu.ua (I. Povkhan); oksana.mulesa@uzhnu.edu.ua (O. Mulesa); olena.melnyk@uzhnu.edu.ua (O. Melnyk); yuriy.bilak@uzhnu.edu.ua (Y. Bilak); volodymyr.polishchuk@uzhnu.edu.ua (V. Polishchuk)
ORCID: 0000-0002-1681-3466 (I. Povkhan); 0000-0002-6117-5846 (O. Mulesa); 0000-0001-7340-8451 (O. Melnyk); 0000-0001-5989-1643 (Y. Bilak); 0000-0003-4586-1333 (V. Polishchuk)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
    In most forecasting and classification problems that use unstructured data (such as sets of discrete patterns or text arrays), an artificial neural network (of a suitable type) outperforms all other types of algorithms, including decision tree frameworks. Otherwise (in the case of structured, high-volume discrete data arrays), the methods and algorithms of the decision tree concept are vastly preferred [15-17]. In practice, LCT construction methods and algorithms usually yield structurally complex logical trees (in terms of the number of vertices and branches, belonging to the class of irregular trees), which are unevenly filled with data and have different numbers of branches. Such complex tree-like structures are difficult to perceive in external analysis due to a massive number of nodes (vertices) and a massive number of stepwise partitions of the primary initial sample (IS) containing a minimal number of objects (possibly even single objects in the worst case). This paper therefore reviews a fundamental question for classification tree methods (classification models): the convergence of the procedure for constructing a classification tree, i.e., the LCT/ACT structures.


2. Formal problem statement
   Let the IS be given in the following form:
$$(x_1, f_R(x_1)), \ldots, (x_m, f_R(x_m)) \qquad (1)$$
    Consider $x_i \in G$, $(i = 1, \ldots, m)$, where $m$ is the number of objects in the IS and $f_R(x_i)$ is a finite-valued function that defines the partition $R$ of the set $G$ into classes (patterns) $H_0, \ldots, H_{k-1}$. The relation $f_R(x_i) = l$, $(l = 0, \ldots, k-1)$, means $x_i \in H_l$; here $x_i = (x_i^1, \ldots, x_i^n)$, where $x_i^j$ is the value of the $j$-th feature of the object $x_i$, $(j = 1, \ldots, n)$, and $n$ is the number of features in the IS.
    As a result, the IS is a collection (or sequence) of tuples, where each tuple consists of the feature values of an object together with the corresponding function value [18]. The set of feature values is a particular image, and the value of the function relates it to a particular pattern. The problem is to build an LCT/ACT structure $L$ based on the array of the primary IS of the form (1) and to determine the values of its structural parameters $p$ (that is, $F(L(p, x_i), f_R(x_i)) \to opt$).
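
For illustration only, the following minimal Python sketch shows one possible in-memory representation of an IS of the form (1); the toy objects and the rule inside f_R are hypothetical and are not taken from the problems considered later in the paper.

```python
# Hypothetical in-memory form of an initial sample (IS) of type (1):
# a sequence of pairs (x_i, f_R(x_i)), where each object x_i is a tuple
# of n feature values and f_R assigns the class (pattern) H_l it belongs to.
from typing import List, Tuple

Obj = Tuple[int, ...]                      # x_i = (x_i^1, ..., x_i^n)

def f_R(x: Obj) -> int:
    """Toy finite-valued partition function with k = 2 classes."""
    return 0 if sum(x) <= 2 else 1

objects: List[Obj] = [(0, 1, 0), (1, 1, 1), (0, 0, 1), (1, 1, 0)]
IS: List[Tuple[Obj, int]] = [(x, f_R(x)) for x in objects]

m, n = len(IS), len(objects[0])            # m objects, n features
print(m, n, IS[0])                         # 4 3 ((0, 1, 0), 0)
```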



3. Literature review

    All basic approaches in recognition theory have their advantages and disadvantages and together form a single toolkit for solving practical problems of artificial intelligence theory; an especially well-elaborated one is the classical algebraic approach developed by Y.I. Zhuravlyov [19]. This direction of recognition theory is connected with the construction of models of classification algorithms and the selection of a recognition algorithm of optimized quality. The present research focuses on the concept of decision trees (LCT structures). In particular, a classification scheme given by an arbitrary approach, method, or algorithm of classification tree construction has a tree-like logical structure [1,3,20]. Moreover, the structure of the logical tree consists of vertices (features), grouped into tiers and built (selected) at a certain step (stage) of the classification tree model construction [21]. The main peculiarity of tree-like recognition systems is that the importance of individual features (groups or sets of features) is determined relative to the function that defines the division of objects into classes [22].
    In the research [21], a generation scheme for the classification tree structure is suggested based on step-by-step selection of elementary attributes; its disadvantage is the strong dependence of the model complexity on the effectiveness of the final minimization (tree pruning) procedure. In researches [23-25], a modular scheme of classifier construction in the form of classification tree structures is offered, which circumvents the limitations of traditional decision tree methods. The research [24] proposes an efficient scheme for generating generalized features based on constructing sets of geometric objects; however, its disadvantages are the restriction on the initial structure of the training sample and the lack of versatility in practical application. In addition, the challenges of estimating the structural complexity of LCT models at the minimization stage are evaluated in research [20].
    The research [22] shows that the resulting classification rule has a tree-like logical structure built by an arbitrary method or by the branched feature selection algorithm. In this case, the question of selecting a qualitative branching criterion comes first. The logical tree consists of vertices grouped into tiers and obtained at a particular step of recognition tree construction [25]. The challenge is then the effective structural minimization of the constructed classification tree model. A significant problem that arises from article [23] is the synthesis of recognition trees represented by an actual tree of algorithms. An essential area of research on LCT structures remains the generation of decision trees in the case of uninformative features [14], as well as the central question of classification tree theory: the possibility of constructing all logical tree variants that correspond to the primary IS and selecting a classification tree of minimum depth or structural complexity (number of tiers) [26-30].


4. Convergence of synthesis of logical and algorithmic classification tree
   structure
   Consider that at every step of building a logical tree (an LCT model), only one elementary feature is selected from the fixed set of features $(\alpha_1, \ldots, \alpha_n)$. Then at the $n$-th step of the classification tree construction procedure, the LCT scheme represents a predicate $p_n$ (a generalized feature built from the set of elementary features) [23, 30], which is the most effective approximation of the initial IS of the general form (1) (the same applies to the ACT structure).
    In particular, $p_n$ represents a tree-like scheme (classification tree) consisting of $n$ vertices; the structure of the predicate $p_n$ includes exactly $n$ elementary features (attributes of the discrete objects of the IS) from the initial set.
   The sequence of predicates $p_1, \ldots, p_j$ (generalized features) converges to the primary IS of the form (1) if, starting from some $Q$, the following condition is fulfilled:
$$f_{Q+\Delta}(x_i) = f_R(x_i), \quad (i = 1, \ldots, m), \ (\Delta \geq 0).$$
   Let $\alpha_n$ denote the elementary feature selected (fixed) at the $n$-th step of the LCT model construction scheme. The feature $\alpha_n$ corresponds to some fixed path $r_1, r_2, \ldots$, which ends at the given attribute (a vertex of the classification tree, the LCT model). For example, in the LCT of Fig. 1, vertex 2 (a feature) corresponds to the path $\{0\}$, and vertex 5 corresponds to the path $\{0, 1\}$.
   Let $T_n$ denote the path that corresponds to the elementary feature $\alpha_n$, and let $D_n$ denote the set of those pairs $(x_i, f_R(x_i))$ of the primary IS of the general form (1) whose objects $x_i$ belong to the path $T_n$. For example, for the LCT structure (Fig. 1), let $n = 4$; then the path $T_n$ is $\{1, 0\}$.
   In this case an object $x_i$ belongs to the path $\{1, 0\}$ if the following conditions are fulfilled: $\alpha_1(x_i) = 1$ and $\alpha_2(x_i) = 0$.
    We say further that the elementary feature $\alpha_n$ weakly separates the set $D_n$ if $D_n$ contains pairs $(x_i, f_R(x_i))$ and $(x_j, f_R(x_j))$ such that $\alpha_n(x_i) = 0$ and $\alpha_n(x_j) = 1$ (that is, $\alpha_n(x_i) \neq \alpha_n(x_j)$).




Figure 1: LCT structure with elementary features as vertices.
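
The weak-separation condition above can be stated compactly in code. The sketch below is a hypothetical Python rendering: alpha stands for an elementary 0/1-valued feature and D for a set $D_n$ of training pairs; the names and the toy data are illustrative assumptions, not taken from the paper.

```python
# Sketch of the weak-separation test: an elementary feature alpha_n
# weakly separates the set D_n if there are pairs (x_i, f_R(x_i)) and
# (x_j, f_R(x_j)) in D_n with alpha_n(x_i) != alpha_n(x_j), i.e. the
# 0/1-valued feature takes both values on the objects of D_n.
from typing import Callable, List, Tuple

Pair = Tuple[tuple, int]   # (x_i, f_R(x_i))

def weakly_separates(alpha: Callable[[tuple], int], D: List[Pair]) -> bool:
    return {alpha(x) for x, _ in D} == {0, 1}

# Hypothetical elementary feature: the first coordinate of the object.
alpha_1 = lambda x: x[0]
D_n = [((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]
print(weakly_separates(alpha_1, D_n))   # True: alpha_1 takes both 0 and 1
```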

  The final power of a classification tree scheme (LCT/ACT model) is the number of all terminal vertices (leaves) of the scheme. For example, for the LCT in Fig. 1 the final power is 6.
  Obviously, the final power of a classification tree scheme also equals the number of all terminal paths in it. By induction on $n$ it is easily proved that the final power of each of the above schemes $p_n$ (predicates) equals $n + 1$. Indeed, the final power of $p_1$, which includes only one feature or algorithm (the LCT/ACT cases), obviously equals 2.
    Let the final power of the scheme $p_n$ equal $n + 1$, and let us calculate the final power of $p_{n+1}$. Clearly this scheme is obtained from the scheme $p_n$ by successively adding a new vertex (feature, algorithm) with the number $n + 1$ at some terminal vertex. When this feature (algorithm) is added to the scheme $p_n$, one terminal vertex disappears and two new terminal vertices appear. Therefore the number of all terminal vertices of the scheme $p_{n+1}$ equals $n + 2$.
  Assume that at each $n$-th step of the classification tree construction procedure (of the LCT model) the set $D_n$ is weakly separated by some feature $\alpha_n$, and consider the scheme $p_n$. As shown above, it has $n + 1$ terminal paths. Because $D_n$ is weakly separated at each step, every such path contains at least one pair of the primary IS of the general form (1). It is also obvious that different terminal paths in $p_n$ have no common pairs from the sample (1). Therefore the scheme (predicate) $p_n$ divides the IS (according to the basic branching criterion introduced by the current classification tree method) into $n + 1$ non-empty, non-intersecting parts (subsets). Since the primary IS contains $m$ training pairs in total, the scheme $p_{m-1}$ (or a predicate with a smaller index) completely separates the primary IS; that is, $p_{m-1}$ fully recognizes the sample.
    Thereby, if at each $n$-th step the selected elementary feature $\alpha_n$ weakly separates the set $D_n$, then the LCT construction process converges to the primary IS and terminates in no more than $m - 1$ steps, where $m$ is the number of all training pairs of the primary IS.
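
A toy simulation can illustrate the $m - 1$ bound. The sketch below assumes, as in the argument above, that every applied feature weakly separates the subset it splits; the helper build_lct, the feature set, and the sample data are hypothetical.

```python
# Toy illustration of the m - 1 bound: as long as some leaf subset D is
# weakly separated by an available 0/1 feature, split it. Each split
# turns one non-empty part into two, so after at most m - 1 steps every
# part is a single training pair and the IS is fully separated.
def build_lct(IS, features):
    parts = [IS]                       # current partition of the sample
    steps = 0
    while True:
        for i, D in enumerate(parts):
            alpha = next((a for a in features
                          if {a(x) for x, _ in D} == {0, 1}), None)
            if alpha is not None:
                parts[i:i + 1] = [[p for p in D if alpha(p[0]) == 0],
                                  [p for p in D if alpha(p[0]) == 1]]
                steps += 1
                break
        else:
            return parts, steps        # no part is weakly separable any more

IS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]
features = [lambda x: x[0], lambda x: x[1]]
parts, steps = build_lct(IS, features)
print(len(parts), steps)               # 4 parts after 3 <= m - 1 = 3 steps
```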
    Note that the condition of weak class separation of the primary IS is rather weak, and therefore it guarantees only slow convergence of the classification tree construction procedure. Thus, it is essential to consider the convergence of the process under stronger conditions. We will assume that we are dealing with a case where the IS contains information about two classes (patterns) $H_0$ and $H_1$, and that the IS itself is deterministic. Let $n_j$ be the number of training pairs $(x_i, f_R(x_i))$ of the primary IS that satisfy $f_R(x_i) = j$, $(j = 0, 1)$; for simplicity and definiteness we assume $n_0 \geq n_1$.
    Setting $f_R(x) \equiv 0$, we obtain a generalized feature (scheme) $f_0$ that approximates (in whole or in part) the primary IS. Obviously, in this case (that is, in the situation where no elementary feature $\alpha_n$ has yet been chosen), the generalized feature (scheme) $f_0$ is the best approximation of the primary IS. We call the value $n_1$ the unconditional number of errors in the primary IS.
   Let some elementary feature $\alpha_1$ be selected (arbitrarily) at the first step of building the classification tree; this feature breaks the initial sample into two parts (subsets) $H_0$ and $H_1$, where $H_j$ is the set of all pairs $(x_i, f_R(x_i))$ of the primary IS for which $\alpha_1(x_i) = j$, $(j = 0, 1)$.
    Let $n_{mj}$ be the number of all pairs $(x_i, f_R(x_i))$ from the sample $H_j$, $(j = 0, 1)$, for which $f_R(x_i) = m$, $(m = 0, 1)$. The feature $\alpha_1$ can be considered a generalized feature $f_1$ (scheme) built at the first step of the LCT construction process.
    Let us introduce the value $\lambda = \max(n_{00}, n_{10}) + \max(n_{01}, n_{11})$, which is the number of correct answers (classifications) realized by the generalized feature $f_1$; accordingly, the value $n_0$ is the number of correct answers (classifications) realized by the generalized feature $f_0$.
    By the number of correct answers we mean the number of those training pairs $(x_i, f_R(x_i))$ of the initial training sample of the form (1) for which the equality $f_R(x_i) = f_1(x_i)$ is fulfilled.
    Since $n_{00} + n_{01} = n_0$ and $n_{10} + n_{11} = n_1$, we have the following:
$$\lambda = \max(n_{00}, n_{10}) + \max(n_{01}, n_{11}) \geq n_0. \qquad (2)$$
   Therefore, when the feature $\alpha_1$ is chosen, the number of correct answers at least does not decrease. The number of errors made by the generalized algorithm $f_1$ equals:
$$m - \lambda \leq (n_0 + n_1) - n_0 = n_1. \qquad (3)$$
                                                                              n1
   Let us note that (3) follows from (2). Let us enter the value 1             and call it the quality of
                                                                             m
elementary feature 1 relating to primary IS, similarly we determine  n of feature n relating to
primary IS (n  1,2,3,...) .
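
The quantities $n_{mj}$, $\lambda$, and the quality $\mu_1 = n_1/(m - \lambda)$ can be computed directly from a two-class IS. A minimal sketch, under the reconstruction of $\mu_1$ given above and with hypothetical data:

```python
# Quality of an elementary feature alpha_1 w.r.t. a two-class IS:
# n[m][j] = number of pairs of class m that alpha_1 sends to part j,
# lam     = max(n_00, n_10) + max(n_01, n_11) correct answers,
# errors  = m - lam, quality mu_1 = n_1 / (m - lam).
def feature_quality(alpha, IS):
    m = len(IS)
    n = [[0, 0], [0, 0]]               # n[class][part]
    for x, cls in IS:
        n[cls][alpha(x)] += 1
    lam = max(n[0][0], n[1][0]) + max(n[0][1], n[1][1])
    n1 = min(sum(n[0]), sum(n[1]))     # unconditional number of errors
    errors = m - lam                   # errors of the generalized feature f_1
    return n1 / errors if errors else float('inf')

IS = [((0,), 0), ((0,), 0), ((0,), 1), ((1,), 1), ((1,), 1), ((1,), 0)]
print(feature_quality(lambda x: x[0], IS))   # mu_1 = 3 / (6 - 4) = 1.5
```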
    By the power of a constructed generalized feature (GF), or of a set of GF (for a fixed step of the ACT scheme), we mean the number of training pairs $(x_i, f_R(x_i))$ of the primary IS of the form (1) that are approximated (correctly classified) by this generalized feature (or sequence of generalized features).
    It is important for ACT schemes that at the step-by-step division of the IS into two samples $H_0$ and $H_1$ (and so on), part of the sample is completely covered by the current classification algorithm (a generalized feature or a set of them); that is, we have a case of strong separation of the classes of the IS array. Therefore we can assume that the complexity of the final ACT scheme (the total number of steps needed to build the tree) largely depends on the procedure of initial evaluation and selection of the set of independent classification algorithms $a_i$, their initial parameters, and the parameters of the set of GF $f_i$ they generate at each step of the ACT scheme.
    For the ACT scheme it is then essential to consider the overall complexity of the classification tree construction procedure under the condition of weak separation of the primary IS classes, where a single GF of power one is generated per vertex of the tree. Under strong separation conditions, when the problem and practical feasibility impose no limit on the number of GF and their power, the GF can be built without restrictions.
    At the first stage we consider the case of weak class separation with restrictions on the GF sets built by the ACT scheme. Note that the procedure for constructing an algorithmic tree has certain peculiarities in terms of the step-by-step approximation of the primary IS by a sequence of GF. Let at every step of the construction of an ACT model one fixed classification algorithm be selected for work from a set of algorithms $(a_1, a_2, \ldots, a_n)$; moreover, the classification tree can be built by a single algorithm and the sequence of GF it generates.
    Therefore, after $n$ steps of the classification tree construction procedure, the ACT structure represents some scheme $s_n$ (a generalized feature of the second order, built from the set of GF synthesized by the classification algorithms), which is the most effective approximation of the primary IS of the general form (1) by a set of independent classification algorithms and their GF. In particular, $s_n$ represents a tree-like scheme (an ACT structure) consisting of $n$ vertices; that is, the scheme $s_n$ includes exactly $n$ classification algorithms (GF, provided that at each step of the tree construction procedure no more than one generalized feature of minimum power one is generated) from the initial set.
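
To make the step-by-step approximation concrete, the following schematic sketch removes at each step the pairs covered (correctly classified) by the current algorithm's GF; the loop, the algorithm set, and the data are hypothetical simplifications of the ACT scheme, not the authors' implementation.

```python
# Schematic ACT step: each fixed classification algorithm a_i yields a
# generalized feature (GF) that covers the pairs it classifies correctly;
# covered pairs are removed and the next vertex of the tree is built on
# the remainder. The returned list records P(f_i), the GF power per step.
def build_act(IS, algorithms):
    remaining = list(IS)
    powers = []                                  # P(f_1), P(f_2), ...
    while remaining:
        a = algorithms[len(powers) % len(algorithms)]
        covered = [p for p in remaining if a(p[0]) == p[1]]
        if not covered:                          # degenerate GF: force power 1
            covered = remaining[:1]
        powers.append(len(covered))
        remaining = [p for p in remaining if p not in covered]
    return powers

IS = [((0,), 0), ((1,), 1), ((1,), 0), ((0,), 1)]
print(build_act(IS, [lambda x: x[0]]))           # e.g. [2, 1, 1]
```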


5. Experiments and results

   At the next stage, for the LCT structure, let us make an assumption: the quality $\mu_n$ of each elementary feature $\alpha_n$ relative to the array of the primary IS is not less than some number $y$, where $y > 1$.
   Let us analyze the complexity of the classification tree construction procedure under this condition $(y > 1)$; to do this, we estimate the number of steps in which this process (procedure) achieves full recognition of the initial training sample array.
   For definiteness, let us consider the following scheme of classification tree construction (Fig. 2).
   Let $n_1$ be the unconditional number of errors within the primary IS. The elementary feature $\alpha_1^1$ separates the IS into two samples $H_0$ and $H_1$; let $h_0$ and $h_1$ be, respectively, the unconditional numbers of errors in the samples $H_0$ and $H_1$. The feature $\alpha_1^2$ separates the set $H_0$ into two sets $H_{00}$ and $H_{01}$; let $h_{00}$ and $h_{01}$ be the unconditional numbers of errors in the samples $H_{00}$ and $H_{01}$. Similarly we define the sets $H_{10}$, $H_{11}$ and the quantities $h_{10}$ and $h_{11}$ for the elementary feature $\alpha_2^2$.




Figure 2: Scheme of division into subsets in the structure of the classification tree.

   From the initial condition $(y > 1)$ it follows that:
$$\begin{cases} h_0 + h_1 \leq \dfrac{1}{y}\, n_1, \\[4pt] h_{00} + h_{01} \leq \dfrac{1}{y}\, h_0, \\[4pt] h_{10} + h_{11} \leq \dfrac{1}{y}\, h_1. \end{cases} \qquad (4)$$
    From (4) we obtain the following:
$$h_{00} + h_{01} + h_{10} + h_{11} \leq \frac{1}{y^2}\, n_1. \qquad (5)$$
   Let us make the following assumptions: $h_0 \geq 1$, $h_1 \geq 1$, $h_{00} \geq 1$, $h_{01} \geq 1$, $h_{10} \geq 1$ and $h_{11} \geq 1$. From here we have:
$$2 \leq \frac{1}{y}\, n_1, \qquad 2^2 \leq \frac{1}{y^2}\, n_1. \qquad (6)$$
   Similarly, for the set of features $\alpha_1^i, \alpha_2^i, \ldots$ located on the $i$-th tier of the logical tree, we have:
$$2^i \leq \frac{1}{y^i}\, n_1, \quad \text{or} \quad (2y)^i \leq n_1. \qquad (7)$$
   Hence we can conclude that the process of constructing the classification tree continues until the structure has $m$ tiers (levels), where $m$ has the following form:
$$m = R\!\left(\frac{\log_2 n_1}{1 + \log_2 y}\right). \qquad (8)$$
      Here $R(x)$ means rounding the number $x$ up to the nearest integer that exceeds $x$; for example, $R(1.2) = 2$, $R(3.7) = 4$, $R(4.1) = 5$.
      Hence a classification tree with $m$ full tiers (that is, the case when the $i$-th tier contains $2^{i-1}$ vertices) has $2^m - 1$ vertices; thus the recognition of the primary IS under the condition $(y > 1)$ by means of a full LCT takes place in no more than $2^m - 1$ steps, where $m$ is calculated using expression (8).
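
Expression (8) is easy to evaluate numerically. A minimal sketch, assuming $R(x)$ is ordinary rounding up (ceiling):

```python
# Worked evaluation of expression (8): the number m of full tiers needed
# when every elementary feature has quality at least y > 1, obtained from
# the stopping condition (2y)^m >= n_1 of inequality (7).
import math

def full_tiers(n1: int, y: float) -> int:
    assert n1 >= 1 and y > 1
    # R(x) is interpreted here as rounding up to the next integer.
    return math.ceil(math.log2(n1) / (1 + math.log2(y)))

# Example: n_1 = 100 unconditional errors, feature quality y = 2:
# m = R(log2(100) / 2) = R(3.32) = 4 tiers, i.e. 2**4 - 1 = 15 vertices.
print(full_tiers(100, 2))   # 4
```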
      For the ACT structure we can conclude that the sequence of constructed schemes $s_1, s_2, \ldots, s_j$ (generalized features of the second order) converges to the primary IS of the form (1) in no more than $M$ steps (where $M$ is the total power of the primary IS), even if at each step only one GF is generated and the power of each is no more than one.
   Let $a_n$ denote the classification algorithm selected (fixed) at the $n$-th step of the ACT model construction procedure (to generate the corresponding GF). Clearly this algorithm $a_n$ corresponds to some scheme $s_n$, which consists of the algorithms $a_1, a_2, \ldots, a_{n-1}$ and ends with this attribute (a vertex of the classification tree, the ACT model). For example, Fig. 3 shows an ACT model in which a fixed scheme $s_2$ (a vertex of the classification tree under construction) corresponds to the sequence of steps (schemes) $\{s_1\}$, and the scheme $s_M$ corresponds to the sequential path $\{s_1, s_2, \ldots, s_{M-1}\}$.
    Therefore, for the ACT model the following conclusion can be made: the scheme $s_n$ (in the structure of the classification tree) divides the IS into $n$ non-empty, non-intersecting parts (subsets), and because the primary IS contains $M$ training pairs in total, the scheme $s_M$ completely divides (approximates) the primary IS (that is, fully recognizes the sample, under the condition that one GF of power one is generated at each step). So if at every $n$-th step of the ACT construction scheme the generated GF (selected by the classification algorithm $a_n$) weakly separates the set of the primary IS, then the classification tree construction process converges to the primary IS and finishes in no more than $M$ steps, where $M$ is the number of all training pairs of the primary IS. At the following stage of the research it is important to consider the case of strong class separation of the primary IS, when no restrictions are imposed on the algorithms $a_i$ regarding GF generation (the power of the constructed GF is limited only by the practical capability of the classification algorithm $a_i$ itself and by the structural parameters of the IS). Let $P(f_j)$ denote the total power (approximation ability) of the corresponding GF $f_j$, $(1 \leq j \leq s)$, where $s$ is the quantity of GF in the constructed ACT scheme.




Figure 3: Example of ACT structure with GF as vertices.

   Further, suppose at some step $r$, $(1 \leq r \leq M)$, of the ACT scheme a sequence of generalized features $f_1, \ldots, f_r$ has been constructed with corresponding values $P(f_i) = z_i$, where $(1 \leq z_i \leq M)$, $(1 \leq i \leq r)$, and $M$ is the total power of the IS; among them there are values $z_{\max}$ and $z_{\min}$, respectively the maximum and the minimum (relative to the current step of the ACT scheme). Then the ACT scheme (model) will be constructed in $t$ steps, where the value $t$ is determined by ratio (9):
$$t \leq 2 \cdot \frac{P(IS)}{z_{\max} + z_{\min}} = \frac{2M}{z_{\max} + z_{\min}}. \qquad (9)$$

   We note that when, by the conditions of the practical problem, power restrictions are imposed on the GF synthesized by the ACT scheme under construction (their power must not exceed some value $P$), the classification tree scheme (ACT model) will be built in $t$ steps, where the value $t$ is determined by ratio (10):
$$t \geq \frac{M}{P}. \qquad (10)$$
   Under the strict restriction of the ACT scheme to one generated GF of unit power per step (that is, $P(f_i) = 1$, $(1 \leq i \leq t)$, the case of weak class separation of the current problem), the classification tree scheme (ACT model) will be built in $t$ steps, where $t \leq M$.
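
A small sketch evaluating the ratios (9) and (10) as reconstructed above; the numeric values of z_max, z_min, and P are hypothetical, and M is taken from problem Z1 of Table 1 below purely as an illustration.

```python
# Sketch evaluating the reconstructed step-count ratios for the ACT scheme:
# (9)  t <= 2*M / (z_max + z_min)   -- GF powers range over [z_min, z_max];
# (10) t >= M / P                   -- every GF power is capped by P;
# with unit-power GF (P = 1) the bound degenerates to t <= M.
import math

M = 1250                                   # total power of the IS (problem Z1)

def t_upper_9(M: int, z_max: int, z_min: int) -> float:
    return 2 * M / (z_max + z_min)         # ratio (9)

def t_lower_10(M: int, P: int) -> int:
    return math.ceil(M / P)                # ratio (10), rounded up

print(t_upper_9(M, z_max=100, z_min=25))   # 20.0
print(t_lower_10(M, P=50))                 # 25
```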
    At the next stage, for simplicity, let us take the classification problems for which sets of LCT/ACT structures were built in researches [19-24,30]. The initial parameters of these practical problems are presented in Table 1. The IS represents the information about the separation of objects into the corresponding classes. At the examination stage, the constructed classification system should effectively recognize objects of unknown classification with respect to these classes. We note that the training and test samples were automatically checked for correctness at the initial stage (searching for and deleting identical objects with different class labels, i.e., errors of the first and second kind).
    Table 2 presents the evaluation of the constructed classification tree structures (LCT/ACT) for the problems of Table 1. To assess the quality of the constructed classifiers (classification schemes) we used the integrated quality feature of the classification tree $Q_{Main}$ from research [30]. The convergence of the LCT/ACT structure synthesis procedure is estimated on the basis of quantitative features: the total number of iterations $S_{Main}$ and the number of tiers $L_{Kol}$ in the classification tree structure.

Table 1
Initial parameters of classification problems

| Type of classification problem | Dimension of the feature space, N | Power of the data array of the primary IS, M | Total number of classes by data splitting of the IS, l | Relation of objects of different classes of the IS (H1/H2/.../Hl) |
|---|---|---|---|---|
| Geological data classification (Z1) | 22 | 1250 | 2 | 756/494 |
| Chemical analysis of the quality of hydrocarbon fuels (Z2) | 14 | 4863 | 6 | 823/648/1412/918/583/764 |
| Classification of flood situations in the Tisza river basin of the Transcarpathian region (Z3) | 18 | 6118 | 3 | 76/108/5934 |
| Classification of flood situations in the Uzh river basin of the Transcarpathian region (observation post №1) (Z4) | 18 | 4252 | 3 | 73/102/4107 |
| Classification of flood situations in the Uzh river basin of the Transcarpathian region (observation post №2) (Z5) | 18 | 4139 | 3 | 68/97/3974 |
   The integrated feature of classification tree quality $Q_{Main}$ reflects the basic parameters (characteristics) of classification trees and can be used as an optimality criterion in the evaluation procedure of an arbitrary tree-like recognition scheme [20]. Note that the main idea of classification tree methods based on autonomous algorithms in their structure lies in the step-by-step approximation of the data set of the primary IS by the selected set of algorithms [22].

Table 2
Comparative table of structure classification schemes LCT/ACT

| № of problem | Method of synthesis of classification tree structure | Integral feature of model quality, Q_Main | Convergence of the classification tree structure (number of iterations), S_Main | Number of tiers in structure LCT/ACT, L_Kol |
|---|---|---|---|---|
| Z1 | Method of complete LCT based on the selection of elementary features | 0.004789 | 79 | 22 |
| Z1 | LCT model with a one-time assessment of the importance of the features | 0.002263 | 102 | 16 |
| Z2 | Limited LCT construction method | 0.003244 | 91 | 17 |
| Z2 | Algorithmic tree method (type I) | 0.005119 | 46 | 9 |
| Z3 | Algorithmic tree method (type II) | 0.002941 | 72 | 15 |
| Z3 | Method of extensive selection of features (step-by-step assessment) | 0.003612 | 84 | 13 |
| Z4 | Algorithm tree (type I) | 0.005054 | 43 | 10 |
| Z5 | Algorithm tree (type II) | 0.002813 | 75 | 16 |


   The obtained classification tree structures (ACT/LCT models) are, on the one hand, characterized by high versatility with respect to practical problems and by a relatively compact model structure; on the other hand, compared to the neural network concept they require significant hardware costs for storing the generalized features and for the initial assessment of the quality of the fixed classification algorithms on the IS data [31-35]. In comparison with the ACT concept, the LCT method offers a high speed of the classification schemes, relatively insignificant hardware costs for storing and operating the tree structure itself, and a high quality of discrete object classification.
    Note that all ACT/LCT models and structures were built in the software application "Orion III". Based on the classification tree method and the principle of modularity, Uzhhorod National University developed the software application "Orion III" to generate autonomous recognition systems. The algorithmic library of the system contains 15 recognition algorithms, including algorithmic implementations of both the LCT structure and the ACT structure [23].
    From Table 2 we can see that the algorithm tree methods (of both types) showed a high rate of convergence of the classification tree structure construction on the represented IS compared to the LCT schemes. Note that the ACT structure of the first type shows a good result in terms of the structural complexity (number of tiers, vertices, generalized features) of the constructed classification model in comparison with the logical classification trees and with the algorithm tree of the second type. In general, we can conclude that ACT structures converge faster than LCT models, with a consequent advantage in the structural complexity of the constructed classification tree and in the informational capacity of the generalized feature sets.

6. Conclusion
   Therefore, taking into consideration all of the above, we can state the following:
   Under the condition of weak class separation in the LCT case, if at every $n$-th step the selected elementary feature $\alpha_n$ weakly separates the set (subset) of objects of the primary IS, then the classification tree construction process converges to the primary IS and terminates in no more than $m - 1$ steps, where $m$ is the number of all training pairs of the primary IS.
   Under the condition of strong class separation of the sets of objects of the primary IS, a classification tree (LCT structure) with $m$ full tiers (levels), that is, with $2^{i-1}$ vertices on the $i$-th tier, has $2^m - 1$ vertices; thus the recognition of the array of the primary IS under the condition $(y > 1)$ by means of a full LCT takes place in no more than $2^m - 1$ steps, where $m$ is calculated by means of the expression $m = R\!\left(\frac{\log_2 n_1}{1 + \log_2 y}\right)$.
    The total number of all terminal vertices (leaves of the recognition tree) of the constructed classification scheme unambiguously determines the final power of the classification tree method scheme (LCT/ACT models).
    The power of some GF (or of the set of constructed GF) for a fixed step of the ACT method scheme is the total number of training pairs $(x_i, f_R(x_i))$ of the primary IS (a subset of the primary IS) of the form (1) that are approximated (correctly classified) by the given generalized feature (sequence of generalized features).
    In the case of weak class separation of the primary IS, for the ACT scheme the classification tree construction process converges to the IS data set and terminates in no more than $M$ steps, where $M$ is the number of all training pairs of the primary IS.
    In the case of strong class separation of the primary IS, for the ACT scheme, when the power of the constructed GF (or set of GF) is limited only by the practical capability of the classification algorithm $a_i$ itself and by the initial parameters of the IS, the ACT scheme (model) will be constructed in $t$ steps, where the value $t$ is defined by ratio (9).
7. Acknowledgments
   The paper was prepared within the framework of the research project "Modeling and forecasting of emergency situations in the Carpathian region and countries of Central and Eastern Europe", state registration number 0106V00285, category of work: fundamental research (ID-2201020), 01 – Fundamental research on the most important problems of natural, social and humanitarian sciences.

8. References

    [1] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, Berlin, Springer,
2008.
    [2] J.R. Quinlan, Induction of Decision Trees, Machine Learning 1 (1986) 81-106.
    [3] L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees, Boca Raton, Chapman and Hall/CRC, 1984.
    [4] M. Lupei, A. Mitsa, V. Repariuk, V. Sharkan, Identification of authorship of Ukrainian-
language texts of journalistic style using neural networks, Eastern-European Journal of Enterprise
Technologies 1 (2 (103)) (2020) 30-36. doi: 10.15587/1729-4061.2020.195041.
    [5] M. Jafari, H. Xu, Intelligent Control for Unmanned Aerial Systems with System Uncertainties
and Disturbances Using Artificial Neural Network, Drones 3 (2018) 24-36.
    [6] M. Miyakawa, Criteria for selecting a variable in the construction of efficient decision trees,
IEEE Transactions on Computers 38(1) (1989) 130-141.
    [7] H. Koskimaki, I. Juutilainen, P. Laurinen, J. Roning, Two-level clustering approach to training
data instance selection: a case study for the steel industry, in: Proceedings of the International Joint
Conference on Neural Networks, IJCNN 2008, part of the IEEE World Congress on Computational
Intelligence, IEEE, Los Alamitos. (2008) 3044-3049. doi: 10.1109/ijcnn.2008.4634228.
    [8] S.F. Jaman, M. Anshari, Facebook as marketing tools for organizations: Knowledge
management analysis, in: Dynamic perspectives on globalization and sustainable business in Asia, IGI
Global, Hershey. (2019) 92-105. doi: 10.4018/978-1-5225-7095-0.ch007.
    [9] W. Sirichotedumrong, T. Maekawa, Y. Kinoshita, H. Kiya. Privacy-preserving deep neural
networks with pixel-based image encryption considering data augmentation in the encrypted domain,
in: Proceedings of the 26th IEEE International Conference on Image Processing, ICIP 2019, IEEE,
Los Alamitos. (2019) 674-678. doi: 10.48550/arXiv.1905.01827.
    [10] R.L. De Mántaras, A distance-based attribute selection measure for decision tree induction,
Machine learning 6(1) (1991) 81-92.
    [11] K. Karimi, H. Hamilton, Generation and Interpretation of Temporal Decision Rules,
International Journal of Computer Information Systems and Industrial Management Applications 3
(2011) 314-323.
    [12] B. Kamiński, M. Jakubczyk, P. Szufel, A framework for sensitivity analysis of decision trees,
Central European Journal of Operations Research 26 (1) (2017) 135-159.
    [13] H. Deng, G. Runger, E. Tuv, Bias of importance measures for multi-valued attributes and
solutions, in: Proceedings of the 21st International Conference on Artificial Neural Networks, volume
2 of ICANN 2011, Springer-Verlag, Berlin. (2011) 293-300. doi: 10.1007/978-3-642-21738-8_38.
    [14] S.A. Subbotin, Construction of decision trees for the case of low-information features, Radio
Electronics, Computer Science, Control 1 (2019) 121-130.
    [15] M. Jafari, H. Xu. Intelligent Control for Unmanned Aerial Systems with System Uncertainties
and Disturbances Using Artificial Neural Network. Drones. 3 (2018) 24–36.
    [16] A. Painsky, S. Rosset, Cross-validated variable selection in tree-based methods improves
predictive performance, IEEE Transactions on Pattern Analysis and Machine Intelligence 39(11)
(2017) 2142-2153. doi:10.1109/tpami.2016.2636831.
    [17] D. Imamovic, E. Babovic, N. Bijedic, Prediction of mortality in patients with cardiovascular
disease using data mining methods, in: Proceedings of the 19th International Symposium INFOTEH-
JAHORINA, INFOTEH 2020, IEEE, Los Alamitos. (2020) 1-4.
    [18] S.B. Kotsiantis, Supervised Machine Learning: A Review of Classification Techniques,
Informatica 31 (2007) 249-268.
    [19] Y.I. Zhuravlev, V.V. Nikiforov, Recognition algorithms based on the calculation of
estimates, Cybernetics 3 (1971) 1-11.
    [20] Y.A. Vasilenko, E.Y. Vasilenko, I.F. Povkhan, Branched feature selection method in
mathematical modeling of multi-level image recognition systems, Artificial Intelligence 7 (2003) 246-
249.
    [21] I. Povkhan, A constrained method of constructing the logic classification trees on the basis of
elementary attribute selection, CEUR Workshop Proceedings, Proceedings of the Second
International Workshop on Computer Modeling and Intelligent Systems (CMIS-2020), 2608 (2020)
843-857.
    [22] Y.A. Vasilenko, E.Y. Vasilenko, I.F. Povkhan, Conceptual basis of image recognition
systems based on the branched feature selection method, European Journal of Enterprise
Technologies 7(1) (2004) 13-15.
    [23] I. Povkhan, M. Lupei, The algorithmic classification trees, in: Proceedings of the IEEE Third
International Conference on Data Stream Mining & Processing, DSMP 2020, IEEE, Los Alamitos,
(2020) 37-44.
    [24] I. Povkhan, M. Lupei, M. Kliap, V. Laver, The issue of efficient generation of generalized
features in algorithmic classification tree methods, in: Proceedings of the International Conference
on Data Stream Mining and Processing, DSMP 2020, Springer, Cham. (2020) 98-113.
    [25] I. Povkhan, Classification models of flood-related events based on algorithmic trees, Eastern-
European Journal of Enterprise Technologies 6(4) (2020) 58-68. doi: https://doi.org/10.15587/1729-
4061.2020.219525.
    [26] J. Rabcan, V. Levashenko, E. Zaitseva, M. Kvassay, S. Subbotin, Application of Fuzzy
Decision Tree for Signal Classification, IEEE Transactions on Industrial Informatics 15(10) (2019)
5425-5434. doi: 10.1109/tii.2019.2904845.
    [27] P.E. Utgoff, Incremental induction of decision trees, Machine learning 4(2) (1989) 161-186.
doi:10.1023/A:1022699900025.
    [28] L. Hyafil, R.L. Rivest, Constructing optimal binary decision trees is NP-complete,
Information Processing Letters 5(1) (1976) 15-17.
    [29] H. Wang, M. Hong, Online ad effectiveness evaluation with a two-stage method using a
Gaussian filter and decision tree approach, Electronic Commerce Research and Applications 1(35)
(2019) Article 100852. doi: 10.1016/j.elerap.2019.100852.
    [30] I.L. Kaftannikov, A.V. Parasich, Decision Tree's Features of Application in Classification
Problems, Bulletin of the South Ural State University, Ser. Computer Technologies, Automatic
Control, Radio Electronics 15(3) (2015) 26-32. doi: 10.14529/ctcr150304.
    [31] I.F. Povhan, Logical recognition tree construction on the basis a step-to-step elementary
attribute selection, Radio Electronics, Computer Science, Control 2 (2020) 95-106.
    [32] S.L. Lin, H.W. Huang, Improving Deep Learning for Forecasting Accuracy in Financial Data,
Discrete Dynamics in Nature and Society Volume 2020 (2020) Article ID 5803407. doi:
10.1155/2020/5803407.
    [33] R. Srikant, R. Agrawal, Mining generalized association rules, Future Generation Computer
Systems,13(2) (1997) 161-180.
    [34] E. Ameisen, Building Machine Learning Powered Applications: Going from Idea to Product,
O'Reilly Media, California, 2020.
    [35] M. Sewak, Deep Reinforcement Learning, Frontiers of Artificial Intelligence, Springer,
Berlin, 2020. doi: 10.1007/978-981-13-8285-7.
    [36] J.B. Harley, D. Sparkman, Machine learning and NDE: Past, present, and future, AIP
Conference Proceedings, 2102(1) (2019). doi:10.1063/1.5099819.