<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Classical and Quantum Improvements of Generic Decision Tree Constructing Algorithm for Classification Problem</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kamil Khadiev</string-name>
          <email>kamilhadi@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ilnaz Mannapov</string-name>
          <email>ilnaztatar5@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liliya Safina</string-name>
          <email>liliasafina94@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kazan Federal University</institution>
          ,
          <addr-line>18 Kremlyovskaya street , Kazan, 420008</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Zavoisky Physical-Technical Institute, FRC Kazan Scientific Center of RAS</institution>
          ,
          <addr-line>10/7, Sibirsky tract, Kazan, 420029</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>83</fpage>
      <lpage>93</lpage>
      <abstract>
        <p>In this work, we focus on the complexity of the generic decision tree classifier constructing algorithm. In the classical case, the decision tree is constructed in $O(h \cdot M \cdot N(d + \log N))$ running time, where $d$ is the number of classes, $N$ is the input data size, $M$ is the number of attributes, and $h$ is the tree height. We offer two options for improving the classical version of the generic algorithm; the running times of the algorithm that uses these options are $O(h \cdot M \cdot N(\log d + \log N))$ (general case) and $O(N \cdot M)$ (for independent attributes). After that, we suggest a quantum improvement that uses quantum subroutines such as amplitude amplification and the Dürr-Høyer minimum search algorithm. The running time of the quantum algorithm is $O(h \cdot \sqrt{M} \cdot N(d + \log N))$, which is better than the complexity of the classical algorithm in the general case.</p>
      </abstract>
      <kwd-group>
        <kwd>Quantum machine learning</kwd>
        <kwd>quantum decision trees</kwd>
        <kwd>decision tree constructing</kwd>
        <kwd>classification problem</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The presented quantum algorithm uses a combination of known quantum algorithms and classical
computing. The Dürr-Høyer algorithm is used as a subroutine. Our quantum algorithms are
formulated in the query model, which is based on making queries to a black box that provides access
to the input. The running time is the number of queries to the black box [
        <xref ref-type="bibr" rid="ref2 ref4">3, 4, 6</xref>
        ].
      </p>
      <p>In Section 2, we present the preliminaries. Section 3 contains the description of the classical version of
the algorithm. We provide the classical improvements in Section 4 and the new quantum algorithm for
decision tree constructing in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>Let $T = \{X^1, \dots, X^N\}$ be a training data set and $Y = (y^1, \dots, y^N)$ be the set of corresponding
classes, where $y^i \in \{1, \dots, d\}$. One element from $T$ is a vector of attribute values $X = (x_1, \dots, x_M)$, where $M$ is the
number of attributes, $A = \{a_1, \dots, a_M\}$ is the set of attributes, and $N = |T|$ is the size of the training data set. Let
us consider some element $X \in T$. An attribute $a_j$ is a real-valued variable or a categorical variable.
Let $x_j \in \mathbb{R}$ if $a_j$ is real-valued; and let $x_j \in \{v_{j,1}, \dots, v_{j,T_j}\}$ if $a_j$ is a categorical attribute, i.e. $x_j$ takes one of $T_j$ values for some integer $T_j$. Let $v_{j,k}$ be
the value with index $k$ of attribute $a_j$. Let $y(X)$ be the index of the class of $X$, where $d$ is the number of classes.</p>
      <p>Let $T_P$ be a subset of the training set whose elements satisfy the restrictions
defined by a predicate $P$. For example, $T_{y=i}$ is the subset of $T$ that belongs to the class with
number $i$.</p>
      <p>The problem is to construct a function $F$ that is called a classifier. The
function classifies a new vector $X \notin T$. Let $CAT$ and $REAL$ be the notations for the sets of
indexes of categorical and real-valued attributes, respectively.</p>
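      <p>For concreteness, the following Python sketch shows this data model; the identifiers (Element, T, REAL, CAT) are our own illustration and are not fixed by the paper.</p>
      <preformat>
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Element:
    x: List[Union[float, int]]  # attribute values x_1, ..., x_M
    y: int                      # class index in {0, ..., d-1}

# Example: N = 3 elements, M = 2 attributes (a_1 real-valued, a_2 categorical), d = 2 classes.
T = [
    Element([1.5, 0], 0),
    Element([2.7, 1], 1),
    Element([0.3, 1], 0),
]
REAL, CAT = {0}, {1}  # sets of indexes of real-valued and categorical attributes
      </preformat>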
    </sec>
    <sec id="sec-3">
      <title>3. The Observation of the Generic Algorithm</title>
    </sec>
    <sec id="sec-4">
      <title>3.1. The Tree Structure</title>
      <p>Decision tree constructing algorithms use the “divide and conquer” method to build a suitable tree. If
all vectors in the current subset $B$ belong to the same class $c$, then the process of
decision tree constructing is stopped, and the leaf is labeled by $c$. Otherwise, let $Q$ be some test (with
outcomes $o_1, \dots, o_t$) that creates a partition of $B$. A part $B_k$ is the set of vectors from $B$ that
corresponds to the outcome $o_k$.</p>
      <p>We consider tests of two types. If $a_j$ is a categorical attribute from $A$, then a test is $x_j = v_{j,k}$
with $T_j$ outcomes, one for each value from $\{v_{j,1}, \dots, v_{j,T_j}\}$.</p>
      <p>If $a_j$ is a real-valued attribute, then a test has two value options: $x_j \le \theta$ and $x_j > \theta$. Here
$\theta$ is a threshold value.</p>
      <p>To construct a tree classifier, we consider a node structure that has the next fields:
- a condition field which indicates whether the node is a leaf;
- an attribute index for non-leaf nodes and a class index for leaf nodes;
- an array of key-value pairs $(P, node)$, where $P$ is a predicate and
$node$ is the corresponding subtree. If an attribute is categorical, then the size of the array equals
the count of attribute values. For real-valued attributes, the array contains two items.</p>
      <p>Let the main recursive function of the decision tree constructing process be given as Algorithm 1.
On each call, the procedure creates a current node and checks whether it is necessary to stop the
construction process.</p>
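      <p>A minimal Python sketch of such a node structure and of classifying a vector with it is given below; all identifiers are our own illustration, since the paper specifies the structure only abstractly.</p>
      <preformat>
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class Node:
    is_leaf: bool     # condition field: leaf or inner node
    index: int        # attribute index (inner node) or class index (leaf)
    # key-value pairs (predicate, subtree): two items for a real-valued split,
    # one item per attribute value for a categorical split
    children: List[Tuple[Callable[[list], bool], "Node"]] = field(default_factory=list)

def classify(node: "Node", x: list) -> int:
    """Follow the matching predicates down to a leaf and return its class index."""
    while not node.is_leaf:
        for predicate, child in node.children:
            if predicate(x):
                node = child
                break
    return node.index
      </preformat>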
    </sec>
    <sec id="sec-5">
      <title>3.2. Test Selection Procedure</title>
      <p>
        To maximize a heuristic splitting criterion, the generic decision tree constructing algorithm uses a
greedy search for selecting a candidate test. In most decision tree inducers, an internal node is split
according to the value of a single attribute. The inducer searches for the best attribute upon which to
perform the split. There are various univariate criteria that can be characterized in different ways,
such as: according to the origin of the measure (information theory, dependence, and distance);
or according to the measure structure (impurity-based criteria, normalized impurity-based criteria, and
binary criteria [
        <xref ref-type="bibr" rid="ref19">22</xref>
        ]). Let us briefly review them.
      </p>
      <p>Impurity-based Criteria. Let $x$ be a random variable with $d$ values, distributed according to
$P = (p_1, \dots, p_d)$. A function $\varphi$ is an impurity measure if it satisfies the following
conditions:
- $\varphi(P) \ge 0$;
- $\varphi(P)$ is the minimum if there exists $j$ such that the component $p_j = 1$;
- $\varphi(P)$ is the maximum if for all $j$, $1 \le j \le d$, the following condition is true: $p_j = 1/d$;
- $\varphi(P)$ is symmetric with respect to the components of $P$;
- $\varphi(P)$ is smooth (differentiable everywhere) in its range.</p>
      <p>The goodness-of-split due to a selected attribute $a_j$ is defined as a reduction in the impurity of the
target attribute after splitting $B$ according to the values $v_{j,k}$:
$$\Delta\varphi(a_j, B) = \varphi(P_B) - \sum_{k=1}^{T_j} \frac{|B_{x_j = v_{j,k}}|}{|B|}\,\varphi(P_{B_{x_j = v_{j,k}}}) \qquad (1)$$
The probability vector $P_B = (p_1, \dots, p_d)$ (from a given training set $B$) of the set of corresponding classes
is defined by $p_i = |B_{y=i}| / |B|$.</p>
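      <p>As an illustration, Formula 1 can be computed as follows for any impurity measure $\varphi$ passed in as a function; the helper names are ours.</p>
      <preformat>
from collections import Counter
from typing import Callable, List, Sequence

def probability_vector(labels: Sequence[int], d: int) -> List[float]:
    """P_B = (p_1, ..., p_d) with p_i = |B_{y=i}| / |B|."""
    counts = Counter(labels)
    n = len(labels)
    return [counts[i] / n for i in range(d)]

def impurity_reduction(phi: Callable[[List[float]], float],
                       labels: Sequence[int],
                       parts: Sequence[Sequence[int]],
                       d: int) -> float:
    """Formula (1): phi(P_B) minus the size-weighted impurities of the parts."""
    n = len(labels)
    after = sum(len(p) / n * phi(probability_vector(p, d)) for p in parts)
    return phi(probability_vector(labels, d)) - after
      </preformat>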
      <p>Normalized Impurity-based Criteria are normalized variants of usual Impurity-based Criteria.
Sometimes it is useful to “normalize” the impurity-based measures. The famous decision tree
constructing algorithms such as ID3, C4.5, C5.0, CART use impurity based-criteria, and normalized
impurity-based criteria.</p>
      <p>Binary Criteria are used to build binary decision trees. These measures are based on a division of
the input attribute domain into two subdomains.</p>
      <p>In this work, we consider impurity-based and normalized impurity-based criteria.</p>
      <p>Let us provide some information about usage of impurity based criteria and applying our
improvements in practical cases. The decision tree model with impurity-based and normalized
impurity-based criteria is used in the industry. Such algorithms as ID3, C4.5, C5.0, CART are used in
well-known data processing frameworks, PC programs, and other machine learning algorithms as
subroutines.</p>
      <p>ID3 and CART use impurity-based criteria; C4.5 and C5.0 use normalized impurity-based criteria.
The criteria of ID3, C4.5, and C5.0 are based on the Entropy criterion. CART uses the Gini Index as the
impurity criterion. $\varphi(B)$ for CART is defined by the formula:
$$Gini(B) = 1 - \sum_{i=1}^{d} \left(\frac{|B_{y=i}|}{|B|}\right)^2$$</p>
      <p>For ID3, C4.5, and C5.0, $\varphi(B) = Info(B)$ is defined by the next formula:
$$Info(B) = -\sum_{i=1}^{d} \frac{|B_{y=i}|}{|B|} \log_2 \frac{|B_{y=i}|}{|B|}$$</p>
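      <p>A direct Python transcription of these two criteria (a sketch; the function names are ours):</p>
      <preformat>
from math import log2
from typing import Sequence

def gini(labels: Sequence[int], d: int) -> float:
    """Gini(B) = 1 - sum_i (|B_{y=i}| / |B|)^2, the CART impurity."""
    n = len(labels)
    return 1.0 - sum((labels.count(i) / n) ** 2 for i in range(d))

def info(labels: Sequence[int], d: int) -> float:
    """Info(B) = -sum_i p_i log2 p_i, the entropy used by ID3, C4.5, and C5.0."""
    n = len(labels)
    ps = [labels.count(i) / n for i in range(d)]
    return -sum(p * log2(p) for p in ps if p > 0)
      </preformat>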
    </sec>
    <sec id="sec-6">
      <title>3.3. Attributes Processing</title>
      <p>
        This subsection is based on the open-sourced code of C5.0 [
        <xref ref-type="bibr" rid="ref1">2</xref>
        ].
      </p>
      <p>The function uses an abstract impurity-based criterion constructed by Formula 1. It
calculates the reduction of impurity after a split. Categorical and real-valued attributes are processed
differently: the function checks the kind of the input attribute and calls the procedure for categorical
attributes (Algorithm 4) or for real-valued attributes (Algorithm 3).</p>
      <p>The arguments of the attribute processing procedure are the index $j$ of the processed attribute and a
training subset $B$. It returns a triple: the maximal impurity reduction
value, the partition of $B$ produced by the selected attribute, and the selected
threshold value. The threshold is used only for real-valued attributes.</p>
      <p>For considering this process, we have to describe an abstract impurity-based function. It can be
described with the next formula:
$$\varphi(B) = \sum_{i=1}^{d} \tilde{\varphi}(|B_{y=i}|, |B|) \qquad (2)$$</p>
      <p>Note that $\tilde{\varphi}$ is some function with $O(1)$ running time. It is specific for each impurity-based
criterion. Formula 2 is needed for analyzing the running time of this algorithm.</p>
      <p>Let us provide some detailed information about the processing of a real-valued attribute $a_j$. Firstly, the
algorithm sorts the subset $B$ by the values of $x_j$. It is done by the sorting procedure. Note that the
indexes in the resulting sorted order are $(i_1, \dots, i_{|B|})$. Now we can split the vectors by a
threshold $\theta$. After that there are two sets
$B_{x_j \le \theta} = \{X^{i_1}, \dots, X^{i_u}\}$ and
$B_{x_j > \theta} = \{X^{i_{u+1}}, \dots, X^{i_{|B|}}\}$, for
$x_j^{i_u} \le \theta$ and $x_j^{i_{u+1}} > \theta$.</p>
      <p>The second step is computing the number of elements corresponding to each class.</p>
      <p>Let $cnt^{\le}[u][i]$, for $u \in \{1, \dots, |B|\}$ and $i \in \{1, \dots, d\}$, be a sequence of counters calculated for
the sorted order: $cnt^{\le}[u][i]$ is the number of vectors among $X^{i_1}, \dots, X^{i_u}$
whose class index is $i$. Let $cnt^{>}[u][i]$ be a sequence of counters calculated for the
reversed order: $cnt^{>}[u][i]$ is the number of vectors among $X^{i_{u+1}}, \dots, X^{i_{|B|}}$
whose class index is $i$. Let $imp^{\le}(u)$ and $imp^{>}(u)$ be the impurities of the two parts. The
value $imp^{\le}$ is used for pre-counting of an impurity for any threshold; $imp^{>}$ is used for
pre-counting of an impurity for any threshold from the back side of the training set. These values
are calculated using Formula 2: the value $imp^{\le}$ is a prefix and $imp^{>}$ is a suffix of the same
computation.</p>
      <p>It is made by these formulas:
$$imp^{\le}(u) = \sum_{i=1}^{d} \tilde{\varphi}(cnt^{\le}[u][i], u), \qquad
imp^{>}(u) = \sum_{i=1}^{d} \tilde{\varphi}(cnt^{>}[u][i], |B| - u).$$</p>
      <p>The last step is choosing a maximum of the impurity reduction
$$\Delta\varphi(u) = \varphi(B) - \frac{u}{|B|}\, imp^{\le}(u) - \frac{|B| - u}{|B|}\, imp^{>}(u)$$
over all positions $u$. As a result, we get the maximal reduction value and the corresponding
threshold $\theta = (x_j^{i_u} + x_j^{i_{u+1}})/2$.</p>
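      <p>The following sketch implements this threshold scan for one real-valued attribute. It assumes $\tilde{\varphi}$ is given as a two-argument function (for the Gini index one can take $\tilde{\varphi}(u, w) = 1/d - (u/w)^2$); all other names are our own illustration.</p>
      <preformat>
from typing import Callable, Sequence, Tuple

def best_threshold(values: Sequence[float], labels: Sequence[int], d: int,
                   phi_tilde: Callable[[int, int], float]) -> Tuple[float, float]:
    """Scan all split positions in the sorted order of one real-valued attribute.

    phi_tilde(u, w) is the O(1) summand of Formula (2); the class counters of
    the prefix and suffix parts are maintained incrementally.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(values)
    cnt_le = [0] * d                              # class counts in the prefix part
    cnt_g = [labels.count(i) for i in range(d)]   # class counts in the suffix part
    imp_b = sum(phi_tilde(cnt_g[i], n) for i in range(d))  # impurity of the whole B
    best = (float("-inf"), 0.0)   # stays at -inf if the attribute is constant on B
    for u in range(1, n):         # prefix holds the first u sorted elements
        c = labels[order[u - 1]]
        cnt_le[c] += 1
        cnt_g[c] -= 1
        if values[order[u - 1]] == values[order[u]]:
            continue              # equal values cannot be separated by a threshold
        imp_le = sum(phi_tilde(cnt_le[i], u) for i in range(d))
        imp_g = sum(phi_tilde(cnt_g[i], n - u) for i in range(d))
        reduction = imp_b - (u / n) * imp_le - ((n - u) / n) * imp_g
        theta = (values[order[u - 1]] + values[order[u]]) / 2
        best = max(best, (reduction, theta))
    return best  # (maximal impurity reduction, threshold)
      </preformat>
      <p>Recomputing the two impurities from the counters costs $O(d)$ per threshold, which matches the $O(|B| \cdot d)$ term in the running time analysis of Appendix A.</p>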
      <p>Let us describe the processing of a categorical attribute $a_j$ from $A$. We split all elements of $B$ according
to the attribute value. After that we can compute the value of the objective function. So
$B_k = B_{x_j = v_{j,k}}$, for $k \in \{1, \dots, T_j\}$. All vectors of $B$ are processed one by one.</p>
      <p>Let us consider the processing of the current $u$-th vector $X$ such that $x_j = v_{j,k}$ and $y(X) = c$. Let us
describe the variables used in the processing of categorical attributes. Let $cnt_k$ be the size of $B_k$;
$cnt_k[i] = |B_{k,y=i}|$ be the count of elements from $B_k$ that belong to the class $i$; $cntC[i] = |B_{y=i}|$
be the number of vectors from $B$ that belong to the $i$-th class; $imp_k$ be the notation of the impurity value
$\varphi(B_k)$; and $imp$ be the impurity of $B$.</p>
      <p>These variables contain their values after processing the $u$-th vector, and before that they contain the
values obtained for the first $u-1$ vectors. The final values of the variables are reached after processing all
$|B|$ vectors. We recalculate each variable according to Formula 2; only the summands that depend
on the changed counters are updated:
$$imp_k \leftarrow imp_k + \tilde{\varphi}(cnt_k[c] + 1, cnt_k + 1) - \tilde{\varphi}(cnt_k[c], cnt_k),$$
$$imp \leftarrow imp + \tilde{\varphi}(cntC[c] + 1, |B|) - \tilde{\varphi}(cntC[c], |B|).$$
In the end, the procedure computes an impurity reduction by Formula 1. Finally, we obtain the
procedure from Algorithm 4.</p>
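      <p>A one-pass sketch of the categorical case under the same assumptions ($\tilde{\varphi}$ passed in as a function; the names are ours):</p>
      <preformat>
from collections import defaultdict
from typing import Callable, Sequence

def process_categorical(values: Sequence[int], labels: Sequence[int], d: int,
                        phi_tilde: Callable[[int, int], float]) -> float:
    """One pass over B: maintain per-part class counters, then apply Formula (1)."""
    cnt = defaultdict(lambda: [0] * d)   # cnt[v][i] = |B_{x_j = v, y = i}|
    size = defaultdict(int)              # size[v] = |B_{x_j = v}|
    for v, c in zip(values, labels):
        cnt[v][c] += 1
        size[v] += 1
    n = len(values)
    imp_b = sum(phi_tilde(labels.count(i), n) for i in range(d))
    imp_after = sum(
        (size[v] / n) * sum(phi_tilde(cnt[v][i], size[v]) for i in range(d))
        for v in size
    )
    return imp_b - imp_after  # impurity reduction of Formula (1)
      </preformat>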
    </sec>
    <sec id="sec-7">
      <title>3.4. Running Time of the Generic Tree Constructing Algorithm</title>
      <p>Recall that $REAL$ is the set of indexes of numeric attributes (real-valued attributes) and $CAT$ is the set of
indexes of categorical attributes.</p>
      <sec id="sec-7-1">
        <title>Theorem 1</title>
        <p>The running time of the generic tree constructing algorithm is $O(h \cdot M \cdot N(d + \log N))$. (See
Appendix A.)</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>4. Improvement of the Classical Algorithm</title>
      <p>Let us discuss an approach used for a classical improvement.</p>
    </sec>
    <sec id="sec-9">
      <title>4.1. A Fast Tree-Growing Algorithm</title>
      <p>
        We consider a Fast Tree-Growing Algorithm [
        <xref ref-type="bibr" rid="ref22">25</xref>
        ] for the classical generic decision tree
constructing algorithm. It is based on the attribute independence assumption. This approach cannot be
applied to all cases of classification problems. On the other hand, many practical cases can be solved
faster because of the assumption of attribute independence. Recall that the key point of decision
tree constructing algorithms with impurity-based criteria is the information gain calculation. It is
evaluated by Formula 1.
      </p>
      <p>Let us consider $\varphi$ for CART and the ID3 family, defined by the formulas
$$Gini(B) = 1 - \sum_{i=1}^{d} (P(c_i))^2 \quad \text{and} \quad Info(B) = -\sum_{i=1}^{d} P(c_i) \log_2 P(c_i),$$
where $P(c_i) = |B_{y=i}| / |B|$. For a training set partition $\{B_1, \dots, B_t\}$,
we can define the same values for each part $B_k$ using the probabilities $P(c_i) = |B_{k,y=i}| / |B_k|$.</p>
      <p>The tree-growing process is a recursive process of splitting of the training data. Let $B$ be the
training data associated with the considered node. Let us take another view of the problem. The
value $P(c_i)$ actually can be replaced by the conditional probability $P(c_i \mid x_p)$ on the input training data,
where $x_p$ is an assignment of values to the variables in $X_p$,
and $X_p$ is the set of attributes along the path from the current node to the root, called path attributes.
Similarly, $P(c_i)$ on $B$ is $P(c_i \mid x_p)$ on
the entire training data.</p>
      <p>In the process of tree-growing, each candidate attribute $a_j$ (an attribute not in $X_p$) is evaluated using
Equation 1, and the one with the highest information gain is selected as the attribute for splitting. The
most time-consuming part of this process is evaluating $P(c_i \mid x_p, a_j)$ for computing the gain. It
must pass through each instance in $B$, for each of which it iterates through each candidate
attribute $a_j$. This results in a running time of $O(|B| \cdot M)$ per node. The union of the subsets on each level of
the tree is the input data set that has a size equal to $N$, and the running time for each level is $O(N \cdot M)$.
Therefore, the classical decision-tree learning algorithm has a running time of $O(N \cdot M \cdot h)$,
where $h$ is the height of the tree, i.e., the count of levels.</p>
      <p>The key observation is the ability to skip passing through $B$ for each candidate attribute to
estimate $P(c_i \mid x_p, a_j)$. According to probability theory, we have
$$P(c_i \mid x_p, a_j) = \frac{P(c_i \mid x_p)\, P(a_j \mid x_p, c_i)}{P(a_j \mid x_p)}, \qquad
P(a_j \mid x_p) = \sum_i P(c_i \mid x_p)\, P(a_j \mid x_p, c_i).$$</p>
      <p>Suppose that each candidate attribute is independent of the path attribute assignment given the
class, i.e., $P(a_j \mid x_p, c_i) = P(a_j \mid c_i)$. Then we have
$$P(c_i \mid x_p, a_j) = \frac{P(c_i \mid x_p)\, P(a_j \mid c_i)}{P(a_j \mid x_p)}. \qquad (3)$$</p>
      <p>
        According to the paper [
        <xref ref-type="bibr" rid="ref22">25</xref>
        ], the information gain calculated by Equations 3 and 1 is called the
independent information gain. Note that in Equation 3, $P(a_j \mid c_i)$ is the percentage of instances
with the given attribute value among the instances of class $c_i$ on the entire training data; it can be
precomputed and stored with a running time of $O(N \cdot M)$ before the tree-growing process, with an
additional space increase of $O(d \cdot \sum_j T_j)$. The value $P(c_i \mid x_p)$ is the percentage of instances
belonging to class $c_i$ in $B$; it can be computed by passing
through $B$ once, taking $O(|B|)$. Thus, at each level, the running time for computing $P(c_i \mid x_p, a_j)$
using Equation 3 is $O(N)$.
      </p>
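      <p>A sketch of the two estimates used by Equation 3 for one (discretized) attribute; the function names are ours, and the probabilities are represented as plain Python dictionaries and lists.</p>
      <preformat>
from collections import Counter
from typing import Dict, List, Sequence, Tuple

def precompute_p_a_given_c(values: Sequence[int],
                           labels: Sequence[int]) -> Dict[Tuple[int, int], float]:
    """P(a = v | c): share of class-c instances with attribute value v,
    computed once on the entire training data before tree-growing."""
    pair = Counter(zip(values, labels))
    per_class = Counter(labels)
    return {(v, c): pair[(v, c)] / per_class[c] for (v, c) in pair}

def p_c_given_path_and_a(p_c_path: List[float],
                         p_a_c: Dict[Tuple[int, int], float],
                         v: int, d: int) -> List[float]:
    """Equation (3): P(c | x_p, a = v) proportional to P(c | x_p) * P(a = v | c),
    normalized by P(a = v | x_p) = sum_i P(a = v | c_i) P(c_i | x_p)."""
    joint = [p_c_path[c] * p_a_c.get((v, c), 0.0) for c in range(d)]
    z = sum(joint)
    return [w / z for w in joint] if z > 0 else p_c_path
      </preformat>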
      <p>The value $P(a_j \mid x_p)$ in Equation 1 should also be computed for computing the independent
information gain. If we examined the partition for each candidate attribute $a_j$, the corresponding
running time would be $O(N \cdot M)$. Fortunately, $P(a_j \mid x_p)$ can be approximated by
$\sum_i P(a_j \mid c_i)\, P(c_i \mid x_p)$, taking $O(d)$.</p>
      <p>The running time for selecting the splitting attribute using the independent information gain is similar to using information gain
in C4.5. The gain should be computed for each candidate attribute; it takes $O(M)$ for each node.
The total running time for splitting attribute selection on the entire tree is $O(K \cdot M)$, where $K$ is the
number of internal nodes of the tree. Note that $K$ depends on $h$ (the height of the tree), and it is a
parameter of the algorithm. Note that $K$ can be bounded by $N$, because the number of rules in the tree cannot
be more than the size of the training set. Thus, the total running time is $O(N \cdot M)$.</p>
      <p>The total time for tree-growing is the sum of the time for probability estimation, partition, and
splitting attribute selection. As a result, the running time for tree-growing using the independent information gain is $O(N \cdot M)$.</p>
      <p>Note that in Algorithm 5, we do not cope with real-valued attributes for simplicity; we process
real-valued attributes in the following way. In preprocessing, all real-valued attributes are discretized
by $k$-bin discretization, where $k = \lfloor \sqrt{N} \rfloor$.</p>
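      <p>A sketch of this preprocessing step, assuming equal-frequency binning (the paper does not fix the kind of $k$-bin discretization):</p>
      <preformat>
from math import isqrt
from typing import List, Sequence

def k_bin_discretize(values: Sequence[float]) -> List[int]:
    """Equal-frequency k-bin discretization with k = floor(sqrt(N))."""
    n = len(values)
    k = max(1, isqrt(n))
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // n, k - 1)  # bin index in {0, ..., k-1}
    return bins
      </preformat>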
      <p>Note that in the splitting attribute selection process, real-valued attributes are treated the same as
categorical attributes.</p>
      <p>Once a real-valued attribute is chosen, a splitting point is found in the same way as in C4.5.
Note that a real-valued attribute can be chosen again in the attribute selection on descendant nodes.
For processing real-valued attributes, this algorithm uses the same additional time as the classical version of the
generic decision tree constructing algorithm (see Algorithm 3). In particular, we need to sort the data set
for selecting thresholds and to split the data by the selected real-valued attribute.</p>
    </sec>
    <sec id="sec-10">
      <title>4.2. Using a Self-balancing Binary Search Tree</title>
      <p>
        We use a self-balancing binary search tree as the data structure for storing the class counters. As a
self-balancing binary search tree, we can use the Red-Black tree [12] or the AVL tree [
        <xref ref-type="bibr" rid="ref3">5</xref>
        ]. A
self-balancing binary search tree contains only the indexes with a non-zero value; all other values are treated as zero.
The running time of adding a new index (key) to the data structure is $O(\log z)$, where $z$ is the number
of indexes with non-zero values. The running time of removing and searching is the same. The running
time of removing all indexes from the data structure is $O(z)$.
      </p>
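      <p>Python has no built-in self-balancing binary search tree, so the sketch below uses a dictionary as a stand-in that keeps the same invariant (only non-zero counters are stored); a Red-Black or AVL tree would give the $O(\log z)$ guarantees used in Theorem 2, while the dictionary gives expected $O(1)$ access.</p>
      <preformat>
from collections import defaultdict

class NonZeroCounts:
    """Class counters that store only non-zero values (dictionary stand-in
    for the self-balancing binary search tree described above)."""

    def __init__(self):
        self._cnt = defaultdict(int)

    def add(self, class_index: int, delta: int = 1) -> None:
        self._cnt[class_index] += delta
        if self._cnt[class_index] == 0:
            del self._cnt[class_index]   # keep only non-zero values

    def items(self):
        """Iterate over the non-zero (class, count) pairs only; summing or
        clearing costs O(z) for z non-zero entries instead of O(d)."""
        return self._cnt.items()
      </preformat>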
      <sec id="sec-10-1">
        <title>Theorem 2</title>
        <p>The running time of the generic decision tree constructing algorithm that uses a self-balancing binary
search tree is $O(h \cdot M \cdot N(\log d + \log N))$. (See Appendix B.)</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>5. Quantum Improvement</title>
      <p>
        We use the Dürr-Høyer algorithm for maximum search [
        <xref ref-type="bibr" rid="ref8">10</xref>
        ] and a modification of Grover's search
algorithm [
        <xref ref-type="bibr" rid="ref5">7</xref>
        ]. These quantum algorithms help us to speed up the decision tree building process.
      </p>
      <sec id="sec-11-1">
        <title>Lemma 1</title>
        <p>Let $f: \{1, \dots, M\} \to \mathbb{R}$ be a function such that the running time of computing $f(j)$ is $O(T)$. Then a
quantum algorithm can be constructed that finds an argument of the maximal $f(j)$; the expected running
time of the algorithm is $O(\sqrt{M} \cdot T)$ and the success probability is at least $\frac{1}{2}$.</p>
        <p>Using this lemma, we can replace the classical maximum search over attributes in the test selection function and use
the attribute processing procedure as the function $f$. For reducing the error probability, we
repeat the maximum finding process $k$ times. After that, we choose the best solution. The
procedure is below (Algorithm 6).</p>
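        <p>The effect of the repetitions on the success probability can be checked directly; the numbers below only illustrate the bound $(1 - 2^{-k})^K$ and are not results from the paper.</p>
        <preformat>
# Boosting the Durr-Hoyer success probability by repetition: one run succeeds
# with probability at least 1/2, so k independent runs (keeping the best of
# the k answers) fail only if all runs fail.
def tree_success_probability(k: int, inner_nodes: int) -> float:
    per_node = 1 - 0.5 ** k              # at least 1 - 2^{-k} per inner node
    return per_node ** inner_nodes       # all inner nodes must succeed

# Example: with K = 100 inner nodes, k = 12 repetitions already give
# (1 - 2^{-12})^100, i.e. about 0.976.
print(tree_success_probability(12, 100))
        </preformat>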
      </sec>
      <sec id="sec-11-2">
        <title>Theorem 3</title>
        <p>The running time of the quantum algorithm is $O(h \cdot \sqrt{M} \cdot N(d + \log N) \cdot k)$, where $k$ is the number of
repetitions of the maximum search. The success probability
of the quantum algorithm is $(1 - 2^{-k})^K$, where $K$ is the number of inner nodes of the tree. (See Appendix
C.)</p>
      </sec>
    </sec>
    <sec id="sec-12">
      <title>6. Conclusion</title>
      <p>We suggest a version of the generic decision tree constructing algorithm with a self-balancing tree
that works faster than the known classical algorithm. After that, we have presented the quantum
version of the generic decision tree constructing algorithm for the classification problem. Our algorithm
works in $O(h \cdot \sqrt{M} \cdot N(d + \log N))$ versus $O(h \cdot M \cdot N(d + \log N))$ in the classical generic case.</p>
    </sec>
    <sec id="sec-13">
      <title>7. Acknowledgements</title>
      <p>A part of the reported study was funded by RFBR according to the research project
No. 20-37-70080. The research in Section 4.1 was funded by the subsidy allocated to Kazan Federal University
for the state assignment in the sphere of scientific activities, project No. 0671-2020-0065.</p>
    </sec>
    <sec id="sec-14">
      <title>9. Appendix</title>
    </sec>
    <sec id="sec-15">
      <title>A The Proof of Theorem 1</title>
      <sec id="sec-15-1">
        <title>Theorem 1</title>
        <p>The running time of the generic tree constructing algorithm is $O(h \cdot M \cdot N(d + \log N))$.</p>
      </sec>
      <sec id="sec-15-2">
        <title>Proof</title>
        <p>The attribute processing subroutine takes the main time. That is why we focus on analyzing this
procedure.</p>
        <p>The running time for computing element counts by classes for real-valued attributes is $O(|B| \cdot d)$.
The running time for the sorting subroutine is $O(|B| \log |B|)$. The running time of computing the
best reduction for one threshold is $O(d)$. The running time of calculating the best reduction for all
thresholds is $O(|B| \cdot d)$. Additionally, we should initialize the arrays of counters, which takes $O(d)$. The total
complexity of processing a real-valued attribute is $O(|B| \cdot d + |B| \log |B|)$.</p>
        <p>Let us consider a discrete-valued attribute. The running time of processing all vectors is $O(|B| \cdot d)$. An
impurity reduction $\Delta\varphi$ for some discrete attribute is calculated with $O(T_j \cdot d)$ running time,
where $T_j$ is the number of attribute values. An impurity before the cut is calculated with
$O(d)$ running time; an impurity after the cut is calculated in $O(T_j \cdot d)$. Since $T_j \le |B|$, the running time of
processing one discrete-valued attribute is $O(|B| \cdot d)$.</p>
        <p>Note that if we consider all sets of one level of the decision tree, then we collect all elements of
$T$. Therefore, the total complexity for one level and all $M$ attributes is $O(M \cdot N(d + \log N))$, and the total complexity for
the whole tree is $O(h \cdot M \cdot N(d + \log N))$.</p>
      </sec>
    </sec>
    <sec id="sec-16">
      <title>B The Proof of Theorem 2</title>
      <sec id="sec-16-1">
        <title>Theorem 2</title>
        <p>The running time of the generic decision tree constructing algorithm which is based on a self-balancing
binary search tree is $O(h \cdot M \cdot N(\log d + \log N))$.</p>
      </sec>
      <sec id="sec-16-2">
        <title>Proof</title>
        <p>The proof follows from the proof of Theorem 1. When calculating the values of the class counters,
the algorithm of Theorem 1 should reassign the unchanged values for every class on each new
object processing; this procedure takes $O(|B| \cdot d)$ steps. With this improvement, we can skip these
reassigning operations, and the running time for processing a real-valued attribute becomes
$O(|B| \log |B| + |B| \log d) = O(|B|(\log |B| + \log d))$; for a discrete-valued attribute, it is $O(|B| \log d)$, because we
process each vector one by one and recompute variables, which takes only $O(\log d)$ steps for updating
the values of the counters and $O(1)$ steps for other actions. Therefore, the total complexity is $O(h \cdot M \cdot N(\log d + \log N))$.</p>
      </sec>
    </sec>
    <sec id="sec-17">
      <title>C The Proof of Theorem 3</title>
      <sec id="sec-17-1">
        <title>Theorem 3</title>
        <p>The running time of the quantum algorithm is $O(h \cdot \sqrt{M} \cdot N(d + \log N) \cdot k)$, where $k$ is the
number of repetitions of the maximum search. The success
probability of the quantum algorithm is $(1 - 2^{-k})^K$, where $K$
is the number of inner nodes (not leaves).</p>
      </sec>
      <sec id="sec-17-2">
        <title>Proof</title>
        <p>The running time of the attribute processing procedure on a subset $B$ is $O(|B|(d + \log |B|))$. So, by Lemma 1,
the running time of the quantum maximum search over the $M$ attributes is $O(\sqrt{M} \cdot |B|(d + \log |B|))$.
With repeating the algorithm $k$ times, the running time is $O(k \cdot \sqrt{M} \cdot |B|(d + \log |B|))$. If we sum
the running time for all nodes, then we obtain $O(h \cdot \sqrt{M} \cdot N(d + \log N) \cdot k)$.</p>
        <p>The success probability of the Dürr-Høyer algorithm is at least $\frac{1}{2}$. We call it $k$ times and
choose a maximum among the $k$ obtained values of gain ratios. Then, we find a correct attribute for one
node with a success probability of at least $1 - \left(\frac{1}{2}\right)^k$. We should find correct attributes for
all nodes except leaves. Thus, the success probability for the whole tree is at least
$\left(1 - 2^{-k}\right)^K$, where $K$
is the number of internal nodes (not leaves).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>[2]. C5</source>
          .
          <article-title>0: An informal tutorial (</article-title>
          <year>2019</year>
          ), url=https://www.rulequest.com/see5-unix.
          <source>html [3]</source>
          <string-name>
            <given-names>. F.</given-names>
            <surname>Ablayev</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ablayev</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            <given-names>J</given-names>
          </string-name>
          .,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khadiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Salikhova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>On quantum methods for machine learning problems part i: Quantum tools</article-title>
          .
          <source>Big Data Mining and Analytics</source>
          pp.
          <fpage>41</fpage>
          -
          <lpage>55</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [4
          <string-name>
            <given-names>]. F.</given-names>
            <surname>Ablayev</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ablayev</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            <given-names>J</given-names>
          </string-name>
          .,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khadiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Salikhova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>On quantum methods for machine learning problems part ii: Quantum classification algorithm</article-title>
          .
          <source>Big Data Mining and Analytics</source>
          pp.
          <fpage>56</fpage>
          -
          <lpage>67</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [5]. G. M.
          <article-title>Adel'son-Vel'skii and</article-title>
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Landis</surname>
          </string-name>
          .
          <article-title>An algorithm for organization of information</article-title>
          .
          <source>In Doklady Akademii Nauk</source>
          , volume
          <volume>146</volume>
          , pages
          <fpage>263</fpage>
          -
          <lpage>266</lpage>
          . Russian Academy of Sciences,
          <year>1962</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>. A.</given-names>
            <surname>Ambainis</surname>
          </string-name>
          .
          <article-title>Understanding quantum algorithms via query complexity</article-title>
          .
          <source>arXiv:1712.06349</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [7]. G. Brassard,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mosca</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Tapp</surname>
          </string-name>
          .
          <article-title>Quantum amplitude amplification and estimation</article-title>
          .
          <source>Contemporary Mathematics</source>
          ,
          <volume>305</volume>
          :
          <fpage>53</fpage>
          -
          <lpage>74</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [8]. T. H Cormen,
          <string-name>
            <given-names>C. E</given-names>
            <surname>Leiserson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L</given-names>
            <surname>Rivest</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Stein</surname>
          </string-name>
          . Introduction to Algorithms. McGrawHill,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [9]. Ronald De Wolf.
          <article-title>Quantum computing and communication complexity</article-title>
          .
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [10].
          <string-name>
            <given-names>C.</given-names>
            <surname>Durr</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          .
          <article-title>A quantum algorithm for finding the minimum</article-title>
          .
          <source>arXiv:quant-ph/9607014</source>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [11].L.
          <string-name>
            <surname>Grover</surname>
          </string-name>
          ,
          <article-title>A fast quantum mechanical algorithm for database search</article-title>
          .
          <source>In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing</source>
          , pp.
          <fpage>212</fpage>
          -
          <lpage>219</lpage>
          (
          <year>1996</year>
          ) [
          <volume>12</volume>
          ]
          <string-name>
            <surname>.L. J Guibas</surname>
            and
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Sedgewick</surname>
          </string-name>
          ,
          <article-title>A dichromatic framework for balanced trees</article-title>
          .
          <source>In Proceedings of SFCS 1978</source>
          , pages
          <fpage>8</fpage>
          -
          <lpage>21</lpage>
          . IEEE,
          <year>1978</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [13].
          <string-name>
            <given-names>S.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <article-title>Bounded error quantum algorithms zoo</article-title>
          . https://math.nist.gov/quantum/zoo.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [14].
          <string-name>
            <given-names>K.</given-names>
            <surname>Khadiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kravchenko</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Serov</surname>
          </string-name>
          ,
          <article-title>On the quantum and classical complexity of solving subtraction games</article-title>
          .
          <source>In Proceedings of CSR</source>
          <year>2019</year>
          , volume
          <volume>11532</volume>
          <source>of LNCS</source>
          , pages
          <fpage>228</fpage>
          -
          <lpage>236</lpage>
          .
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [15].
          <string-name>
            <given-names>K.</given-names>
            <surname>Khadiev</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Safina</surname>
          </string-name>
          ,
          <article-title>Quantum algorithm for dynamic programming approach for dags. applications for Zhegalkin polynomial evaluation and some problems on DAGs</article-title>
          .
          <source>In Proceedings of UCNC</source>
          <year>2019</year>
          , volume
          <volume>4362</volume>
          <source>of LNCS</source>
          , pages
          <fpage>150</fpage>
          -
          <lpage>163</lpage>
          .
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [16
          <string-name>
            <given-names>].R.</given-names>
            <surname>Kohavi</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Quinlan</surname>
          </string-name>
          ,
          <article-title>Data mining tasks and methods: Classification: decision-tree discovery. Handbook of data mining and knowledge discovery</article-title>
          . Oxford University Press,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [17].
          <string-name>
            <surname>D. Kopczyk,</surname>
          </string-name>
          <article-title>Quantum machine learning for data scientists</article-title>
          .
          <source>arXiv preprint arXiv:1804.10068</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [18].L.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>J. H.</given-names>
          </string-name>
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>R. A.</given-names>
          </string-name>
          <string-name>
            <surname>Olshen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>J and Stone, Classification and regression trees</article-title>
          ,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [19]
          <string-name>
            <surname>.M. A Nielsen</surname>
            and
            <given-names>I. L</given-names>
          </string-name>
          <string-name>
            <surname>Chuang</surname>
          </string-name>
          .
          <article-title>Quantum computation and quantum information</article-title>
          . Cambridge univ. press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [20].
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Quinlan</surname>
          </string-name>
          ,
          <article-title>Induction of decision trees</article-title>
          .
          <source>Machine learning</source>
          , pages
          <fpage>81</fpage>
          -
          <lpage>106</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [21].
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Quinlan</surname>
          </string-name>
          .
          <article-title>Improved use of continuous attributes in c4.5</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          , pages
          <fpage>77</fpage>
          -
          <lpage>90</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [22].
          <string-name>
            <given-names>L.</given-names>
            <surname>Rokach</surname>
          </string-name>
          and
          <string-name>
            <given-names>O.</given-names>
            <surname>Maimon</surname>
          </string-name>
          ,
          <article-title>Data mining with decision trees: theory and applications</article-title>
          ,
          <source>World Scientific Publishing Co. Pte. Ltd</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [23].
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuld</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sinayskiy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Petruccione</surname>
          </string-name>
          ,
          <article-title>The quest for a quantum neural network</article-title>
          .
          <source>Quantum Information Processing</source>
          ,
          <volume>13</volume>
          (
          <issue>11</issue>
          ):
          <fpage>2567</fpage>
          -
          <lpage>2586</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [24].
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuld</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sinayskiy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Petruccione</surname>
          </string-name>
          ,
          <article-title>An introduction to quantum machine learning</article-title>
          .
          <source>Contemporary Physics</source>
          ,
          <volume>56</volume>
          (
          <issue>2</issue>
          ),
          <fpage>172</fpage>
          -
          <lpage>185</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [25].
          <string-name>
            <given-names>J.</given-names>
            <surname>Su</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>A fast decision tree learning algorithm</article-title>
          , volume
          <volume>1</volume>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>