<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Sorting by Decision Trees with Hypotheses (extended abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohammad Azad</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Igor Chikalov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shahid Hussain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mikhail Moshkov</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Business Administration</institution>
          ,
          <addr-line>University Road, Karachi 75270</addr-line>
          ,
          <country country="PK">Pakistan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Intel Corporation</institution>
          ,
          <addr-line>5000 W Chandler Blvd, Chandler, AZ 85226</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Jouf University</institution>
          ,
          <addr-line>Sakaka 72441</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>King Abdullah University of Science and Technology (KAUST)</institution>
          ,
          <addr-line>Thuwal 23955-6900</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we consider decision trees that use both queries based on one attribute each and queries based on hypotheses about the values of all attributes. Such decision trees are similar to the ones studied in exact learning, where not only membership but also equivalence queries are allowed. For n = 3, . . . , 6, we compare decision trees based on various combinations of attributes and hypotheses for sorting n pairwise different elements from a linearly ordered set.</p>
      </abstract>
      <kwd-group>
        <kwd>decision tree</kwd>
        <kwd>hypothesis</kwd>
        <kwd>dynamic programming</kwd>
        <kwd>sorting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Decision trees are widely used in many areas of computer science, for example, test theory
(initiated by Chegis and Yablonskii [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), rough set theory (initiated by Pawlak [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]), and
exact learning (initiated by Angluin [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]). These theories are closely related: attributes from
rough set theory and test theory correspond to membership queries from exact learning. Exact
learning also studies equivalence queries. The notion of “minimally adequate teacher” using
both membership and equivalence queries was discussed by Angluin in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Relations between
exact learning and PAC learning proposed by Valiant [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] were considered in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ], we added the notion of a hypothesis (an analog of equivalence queries) to the
model considered in both rough set theory and test theory and proposed dynamic programming
algorithms for the optimization of the decision trees with hypotheses. Note that the dynamic
programming algorithms for the optimization of the conventional decision trees that do not use
hypotheses were proposed earlier [12].
      </p>
      <p>
        In the present paper, we consider an application of the dynamic programming algorithms
from [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] to the study of the problem of sorting. We compare the complexity of five types
of optimal (relative to the depth and relative to the number of realizable nodes) decision trees
based on various combinations of attributes and hypotheses for sorting n pairwise different
elements from a linearly ordered set, n = 3, . . . , 6. Results obtained for the conventional decision
trees are known – see the book [12]. Results obtained for the decision trees with hypotheses are
completely new.
      </p>
      <p>
        Note that in the present paper we follow [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] when discussing the notions related to the decision
trees with hypotheses. Complete definitions of these notions can be found in the same paper.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Five Types of Decision Trees and Their Optimization</title>
      <p>Let T be a decision table with n conditional attributes f1, . . . , fn that have values from the
set ω = {0, 1, 2, . . .}. Rows of this table are pairwise different and each row is labeled with a
decision. For a given row of T, we should recognize the decision attached to it. To this end,
we will use decision trees based on two types of queries. We can ask about the value of a
conditional attribute fi ∈ {f1, . . . , fn} on the given row. As a result, we obtain an answer of
the kind fi = δ, where δ is the number in the intersection of the given row and the column fi.
We can also ask if a hypothesis {f1 = δ1, . . . , fn = δn} is true, where the numbers δ1, . . . , δn
belong to the columns f1, . . . , fn, respectively. Either this hypothesis is confirmed or we obtain
a counterexample of the kind fi = σ, where fi ∈ {f1, . . . , fn} and σ is a number from the
column fi that is different from δi. We will say that this hypothesis is proper if (δ1, . . . , δn) is a
row of the table T.</p>
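      <p>For illustration, the two query types can be sketched as follows (a toy example of ours, not from the paper; the table contents and the function names attribute_query, hypothesis_query, and is_proper are hypothetical):</p>

```python
# A toy decision table: each row is (attribute-value tuple, decision).
rows = [((0, 0), "a"), ((0, 1), "b"), ((1, 0), "c")]

def attribute_query(hidden, i):
    """Answer the query about attribute f_i: return f_i = delta for the hidden row."""
    return hidden[i]

def hypothesis_query(hidden, hypothesis):
    """Check the hypothesis {f_1 = d_1, ..., f_n = d_n} against the hidden row.
    Return None if the hypothesis is confirmed, otherwise a counterexample
    (i, sigma) where sigma differs from the hypothesized value d_i."""
    for i, d in enumerate(hypothesis):
        if hidden[i] != d:
            return (i, hidden[i])
    return None

def is_proper(hypothesis):
    """A hypothesis is proper if (d_1, ..., d_n) is a row of the table."""
    return any(vals == hypothesis for vals, _ in rows)
```

      <p>Here a counterexample pins down one attribute value, so a refuted hypothesis still yields information, which is why trees of the types 2–5 can be shallower than trees of the type 1.</p>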
      <p>We study the following five types of decision trees:
1. Decision trees based on attributes only.
2. Decision trees based on hypotheses only.
3. Decision trees based on both attributes and hypotheses.
4. Decision trees based on proper hypotheses only.
5. Decision trees based on both attributes and proper hypotheses.</p>
      <p>As the time complexity of a decision tree we consider its depth, which is equal to the maximum
number of queries in a path from the root to a terminal node of the tree. As the space complexity
of a decision tree we consider the number of its nodes that are realizable relative to T. A node is
called realizable relative to T if the computation in the tree passes through this node for some
row and some choice of counterexamples. We use the following notation:
• h(k)(T) denotes the minimum depth of a decision tree of the type k for T, k = 1, . . . , 5.
• L(k)(T) denotes the minimum number of nodes realizable relative to T in a decision tree
of the type k for T, k = 1, . . . , 5.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], dynamic programming algorithms for the optimization of decision trees of
all five types relative to the depth and the number of realizable nodes were proposed (see also
journal extension [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] of these papers that considers additionally two cost functions: the number
of realizable terminal nodes and the number of nonterminal nodes). Note that algorithms for
the minimization of the depth and number of nodes for decision trees of the type 1 were
considered in [12] for decision tables with one-valued decisions and in [13] for decision tables
with many-valued decisions.
      </p>
      <p>Dynamic programming optimization algorithms are applicable to medium-sized decision
tables. These algorithms first construct a directed acyclic graph (DAG) whose nodes are some
subtables of the original decision table given by conditions of the type “attribute = value”. Then
they pass through all the nodes of the DAG, starting with the simplest subtables, and for each
subtable they find the minimum value of the considered cost function.</p>
      <p>
        In the present paper, we use algorithms proposed in [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] to study decision trees of all
five types optimal relative to the depth and relative to the number of realizable nodes for the
sorting problem. Results for decision trees of the type 1 were obtained earlier [12]. Results for
decision trees of the types 2–5 are new.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem of Sorting</title>
      <p>In this paper, we study the problem of sorting n elements. Let x1, . . . , xn be pairwise different
elements from a linearly ordered set. We should find a permutation (p1, . . . , pn) from the set
Pn of all permutations of the set {1, . . . , n} for which xp1 &lt; · · · &lt; xpn. To this end, we use
attributes xi : xj such that i, j ∈ {1, . . . , n}, i &lt; j, xi : xj = 1 if xi &lt; xj, and xi : xj = 0 if
xi &gt; xj.</p>
      <p>The problem of sorting n elements can be represented as a decision table Tn with n(n −
1)/2 conditional attributes xi : xj, i, j ∈ {1, . . . , n}, i &lt; j, and n! rows corresponding to
permutations from Pn. For each permutation (p1, . . . , pn), the corresponding row of Tn is
labeled with this permutation as the decision. This row is filled with values of attributes xi : xj
such that xi : xj = 1 if and only if i stays before j in the tuple (p1, . . . , pn).</p>
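      <p>As an illustration, the table Tn can be built directly from this description (a sketch in our own naming; the function sorting_table is not from the paper):</p>

```python
from itertools import combinations, permutations

def sorting_table(n):
    """Build the decision table T_n for sorting n elements:
    n(n-1)/2 attributes x_i : x_j with i < j, and n! rows, one per permutation."""
    attrs = list(combinations(range(1, n + 1), 2))  # pairs (i, j), i < j
    table = []
    for p in permutations(range(1, n + 1)):
        pos = {e: k for k, e in enumerate(p)}  # position of each element in p
        # x_i : x_j = 1 iff i stays before j in the tuple (p_1, ..., p_n)
        row = tuple(1 if pos[i] < pos[j] else 0 for (i, j) in attrs)
        table.append((row, p))
    return attrs, table
```

      <p>For n = 3 this gives 3 attributes and 3! = 6 pairwise different rows, matching the description above.</p>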
      <p>
        For n = 3, . . . , 6 and k = 1, . . . , 5, we find values of h(k)(Tn) and L(k)(Tn) using dynamic
programming algorithms described in [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] – see results in Tables 1 and 2.
      </p>
      <p>From the obtained experimental results it follows that the decision trees of the types 2–5 can
have smaller depth than the decision trees of the type 1. Decision trees of the types 3 and 5 can
have fewer realizable nodes than the decision trees of the type 1. Decision trees of the
types 2 and 4 have too many nodes.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>In this paper, we found the minimum depth and the minimum number of realizable nodes of
five types of decision trees for sorting n elements, n = 3, . . . , 6.</p>
      <p>In the future, we are planning to study the joint behavior of the depth and the number of nodes in
such decision trees. It would also be interesting to compare the complexity of optimal decision
trees of the considered five types constructed by dynamic programming algorithms with the
complexity of decision trees constructed by the entropy-based greedy algorithm proposed in [14].</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>Research reported in this publication was supported by King Abdullah University of Science
and Technology (KAUST). The authors are indebted to the anonymous reviewers for interesting
comments.</p>
      <p>[12] H. AbouEisha, T. Amin, I. Chikalov, S. Hussain, M. Moshkov, Extensions of Dynamic
Programming for Combinatorial Optimization and Data Mining, volume 146 of Intelligent
Systems Reference Library, Springer, 2019.
[13] F. Alsolami, M. Azad, I. Chikalov, M. Moshkov, Decision and Inhibitory Trees and Rules for
Decision Tables with Many-valued Decisions, volume 156 of Intelligent Systems Reference
Library, Springer, 2020.
[14] M. Azad, I. Chikalov, S. Hussain, M. Moshkov, Entropy-based greedy algorithm for decision
trees using hypotheses, Entropy 23 (2021) 808. URL: https://doi.org/10.3390/e23070808.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I. A.</given-names>
            <surname>Chegis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Yablonskii</surname>
          </string-name>
          ,
          <article-title>Logical methods of control of work of electric schemes</article-title>
          ,
          <source>Trudy Mat. Inst. Steklov</source>
          (in Russian)
          <volume>51</volume>
          (
          <year>1958</year>
          )
          <fpage>270</fpage>
          -
          <lpage>360</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pawlak</surname>
          </string-name>
          , Rough sets,
          <source>Int. J. Parallel Program</source>
          .
          <volume>11</volume>
          (
          <year>1982</year>
          )
          <fpage>341</fpage>
          -
          <lpage>356</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pawlak</surname>
          </string-name>
          ,
          <source>Rough Sets - Theoretical Aspects of Reasoning about Data</source>
          , volume
          <volume>9</volume>
          of Theory and Decision Library: Series D, Kluwer,
          <year>1991</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pawlak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Skowron</surname>
          </string-name>
          , Rudiments of rough sets,
          <source>Inf. Sci</source>
          .
          <volume>177</volume>
          (
          <year>2007</year>
          )
          <fpage>3</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angluin</surname>
          </string-name>
          ,
          <article-title>Queries and concept learning</article-title>
          ,
          <source>Mach. Learn</source>
          .
          <volume>2</volume>
          (
          <year>1988</year>
          )
          <fpage>319</fpage>
          -
          <lpage>342</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angluin</surname>
          </string-name>
          , Queries revisited,
          <source>Theor. Comput. Sci</source>
          .
          <volume>313</volume>
          (
          <year>2004</year>
          )
          <fpage>175</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angluin</surname>
          </string-name>
          ,
          <article-title>Learning regular sets from queries and counterexamples</article-title>
          ,
          <source>Inf. Comput</source>
          .
          <volume>75</volume>
          (
          <year>1987</year>
          )
          <fpage>87</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Valiant</surname>
          </string-name>
          ,
          <article-title>A theory of the learnable</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>27</volume>
          (
          <year>1984</year>
          )
          <fpage>1134</fpage>
          -
          <lpage>1142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chikalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moshkov</surname>
          </string-name>
          ,
          <article-title>Minimizing depth of decision trees with hypotheses (to appear)</article-title>
          ,
          <source>in: International Joint Conference on Rough Sets (IJCRS 2021), 19-24 September 2021, Bratislava, Slovakia</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chikalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moshkov</surname>
          </string-name>
          ,
          <article-title>Minimizing number of nodes in decision trees with hypotheses (to appear)</article-title>
          ,
          <source>in: 25th International Conference on Knowledge-Based and Intelligent Information &amp; Engineering Systems (KES 2021), 8-10 September 2021, Szczecin, Poland</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chikalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moshkov</surname>
          </string-name>
          ,
          <article-title>Optimization of decision trees with hypotheses for knowledge representation</article-title>
          ,
          <source>Electronics</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>1580</fpage>
          . URL: https://doi.org/10.3390/electronics10131580.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>