<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Sorting by Decision Trees with Hypotheses (extended abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohammad Azad</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Igor Chikalov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shahid Hussain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mikhail Moshkov</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Business Administration</institution>
          ,
          <addr-line>University Road, Karachi 75270</addr-line>
          ,
          <country country="PK">Pakistan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Intel Corporation</institution>
          ,
          <addr-line>5000 W Chandler Blvd, Chandler, AZ 85226</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Jouf University</institution>
          ,
          <addr-line>Sakaka 72441</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>King Abdullah University of Science and Technology (KAUST)</institution>
          ,
          <addr-line>Thuwal 23955-6900</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we consider decision trees that use both queries based on one attribute each and queries based on hypotheses about the values of all attributes. Such decision trees are similar to the ones studied in exact learning, where not only membership but also equivalence queries are allowed. For n = 3, . . . , 6, we compare decision trees based on various combinations of attributes and hypotheses for sorting n pairwise different elements from a linearly ordered set.</p>
      </abstract>
      <kwd-group>
        <kwd>decision tree</kwd>
        <kwd>hypothesis</kwd>
        <kwd>dynamic programming</kwd>
        <kwd>sorting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Decision trees are widely used in many areas of computer science, for example, test theory
(initiated by Chegis and Yablonskii [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), rough set theory (initiated by Pawlak [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]), and
exact learning (initiated by Angluin [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]). These theories are closely related: attributes from
rough set theory and test theory correspond to membership queries from exact learning. Exact
learning also studies equivalence queries. The notion of “minimally adequate teacher” using
both membership and equivalence queries was discussed by Angluin in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Relations between
exact learning and PAC learning proposed by Valiant [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] were considered in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ], we added the notion of a hypothesis (an analog of equivalence queries) to the
model considered in both rough set theory and test theory and proposed dynamic programming
algorithms for the optimization of the decision trees with hypotheses. Note that the dynamic
programming algorithms for the optimization of the conventional decision trees that do not use
hypotheses were proposed earlier [12].
      </p>
      <p>
        In the present paper, we consider an application of the dynamic programming algorithms
from [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] to the study of the problem of sorting. We compare the complexity of five types
of optimal (relative to the depth and relative to the number of realizable nodes) decision trees
based on various combinations of attributes and hypotheses for sorting n pairwise different
elements from a linearly ordered set, n = 3, . . . , 6. Results obtained for the conventional decision
trees are known – see the book [12]. Results obtained for the decision trees with hypotheses are
completely new.
      </p>
      <p>
        Note that in the present paper we follow [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] when discussing the notions related to the decision
trees with hypotheses. Complete definitions of these notions can be found in the same paper.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Five Types of Decision Trees and Their Optimization</title>
      <p>Let T be a decision table with n conditional attributes f1, . . . , fn that have values from the
set ω = {0, 1, 2, . . .}. Rows of this table are pairwise different and each row is labeled with a
decision. For a given row of T, we should recognize the decision attached to it. To this end,
we will use decision trees based on two types of queries. We can ask about the value of a
conditional attribute fi ∈ {f1, . . . , fn} on the given row. As a result, we obtain an answer of
the kind fi = δ, where δ is the number in the intersection of the given row and the column fi.
We can also ask if a hypothesis {f1 = δ1, . . . , fn = δn} is true, where the numbers δ1, . . . , δn
belong to the columns f1, . . . , fn, respectively. Either this hypothesis is confirmed or we obtain
a counterexample of the kind fi = σ, where fi ∈ {f1, . . . , fn} and σ is a number from the
column fi that is different from δi. We will say that this hypothesis is proper if (δ1, . . . , δn) is a
row of the table T.</p>
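      <p>For illustration, the two query types can be sketched as follows (a toy example of ours, not from the paper; the table contents and the function names attribute_query, hypothesis_query, and is_proper are hypothetical):</p>

```python
# A toy decision table: each row is (attribute-value tuple, decision).
rows = [((0, 0), "a"), ((0, 1), "b"), ((1, 0), "c")]

def attribute_query(hidden, i):
    """Answer the query about attribute f_i: return f_i = delta for the hidden row."""
    return hidden[i]

def hypothesis_query(hidden, hypothesis):
    """Check the hypothesis {f_1 = d_1, ..., f_n = d_n} against the hidden row.
    Return None if the hypothesis is confirmed, otherwise a counterexample
    (i, sigma) where sigma differs from the hypothesized value d_i."""
    for i, d in enumerate(hypothesis):
        if hidden[i] != d:
            return (i, hidden[i])
    return None

def is_proper(hypothesis):
    """A hypothesis is proper if (d_1, ..., d_n) is a row of the table."""
    return any(vals == hypothesis for vals, _ in rows)
```

      <p>Here a counterexample pins down one attribute value, so a refuted hypothesis still yields information, which is why trees of the types 2–5 can be shallower than trees of the type 1.</p>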
      <p>We study the following five types of decision trees:
1. Decision trees based on attributes only.
2. Decision trees based on hypotheses only.
3. Decision trees based on both attributes and hypotheses.
4. Decision trees based on proper hypotheses only.
5. Decision trees based on both attributes and proper hypotheses.</p>
      <p>As the time complexity of a decision tree we consider its depth, which is equal to the maximum
number of queries in a path from the root to a terminal node of the tree. As the space complexity
of a decision tree we consider the number of its nodes that are realizable relative to T. A node is
called realizable relative to T if the computation in the tree passes through this node for some
row and some choice of counterexamples. We use the following notation:
• h(k)(T) denotes the minimum depth of a decision tree of the type k for T, k = 1, . . . , 5.
• L(k)(T) denotes the minimum number of nodes realizable relative to T in a decision tree
of the type k for T, k = 1, . . . , 5.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], dynamic programming algorithms for the optimization of decision trees of
all five types relative to the depth and the number of realizable nodes were proposed (see also
journal extension [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] of these papers that considers additionally two cost functions: the number
of realizable terminal nodes and the number of nonterminal nodes). Note that algorithms for
the minimization of the depth and number of nodes for decision trees of the type 1 were
considered in [12] for decision tables with one-valued decisions and in [13] for decision tables
with many-valued decisions.
      </p>
      <p>Dynamic programming optimization algorithms are applicable to medium-sized decision
tables. These algorithms first construct a directed acyclic graph (DAG) whose nodes are some
subtables of the original decision table given by conditions of the type “attribute = value”. Then
they pass through all the nodes of the DAG, starting with the simplest subtables, and for each
subtable they find the minimum value of the considered cost function.</p>
      <p>
        In the present paper, we use algorithms proposed in [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] to study decision trees of all
five types optimal relative to the depth and relative to the number of realizable nodes for the
sorting problem. Results for decision trees of the type 1 were obtained earlier [12]. Results for
decision trees of the types 2–5 are new.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem of Sorting</title>
      <p>In this paper, we study the problem of sorting n elements. Let x1, . . . , xn be pairwise different
elements from a linearly ordered set. We should find a permutation (p1, . . . , pn) from the set
Pn of all permutations of the set {1, . . . , n} for which xp1 &lt; · · · &lt; xpn. To this end, we use
attributes xi : xj such that i, j ∈ {1, . . . , n}, i &lt; j, xi : xj = 1 if xi &lt; xj, and xi : xj = 0 if
xi &gt; xj.</p>
      <p>The problem of sorting n elements can be represented as a decision table Tn with n(n −
1)/2 conditional attributes xi : xj, i, j ∈ {1, . . . , n}, i &lt; j, and n! rows corresponding to
permutations from Pn. For each permutation (p1, . . . , pn), the corresponding row of Tn is
labeled with this permutation as the decision. This row is filled with values of attributes xi : xj
such that xi : xj = 1 if and only if i stays before j in the tuple (p1, . . . , pn).</p>
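      <p>As an illustration, the table Tn can be built directly from this description (a sketch in our own naming; the function sorting_table is not from the paper):</p>

```python
from itertools import combinations, permutations

def sorting_table(n):
    """Build the decision table T_n for sorting n elements:
    n(n-1)/2 attributes x_i : x_j with i < j, and n! rows, one per permutation."""
    attrs = list(combinations(range(1, n + 1), 2))  # pairs (i, j), i < j
    table = []
    for p in permutations(range(1, n + 1)):
        pos = {e: k for k, e in enumerate(p)}  # position of each element in p
        # x_i : x_j = 1 iff i stays before j in the tuple (p_1, ..., p_n)
        row = tuple(1 if pos[i] < pos[j] else 0 for (i, j) in attrs)
        table.append((row, p))
    return attrs, table
```

      <p>For n = 3 this gives 3 attributes and 3! = 6 pairwise different rows, matching the description above.</p>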
      <p>
        For n = 3, . . . , 6 and k = 1, . . . , 5, we find values of h(k)(Tn) and L(k)(Tn) using dynamic
programming algorithms described in [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] – see results in Tables 1 and 2.
      </p>
      <p>From the obtained experimental results it follows that the decision trees of the types 2–5 can
have smaller depth than the decision trees of the type 1. Decision trees of the types 3 and 5 can
have fewer realizable nodes than the decision trees of the type 1. Decision trees of the
types 2 and 4 have too many nodes.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>In this paper, we found the minimum depth and the minimum number of realizable nodes of
five types of decision trees for sorting n elements, n = 3, . . . , 6.</p>
      <p>In the future, we are planning to study the joint behavior of the depth and the number of nodes in
such decision trees. It would also be interesting to compare the complexity of optimal decision
trees of the considered five types constructed by dynamic programming algorithms with the
complexity of decision trees constructed by the entropy-based greedy algorithm proposed in [14].</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>Research reported in this publication was supported by King Abdullah University of Science
and Technology (KAUST). The authors are indebted to the anonymous reviewers for interesting
comments.</p>
      <p>[12] H. AbouEisha, T. Amin, I. Chikalov, S. Hussain, M. Moshkov, Extensions of Dynamic
Programming for Combinatorial Optimization and Data Mining, volume 146 of Intelligent
Systems Reference Library, Springer, 2019.
[13] F. Alsolami, M. Azad, I. Chikalov, M. Moshkov, Decision and Inhibitory Trees and Rules for
Decision Tables with Many-valued Decisions, volume 156 of Intelligent Systems Reference
Library, Springer, 2020.
[14] M. Azad, I. Chikalov, S. Hussain, M. Moshkov, Entropy-based greedy algorithm for decision
trees using hypotheses, Entropy 23 (2021) 808. URL: https://doi.org/10.3390/e23070808.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I. A.</given-names>
            <surname>Chegis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Yablonskii</surname>
          </string-name>
          ,
          <article-title>Logical methods of control of work of electric schemes</article-title>
          ,
          <source>Trudy Mat. Inst. Steklov</source>
          (in Russian)
          <volume>51</volume>
          (
          <year>1958</year>
          )
          <fpage>270</fpage>
          -
          <lpage>360</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pawlak</surname>
          </string-name>
          , Rough sets,
          <source>Int. J. Parallel Program</source>
          .
          <volume>11</volume>
          (
          <year>1982</year>
          )
          <fpage>341</fpage>
          -
          <lpage>356</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pawlak</surname>
          </string-name>
          ,
          <source>Rough Sets - Theoretical Aspects of Reasoning about Data</source>
          , volume
          <volume>9</volume>
          of Theory and Decision Library: Series D, Kluwer,
          <year>1991</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pawlak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Skowron</surname>
          </string-name>
          , Rudiments of rough sets,
          <source>Inf. Sci</source>
          .
          <volume>177</volume>
          (
          <year>2007</year>
          )
          <fpage>3</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angluin</surname>
          </string-name>
          ,
          <article-title>Queries and concept learning</article-title>
          ,
          <source>Mach. Learn</source>
          .
          <volume>2</volume>
          (
          <year>1988</year>
          )
          <fpage>319</fpage>
          -
          <lpage>342</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angluin</surname>
          </string-name>
          , Queries revisited,
          <source>Theor. Comput. Sci</source>
          .
          <volume>313</volume>
          (
          <year>2004</year>
          )
          <fpage>175</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angluin</surname>
          </string-name>
          ,
          <article-title>Learning regular sets from queries and counterexamples</article-title>
          ,
          <source>Inf. Comput</source>
          .
          <volume>75</volume>
          (
          <year>1987</year>
          )
          <fpage>87</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Valiant</surname>
          </string-name>
          ,
          <article-title>A theory of the learnable</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>27</volume>
          (
          <year>1984</year>
          )
          <fpage>1134</fpage>
          -
          <lpage>1142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chikalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moshkov</surname>
          </string-name>
          ,
          <article-title>Minimizing depth of decision trees with hypotheses (to appear)</article-title>
          ,
          <source>in: International Joint Conference on Rough Sets (IJCRS 2021), 19-24 September 2021, Bratislava, Slovakia</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chikalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moshkov</surname>
          </string-name>
          ,
          <article-title>Minimizing number of nodes in decision trees with hypotheses (to appear)</article-title>
          ,
          <source>in: 25th International Conference on Knowledge-Based and Intelligent Information &amp; Engineering Systems (KES 2021), 8-10 September 2021, Szczecin, Poland</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chikalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moshkov</surname>
          </string-name>
          ,
          <article-title>Optimization of decision trees with hypotheses for knowledge representation</article-title>
          ,
          <source>Electronics</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>1580</fpage>
          . URL: https://doi.org/10.3390/electronics10131580.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>