<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Safety Augmentation in Decision Trees</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sumanta Dey</string-name>
          <email>sumanta.dey@iitkgp.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pallab Dasgupta</string-name>
          <email>pallab@cse.iitkgp.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Briti Gangopadhyay</string-name>
          <email>briti_gangopadhyay@iitkgp.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept of Computer Science and Engineering, IIT Kharagpur, India</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Decision trees are among the most popular representations used in statistical machine learning. This paper studies the problem of incorporating domain knowledge into the learning of decision trees, specifically, when the knowledge is given in the form of a set of known decision cases, captured using assertions. The ability to ensure that the decision making never violates a known set of safety assertions is essential for the use of decision tree learning in safety-critical domains. A pure machine learning approach does not guarantee safety, since the learning can be impaired by the lack of adequate data on possible failure scenarios. To the best of our knowledge, this is the first work on formal safety augmentation in decision trees with provable guarantees.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Decision Tree Learning is among the most widely used and
practical methods for supervised learning for both
classification and regression tasks [Breiman, 2017; Song and Ying,
2015]. Decisions based on decision trees are inherently
explainable and statistically supported by the data used to learn
the tree. The primary goal of a decision tree learning
algorithm is to construct an optimal decision tree, where the
quality is defined in terms of the depth of the tree (which
is an upper-bound on the number of variables to be
examined before reaching a decision) and the number of nodes in
the tree (which reflects the average effort in reaching a
decision). Finding an optimal decision tree is an NP-Complete
problem [Laurent and Rivest, 1976; Murthy, 1998]. There
are several greedy algorithms for learning decision trees,
including ID3, C4.5, C5.0 [Quinlan, 1986; Quinlan, 2014;
Quinlan, 2007]. Greedy metrics, such as Information Gain,
are used to order the nodes in the decision tree during the
construction. For example, the decision tree developed using
ID3 for the dataset of Table 1 is shown in Figure 1a. This
decision tree decides whether a mushroom with given
features like the CapShape, the CapColor, and the GillColor
is poisonous or edible, based on the information gained from
the given dataset. As can be seen in Figure 1a, the decision
is taken by examining only the features CapColor and
CapShape. The feature GillColor is ignored by the decision tree,
and this can be statistically justified from the dataset.</p>
      <p>For safety-critical decisions, relying on a decision tree
constructed using statistical machine learning can be risky,
because the dataset will typically not contain samples covering
all possible combinations of features. The missing
combinations may either not exist in the real world or may be so
rare that they do not show up in the dataset. This is often
true when the number of features and their value domains are
large. It is, therefore, possible that the decision tree will make
a wrong decision because it has no supervised precedent for
unobserved combinations of feature values. For example, we
may know from scientific knowledge that mushrooms
having green GillColor and a bell-like CapShape are definitely
poisonous, though there is no support for this knowledge in
the data. Consequently, the decision tree of Figure 1a
recommends as edible all mushrooms with yellow CapColor, ignoring the
features CapShape and GillColor, possibly leading to
disastrous consequences. The decision tree also declares as edible
all mushrooms with white CapColor and bell-like CapShape,
whereas the above scientific knowledge should be used to
eliminate the ones with green GillColor from this set.
Safety requirements gleaned from (non-statistical) domain
knowledge can be provided explicitly in the form of
assertions, a practice which is widely used in many safety-critical
domains [SVA, 2018; Peled et al., 1999]. For example, the
knowledge on mushrooms with green GillColor and bell-like
CapShape can be expressed as the following propositional
assertion:
(CapShape = bell ∧ GillColor = green) ⇒ Poisonous
This paper studies the augmentation of decision trees with
such safety assertions. We show that naive post-facto
processing of a decision tree with the given assertions adversely
affects the size of the decision tree. For example, such an
augmentation with the above assertion leads to the decision
tree of Figure 1b, where the average number of variables to
be examined before reaching a decision is significantly more
than that of Figure 1a.</p>
      <p>We present a methodology for biasing the information gain
metric based on the safety assertions to compensate for the
missing information in the dataset. This leads to considerable
improvement in the safety-augmentation process. The
modified decision tree using our proposed methodology is shown
in Figure 2. The average number of variables to be examined
before reaching a decision in this tree is significantly better
than that of Figure 1b.</p>
      <p>The paper is organized as follows. Section 2 outlines
the post-facto safety augmentation methodology. Section 3
presents the proposed integrated safety augmentation
approach, and Section 4 outlines multi-assertion extensions.
Experimental results on benchmark datasets are presented in
Section 5. Related work and concluding remarks are given in
Sections 6 and 7, respectively.</p>
    </sec>
    <sec id="sec-2">
      <title>Post-Facto Safety Augmentation</title>
      <p>This section examines the merit of considering the safety
assertions after the decision tree has been constructed. We
consider the following two options:
1. Analyzing safety assertions before using the decision
tree. This is typically not a good option in practice.
Safety-critical scenarios often have specific attributes
which need not be examined if the decision is safe
anyway. For example, suppose a safety property specifies
that: sulfa drugs should not be administered unless a
sensitivity test rules out allergic reaction. It does not
make sense to perform a sensitivity test for allergic
reactions to sulfa drugs before the decision tree recommends
such a drug. In other words, examining all the predicates
in the antecedent of an assertion is redundant in those
paths of the decision tree where the decision is safe
anyway.
2. Analyzing safety assertions after using the decision tree.</p>
      <p>We may analyze the decision recommended by the
decision tree against the safety properties. For example,
the decision tree of Figure 1a recommends eating
mushrooms having white CapColor and bell-like
CapShape, whereas the safety assertion tells us that
mushrooms with bell-like CapShape and green GillColor are
poisonous. We must therefore not go by the decision
recommended by the decision tree, but examine the
GillColor. This post-facto safety augmentation is shown in
Figure 1b.</p>
      <p>For a leaf node, v, of the decision tree, let π(v) denote the
conjunction of predicates leading to v, and let dec(v) denote
the decision recommended at v. For example, in Figure 1a, if
v is the second leaf node from the left, then dec(v) = edible
and:</p>
      <p>π(v) = (CapColor = white) ∧ (CapShape = bell)
For an assertion, φ, let ant(φ) denote the antecedent of φ.
For example, for the assertion, φ:
(CapShape = bell ∧ GillColor = green) ⇒ Poisonous
ant(φ) = (CapShape = bell ∧ GillColor = green)
Lemma 2.1 Safety augmentation is necessary in a leaf node
v of a decision tree iff π(v) ∧ ant(φ) is satisfiable, but dec(v)
refutes the consequent of φ.</p>
      <p>Proof: If π(v) ∧ ant(φ) is unsatisfiable, then no
safety augmentation is needed at v because the antecedent of
φ does not match on this path of the decision tree. If dec(v)
satisfies the consequent of φ, then no safety augmentation
is needed at v because the decision tree recommends a safe
decision anyway. Therefore it suffices to consider only the
cases where π(v) ∧ ant(φ) is satisfiable, but dec(v) refutes
the consequent of φ. The necessity is obvious, since dec(v)
refutes the consequent of φ, which is a wrong decision
for the cases satisfying π(v) ∧ ant(φ).</p>
      <p>The methodology for post-facto safety augmentation in the
necessary leaf nodes identified by Lemma 2.1 is as follows:
Let η(φ) denote the set of variables referenced in
ant(φ) but not present in π(v). For example, for
the rightmost leaf node of Figure 1a, η(φ) =
{CapShape, GillColor}.</p>
      <p>A sequence of nodes corresponding to the variables in
η(φ) replaces the leaf node. This sequence is shown in
red in Figure 1b.</p>
      <p>Decisions corresponding to the remaining branches of
the new subtrees are recomputed using the decision tree
learning algorithm, as sketched below.</p>
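      <p>A sketch of this replacement step follows, encoding trees as nested dicts; the helper names, the assumed GillColor domain, and the RELEARN marker (standing for rerunning the learner on a branch) are ours:</p>
      <preformat><![CDATA[
def eta(antecedent, path):
    """eta(phi): variables referenced in ant(phi) but not present in pi(v)."""
    return [a for a in antecedent if a not in path]

def split_leaf(safe_decision, antecedent, path, domains):
    """Replace an unsafe leaf by a chain of nodes over eta(phi).

    Branches that keep ant(phi) satisfiable lead to the safe decision;
    the remaining branches are marked for relearning on the original samples.
    Trees are encoded as nested dicts {attribute: {value: subtree}}.
    """
    missing = eta(antecedent, path)
    if not missing:                  # every predicate of ant(phi) holds here,
        return safe_decision         # so the decision itself is corrected
    attr = missing[0]
    node = {attr: {}}
    for value in domains[attr]:
        if value == antecedent[attr]:    # still on the ant(phi)-matching path
            node[attr][value] = split_leaf(safe_decision, antecedent,
                                           {**path, attr: value}, domains)
        else:                            # ant(phi) can no longer match here
            node[attr][value] = "RELEARN"
    return node

# Unsafe leaf of Figure 1a, with an assumed domain for GillColor:
print(split_leaf("poisonous", {"CapShape": "bell", "GillColor": "green"},
                 {"CapColor": "white", "CapShape": "bell"},
                 {"GillColor": ["green", "brown", "white"]}))
]]></preformat>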
      <p>The need for recomputing the decisions in the new branches
of the decision tree arises when the decision taken at the
original node was due to a statistical majority and not due to
unanimity. In order to avoid overfitting, decision tree
learning algorithms can take a specific decision at a node when the
samples in favor of that decision are in an overwhelming
majority. The safety augmentation splits these samples into the new
branches of the decision tree, and in some of the branches,
the overwhelming majority may not hold anymore. Hence it
is necessary to run the decision tree learning algorithm on the
new branches.</p>
      <p>Figure 1: (a) Decision tree from Table 1. (b) Decision tree after post-facto safety augmentation.</p>
    </sec>
    <sec id="sec-3">
      <title>Integrated Safety Augmentation</title>
      <p>The post-facto safety augmentation approaches of Section 2
do not consider the information gain resulting from the
safety assertions and may, therefore, produce safe decision
trees which are suboptimal. For example, the decision tree
shown in Figure 2 is also a safe decision tree, but it uses fewer
variables in the rightmost branch as compared to Figure 1b.
Specifically, the tree of Figure 2 takes advantage of the fact
that GillColor is sufficient to separate the edible and
poisonous mushrooms when CapColor is yellow. Such choices
can be made when the safety augmentation is integrated with
the learning algorithm of the decision tree.</p>
      <p>We elucidate the integrated safety augmentation approach
by considering the ID3 algorithm [Quinlan, 1986]. Given a
set of training samples that are individually tagged with the
decision, the key step of the algorithm is to choose the
variable at the root of the decision tree based on an information
gain metric. Once the root is chosen, the algorithm is
recursively applied on each branch of the root corresponding to the
various values that the root variable can take. Formally, the
information gain, IG(a|t), of attribute a at the end of branch
t is:</p>
      <p>IG(a|t) = E(t) − ∑_{j=1}^{l} (N(t ∧ a[v_j]) / N(t)) · E(t ∧ a[v_j])    (1)
where N(t ∧ a[v_j]) is the count of all decision variable classes
after branch t and attribute a's branch a[v_j]. Here a[v_j]
is the jth value in attribute a's domain. The entropy is:
E(t) = − ∑_{i=1}^{k} p(i|t) log₂ p(i|t)    (2)
where k is the number of different decision variable classes
and p(i|t) is the probability of decision variable class i after
branch t, and therefore:
p(i|t) = N(i|t) / N(t)    (3)
where N(i|t) is the count of decision variable class i in the given
dataset after branch t, and N(t) is the count of all decision
variable classes after branch t.</p>
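      <p>For reference, a minimal sketch of these formulas in Python; the function names and the dict-based sample encoding are our choices:</p>
      <preformat><![CDATA[
import math
from collections import Counter

def entropy(labels):
    """E(t): entropy of the decision labels of the samples reaching branch t."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """IG(a|t): entropy reduction from splitting the samples at t on attribute attr.

    rows   -- list of dicts mapping attribute name -> value
    labels -- the decision value of each row
    """
    base, n, weighted = entropy(labels), len(labels), 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        # (N(t ^ a[v_j]) / N(t)) * E(t ^ a[v_j])
        weighted += (len(subset) / n) * entropy(subset)
    return base - weighted
]]></preformat>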
      <p>At the root of the decision tree, branch t is empty. The
ID3 algorithm calculates the information gain for all the
attributes using the above formulae and selects the attribute
having the highest information gain as the root node. Each
value of the chosen attribute becomes a branch of the root
node. The training dataset is then divided corresponding to
the decision tree branches. The same procedure is then
recursively applied on each of the branches with the remaining
attributes and data items, until all the data items have the
same decision value (or we choose to prune the tree at that
branch by taking the majority decision), at which point we add a leaf
node with the decision value.</p>
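      <p>A compact recursive sketch of this procedure, reusing info_gain from the block above; the nested-dict tree encoding is our illustration:</p>
      <preformat><![CDATA[
from collections import Counter

def id3(rows, labels, attrs):
    """Return a decision tree as nested dicts {attribute: {value: subtree}};
    a leaf is just the decision value."""
    counts = Counter(labels)
    if len(counts) == 1 or not attrs:        # pure node, or no attributes left:
        return counts.most_common(1)[0][0]   # prune with the majority decision
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = id3([rows[i] for i in idx],
                                [labels[i] for i in idx],
                                [a for a in attrs if a != best])
    return tree
]]></preformat>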
      <p>The reason why a decision tree constructed from a given data
set violates a safety assertion is that samples corresponding to
the counter-examples were not present in the data. We may
address this issue by augmenting the data set with additional
samples that are relevant to the assertion. For example,
consider the mushroom data set of Table 1 and the assertion, φ:
(CapShape = bell ∧ GillColor = green) ⇒ Poisonous
The corresponding rows are ⟨bell, white, green, poisonous⟩
and ⟨bell, yellow, green, poisonous⟩.</p>
      <p>Adding these rows to the data set ensures that the
decision tree learned using ID3 is safe with respect to the
assertion, φ. Formally, let A = {A₁, …, Aₙ} denote the set
of attributes in the data set, and Q ⊆ A denote the set of
attributes referred to in ant(φ). Let dom(Aᵢ) denote the set of
values of attribute Aᵢ. The safety augmentation data
corresponding to φ is defined as follows:
Aug(φ) = {⟨v₁, …, vₙ, v_d⟩ | ∀i, vᵢ ∈ dom(Aᵢ) and
⟨v₁, …, vₙ⟩ satisfies ant(φ) and the decision
variable v_d satisfies the consequent of φ}</p>
      <p>Lemma 3.1 The decision tree constructed by the ID3
algorithm on the data set augmented with Aug(φ) is safe with
respect to assertion φ.</p>
      <p>Proof sketch: Let us assume the contrary. Then there
exists a leaf node v in the decision tree such that dec(v) refutes the
consequent of φ and π(v) ∧ ant(φ) is satisfiable. But, in
the augmented data set, by the definition of Aug(φ), at least
50% of the samples at v have a decision which does not agree with
dec(v). Therefore, ID3 can never prune the tree at v with a
decision which disagrees with 50% of the samples.</p>
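      <p>The definition of Aug(φ) translates directly into an enumeration; a minimal sketch follows, in which the toy domains are assumptions for illustration, not the full domains of Table 1:</p>
      <preformat><![CDATA[
from itertools import product

def aug(domains, antecedent, safe_decision):
    """Aug(phi): one row per attribute-value combination satisfying ant(phi),
    tagged with the decision that satisfies the consequent of phi.

    domains    -- dict attribute -> list of values, i.e. dom(A_i)
    antecedent -- dict attribute -> required value (conjunction of equalities)
    """
    attrs = list(domains)
    rows = []
    for values in product(*(domains[a] for a in attrs)):
        row = dict(zip(attrs, values))
        if all(row[a] == v for a, v in antecedent.items()):
            rows.append({**row, "decision": safe_decision})
    return rows

# Mushroom example: the two augmented rows differ only in CapColor
domains = {"CapShape": ["bell", "flat"],
           "CapColor": ["white", "yellow"],
           "GillColor": ["green", "brown"]}
print(aug(domains, {"CapShape": "bell", "GillColor": "green"}, "poisonous"))
]]></preformat>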
      <sec id="sec-3-1">
        <title>The Safety Augmentation Methodology</title>
        <p>The proposed safety augmentation methodology does not
actually add Aug(φ) explicitly to the data set. Essentially, the
ID3 algorithm chooses the attribute corresponding to each
node in the decision tree using the information gain
metric, and this metric only requires the count of the samples
having a specific decision for each value in the domain of
an attribute. The proposed methodology is implemented by
manipulating these counts based on our understanding of
Aug(φ). Formally, we replace Equation 3 with the
following:
p(i|t) = N(i|t) / N(t)    (4)
where N(i|t) is the count of decision variable class i after
branch t and N(t) is the count of all decision variable
classes after branch t. As we are also considering the
counts from Aug(φ), we have:</p>
        <p>N(i|t) = R(i|t) + S(i|t)    (5)
where R(i|t) is the count of decision variable class i in the
input dataset after branch t, and S(i|t) is the count of
decision variable class i corresponding to φ after branch
t.</p>
        <p>The modified information gain is calculated thereafter in
the standard way:
IG(a|t) = E(t) − ∑_{j=1}^{l} (N(t ∧ a[v_j]) / N(t)) · E(t ∧ a[v_j])</p>
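        <p>A sketch of the count manipulation, combining dataset counts R(i|t) with assertion-derived counts S(i|t); the function names are ours, and aug_rows is the output of an enumeration such as the aug sketch above:</p>
        <preformat><![CDATA[
import math
from collections import Counter

def biased_counts(rows, labels, branch, aug_rows):
    """N(i|t) = R(i|t) + S(i|t): dataset counts plus assertion-derived counts
    for the samples consistent with the predicates on branch t.

    branch -- dict attribute -> value, fixing the path taken so far
    """
    match = lambda row: all(row.get(a) == v for a, v in branch.items())
    counts = Counter(lab for row, lab in zip(rows, labels) if match(row))  # R(i|t)
    counts.update(row["decision"] for row in aug_rows if match(row))       # S(i|t)
    return counts

def biased_entropy(counts):
    """E(t) computed with the manipulated probabilities p(i|t) = N(i|t)/N(t)."""
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
]]></preformat>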
        <p>Post-facto safety augmentation does not affect the relative
positions of the attributes in the decision tree, since the method
splits those leaf nodes where the decision contradicts the
safety requirement, while the other nodes in the tree are not
disturbed. The integrated safety augmentation method
produces a better-optimized decision tree by incorporating the safety
assertions into the information gain metric and then allowing the ID3
algorithm to proceed with the modified information gain. In
this section, we examine whether this augmentation introduces
a bias in the input, thereby affecting the choice among the
other attributes in the system.</p>
        <p>Let us return to our safety assertion, φ:
(CapShape = bell ∧ GillColor = green) ⇒ Poisonous
As explained in Section 3.1, the proposed integrated method
will affect the counts corresponding to the following cases:</p>
        <table-wrap id="tab-aug">
          <table>
            <thead>
              <tr><th>CapShape</th><th>CapColor</th><th>GillColor</th><th>Decision</th></tr>
            </thead>
            <tbody>
              <tr><td>Bell</td><td>White</td><td>Green</td><td>Poisonous</td></tr>
              <tr><td>Bell</td><td>Yellow</td><td>Green</td><td>Poisonous</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The augmentation methodology inherently assumes that
these types of mushrooms, which are not represented in the
data, are actually possible. This need not be true: in
nature, we may not have a mushroom of yellow CapColor
which has a bell-like CapShape and green GillColor, as
required by ant(φ). Formally, safety augmentation assumes
that the attributes referenced in ant(φ) are independent of
each other and that all combinations of values of these
attributes can occur.</p>
        <p>Figure 3: Information gain for the attributes C and D of the toy example, (a) without considering and (b) considering the support dataset.</p>
        <p>Such an assumption may create a bias in the input to the
decision tree learning algorithm. This is explained with the
case shown in Figure 3. We consider the information gain
of attributes C and D in a setting where C, D, and the
decision variable are all Boolean. Figure 3a shows the case
when no safety augmentation has been done. In this case, at
a branch of the decision tree, we have 10 samples where the
decision is True and 8 samples where the decision is False.</p>
        <p>Figure 3a shows the pattern in which the attributes C and
D split the samples with the decisions True and False.
Accordingly, IG(C) and IG(D) are computed as shown, and
therefore, ID3 prefers D over C as the next attribute.</p>
        <p>Figure 3b shows the case when safety augmentation is used
to modify the counts. In this case, the safe decision is False,
hence only the counts of samples with decision False are
affected by the augmentation. The original ratio of (10 : 8)
has now become (10 : 12). Figure 3b shows the ratios of
True/False decisions when split by the attributes C and D.</p>
        <p>Accordingly, IG(C) and IG(D) are computed as shown, and
therefore, now ID3 prefers C over D as the next attribute.</p>
        <p>The important thing to note here is that such an inversion
of priority between attributes like C and D can happen even
when they do not belong to the antecedent of an assertion, ψ.</p>
        <p>For any attribute A ∉ ant(ψ), the count of the samples
corresponding to the safe decision will be equal on all branches
leading out of a node for A. In the specific case outlined
above, C and D were not in ant(ψ), and therefore the
increase in the counts of the False decision in both the left and
right branches of these nodes is equal, namely, 2. Yet the
priority inversion takes place, indicating the introduction of
bias.</p>
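        <p>The inversion can be reproduced numerically. Figure 3's exact per-branch counts are not given in the text, so the following sketch uses assumed splits of the 10 True / 8 False samples that exhibit the same effect when 2 False samples are added to every branch:</p>
        <preformat><![CDATA[
import math

def H(t, f):
    """Entropy of a branch holding t True and f False samples."""
    n = t + f
    return -sum((c / n) * math.log2(c / n) for c in (t, f) if c)

def ig(splits):
    """Information gain of a Boolean attribute, splits = [(t_left, f_left), (t_right, f_right)]."""
    T, F = sum(s[0] for s in splits), sum(s[1] for s in splits)
    n = T + F
    return H(T, F) - sum(((t + f) / n) * H(t, f) for t, f in splits)

C = [(0, 3), (10, 5)]   # assumed splits of the 10 True / 8 False samples
D = [(7, 1), (3, 7)]
print(ig(C) < ig(D))    # True: ID3 prefers D before augmentation

# Safety augmentation adds 2 False samples to each branch (10:8 -> 10:12)
C_aug = [(t, f + 2) for t, f in C]
D_aug = [(t, f + 2) for t, f in D]
print(ig(C_aug) > ig(D_aug))  # True: ID3 now prefers C, the priority inversion
]]></preformat>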
        <p>We believe that enforcing safety always comes at the
expense of bias. In reality, it is often not possible to
envisage all practicable ways in which a property can fail. Some
combinations of values for the variables in the antecedent of
an assertion may not be possible, but this is not known to
us. Therefore, the safety augmentation must assume that all
combinations are possible, even at the expense of the bias it
introduces. It is theoretically possible to examine ways to
minimize the effects of this bias on the construction of the
decision tree, but that is beyond the scope of this paper.
</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Multi-Assertion Safety Augmentation</title>
      <p>In general, safety may be defined in terms of multiple
assertions. Moreover, assertions may be causal, diagnostic, or a
combination of both. In this section, we discuss these variants
and show that the principle of safety augmentation presented
in the previous section can be applied to all such cases.</p>
      <p>Causal versus diagnostic assertions. It is well known
in the literature that, in logic, causal and diagnostic rules
are inter-convertible. For example, the causal rule
Cavity ⇒ Toothache can be rewritten as the diagnostic
rule ¬Toothache ⇒ ¬Cavity. We have presented
the safety augmentation methodology with assertions in
causal form, where the consequent is a predicate over
the decision variable. If an assertion is given in another
form, it can be rewritten in the causal form and then used
in our methodology.</p>
      <p>Assertions with the same consequent. Given two
assertions of the form α₁ ⇒ β and α₂ ⇒ β, we can perform
the safety augmentation in any order, or together by
augmenting the counts with the ways of satisfying α₁ ∨ α₂.</p>
      <p>Assertions with different consequents. Given two
assertions of the form α₁ ⇒ β₁ and α₂ ⇒ β₂, where
β₁ ≠ β₂, we can perform the safety augmentation in
any order. In this case we additionally need to ensure
that α₁ ∧ α₂ is unsatisfiable, otherwise the safety
assertions contradict each other. This check is done using a standard
SAT engine.</p>
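      <p>For antecedents that are conjunctions of equalities, this unsatisfiability check needs no general SAT engine; a minimal sketch follows (a SAT call would replace it for richer assertion languages, and the second assertion below is hypothetical):</p>
      <preformat><![CDATA[
def antecedents_disjoint(ant1, ant2):
    """True iff ant1 ^ ant2 is unsatisfiable, i.e. some attribute is forced
    to two different values (ant1, ant2: dicts attribute -> value)."""
    return any(a in ant2 and ant2[a] != v for a, v in ant1.items())

a1 = {"CapShape": "bell", "GillColor": "green"}  # ... => poisonous
a2 = {"CapShape": "bell", "GillColor": "brown"}  # ... => edible (hypothetical)
print(antecedents_disjoint(a1, a2))  # True: the two assertions cannot clash
]]></preformat>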
      <p>In general, we may also have assertions of the form α ⇒ β,
where neither α nor β contains the decision variable. Such
assertions convey additional information about the variables,
which may not be explicit in the data. Such assertions can
also influence the counts during safety augmentation. For
example, suppose the safety assertion states:
(CapShape = bell ∧ GillColor = green) ⇒ Poisonous
In the absence of any other information, as discussed earlier,
we augment two cases, corresponding to CapColor values White
and Yellow. Now suppose we are given the assertion:</p>
      <p>(GillColor = Green) ⇒ (CapColor = Pink)
This assertion does not provide any information regarding
the safety of mushrooms, but it tells us that mushrooms with
green GillColor always have pink CapColor. With this
information, the augmentation count drops from 2 to 0. The
safety augmentation methodology computes the
augmentation counts accordingly.</p>
    </sec>
    <sec id="sec-5">
      <title>Experimental Results</title>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption><p>Safety assertions used for the benchmark datasets.</p></caption>
        <table>
          <thead>
            <tr><th>Dataset</th><th>Assertion</th></tr>
          </thead>
          <tbody>
            <tr><td>Breast Cancer</td><td>(Age = 30-39 ∧ Tumor-Size = 30-34 ∧ Irradiation = Yes) ⇒ Recurrence-Events</td></tr>
            <tr><td>Mushroom</td><td>(Cap-Shape = Bell ∧ Gill-Color = Green) ⇒ Poisonous</td></tr>
            <tr><td>Mushroom</td><td>(Stalk-color-above-ring = Bell ∧ Stalk-color-below-ring = Green) ⇒ Poisonous</td></tr>
            <tr><td>Nursery</td><td>(Student-Health = Not-Recommended) ⇒ Not-Recommended</td></tr>
            <tr><td>Tic-Tac-Toe</td><td>(top-left-square = o ∧ top-middle-square = o ∧ top-right-square = o) ⇒ x-losses</td></tr>
            <tr><td>Tic-Tac-Toe</td><td>(middle-left-square = o ∧ middle-middle-square = o ∧ middle-right-square = o) ⇒ x-losses</td></tr>
            <tr><td>Tic-Tac-Toe</td><td>(bottom-left-square = o ∧ bottom-middle-square = o ∧ bottom-right-square = o) ⇒ x-losses</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>We provide experimental results on four benchmark datasets
from the UCI Machine Learning Repository [Dua and Graff,
2017], namely Breast Cancer [Zwitter and Soklic, 1988],
Mushroom, Nursery, and Tic-Tac-Toe. The assertions,
shown in Table 2, were designed based on relevance and have
varying degrees of support in the datasets.</p>
      <p>Table 3 compares the depth and the total number of nodes
in the decision trees created without safety augmentation with
those created with post-facto and integrated safety
augmentation. Also shown are the runtimes for creating the
decision trees without safety augmentation, the additional time
for post-facto safety augmentation, and the time for creating the
decision trees with integrated safety augmentation.</p>
      <p>The effect of safety augmentation on the decision trees
depends on the missing support for the safety assertions in
the datasets. For example, the safety property for the
Nursery dataset has most of its support in the original
dataset and therefore requires negligible augmentation. On
the other hand, support for the assertions over the Mushroom
dataset requires considerable augmentation, resulting in
significant changes due to safety augmentation. Not
surprisingly, integrated safety augmentation produces decision trees
with fewer nodes and smaller depth as compared to post-facto
safety augmentation.</p>
      <p>The runtimes indicate that the overhead of safety
augmentation is marginal in most cases.
</p>
    </sec>
    <sec id="sec-5">
      <title>Related Work</title>
      <p>Decision trees and similar structures are already in use in
safety-critical domains. These include the Culpability Tree [Reason,
2016] for dealing with staff involved in safety errors in the
aviation industry, and the Incident Decision Tree [Meadows et al.,
2005] for helping National Health Service (NHS) managers
in the United Kingdom determine a fair and consistent
course of action towards staff involved in patient safety
incidents. All these trees are either manually created by
domain experts or built using domain-specific tools. There exist
learning algorithms and tools for building decision trees,
including ones that are robust against malicious attacks such as
data poisoning [Alfeld et al., 2016] and evasion attacks
[Biggio et al., 2013]. The tool Antidote [Drews et al.,
2019] is used to check the robustness of decision tree
learning against data poisoning attacks, where an attacker
injects malicious elements into the training set to
influence the learned model. The decision tree learning algorithm
Treant [Calzavara et al., 2019] generates decision tree
ensembles that are accurate and nearly insensitive to evasion attacks
based on a formal threat model, and minimizes an
evasion-aware loss function at each step of the tree construction. In
contrast, our approach builds decision trees in which the
branches responsible for predicting critical decisions
are modified according to the defined safety constraints,
thus providing robustness with respect to those
constraints.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>Safety augmentation is one of the primary requirements for
structures learned from data using machine learning
techniques, especially when the learned function is used in a
safety-critical context. This paper examines the safety
augmentation problem for decision trees, which are popular in
practice. We present the first methodology for safety
augmentation in decision trees where the safety requirement is
expressed in terms of assertions. Our results indicate that
augmenting the information gain metric with knowledge
gleaned from the safety assertions yields safe decision trees
which are considerably smaller than those obtained by
post-facto safety augmentation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Alfeld et al.,
          <year>2016</year>
          ]
          <string-name>
            <given-names>Scott</given-names>
            <surname>Alfeld</surname>
          </string-name>
          , Xiaojin Zhu, and
          <string-name>
            <given-names>Paul</given-names>
            <surname>Barford</surname>
          </string-name>
          .
          <article-title>Data poisoning attacks against autoregressive models</article-title>
          .
          <source>In 30th AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Biggio et al.,
          <year>2013</year>
          ]
          <string-name>
            <given-names>Battista</given-names>
            <surname>Biggio</surname>
          </string-name>
          , Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Šrndić,
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Laskov</surname>
          </string-name>
          , Giorgio Giacinto, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>Evasion attacks against machine learning at test time</article-title>
          .
          <source>In Joint European conference on machine learning and knowledge discovery in databases</source>
          , pages
          <fpage>387</fpage>
          -
          <lpage>402</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>[Breiman</source>
          , 2017]
          <string-name>
            <given-names>Leo</given-names>
            <surname>Breiman</surname>
          </string-name>
          .
          <article-title>Classification and regression trees</article-title>
          .
          <source>Routledge</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Calzavara et al.,
          <year>2019</year>
          ]
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Calzavara</surname>
          </string-name>
          , Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, and
          <string-name>
            <given-names>Salvatore</given-names>
            <surname>Orlando</surname>
          </string-name>
          . Treant:
          <article-title>Training evasion-aware decision trees</article-title>
          .
          <source>arXiv preprint arXiv:1907.01197</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Drews et al.,
          <year>2019</year>
          ]
          <string-name>
            <given-names>Samuel</given-names>
            <surname>Drews</surname>
          </string-name>
          , Aws Albarghouthi, and
          <string-name>
            <surname>Loris D'Antoni</surname>
          </string-name>
          .
          <article-title>Proving data-poisoning robustness in decision trees</article-title>
          .
          <source>arXiv preprint arXiv:1912.00981</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <source>[Dua and Graff</source>
          , 2017]
          <string-name>
            <given-names>Dheeru</given-names>
            <surname>Dua</surname>
          </string-name>
          and
          <string-name>
            <given-names>Casey</given-names>
            <surname>Graff</surname>
          </string-name>
          .
          <source>UCI machine learning repository</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Laurent and Rivest,
          <year>1976</year>
          ]
          <string-name>
            <given-names>Laurent</given-names>
            <surname>Hyafil</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ronald L</given-names>
            <surname>Rivest</surname>
          </string-name>
          .
          <article-title>Constructing optimal binary decision trees is np-complete</article-title>
          .
          <source>Information processing letters</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ):
          <fpage>15</fpage>
          -
          <lpage>17</lpage>
          ,
          <year>1976</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Meadows et al.,
          <year>2005</year>
          ]
          <string-name>
            <given-names>Sandra</given-names>
            <surname>Meadows</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Karen</given-names>
            <surname>Baker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jeremy</given-names>
            <surname>Butler</surname>
          </string-name>
          .
          <article-title>The incident decision tree: guidelines for action following patient safety incidents</article-title>
          .
          <source>In Advances in Patient Safety: From Research to Implementation</source>
          (Volume
          <volume>4</volume>
          : Programs, Tools, and Products).
          <source>Agency for Healthcare Research and Quality (US)</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Murthy,
          <year>1998</year>
          ]
          <string-name>
            <given-names>Sreerama K</given-names>
            <surname>Murthy</surname>
          </string-name>
          .
          <article-title>Automatic construction of decision trees from data: A multi-disciplinary survey</article-title>
          .
          <source>Data mining and knowledge discovery</source>
          ,
          <volume>2</volume>
          (
          <issue>4</issue>
          ):
          <fpage>345</fpage>
          -
          <lpage>389</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Peled et al.,
          <year>1999</year>
          ]
          <string-name>
            <given-names>Doron A.</given-names>
            <surname>Peled</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Edmund M.</given-names>
            <surname>Clarke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Orna</given-names>
            <surname>Grumberg</surname>
          </string-name>
          .
          <source>Model Checking</source>
          . MIT Press,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>[Quinlan</source>
          ,
          <year>1986</year>
          ]
          <string-name>
            <given-names>J. Ross</given-names>
            <surname>Quinlan</surname>
          </string-name>
          .
          <article-title>Induction of decision trees</article-title>
          .
          <source>Machine learning</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>81</fpage>
          -
          <lpage>106</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <source>[Quinlan</source>
          ,
          <year>2007</year>
          ]
          <string-name>
            <given-names>JR</given-names>
            <surname>Quinlan</surname>
          </string-name>
          . C5.0. http://rulequest.com,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>[Quinlan</source>
          ,
          <year>2014</year>
          ]
          <string-name>
            <given-names>J Ross</given-names>
            <surname>Quinlan</surname>
          </string-name>
          .
          <source>C4.5: Programs for machine learning</source>
          . Elsevier,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <source>[Reason</source>
          , 2016]
          <string-name>
            <given-names>James</given-names>
            <surname>Reason</surname>
          </string-name>
          .
          <article-title>Managing the risks of organizational accidents</article-title>
          .
          <source>Routledge</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Song and Ying,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Yan-Yan</given-names>
            <surname>Song</surname>
          </string-name>
          and
          <string-name>Lu Ying</string-name>
          .
          <article-title>Decision tree methods: applications for classification and prediction</article-title>
          .
          <source>Shanghai Archives of Psychiatry</source>
          ,
          <volume>27</volume>
          (
          <issue>2</issue>
          ):
          <fpage>130</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <source>[SVA</source>
          ,
          <year>2018</year>
          ] SVA
          <article-title>(SystemVerilog Assertions)</article-title>
          .
          <source>IEEE Std 1800-2017</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>1315</lpage>
          ,
          <year>Feb 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <source>[Zwitter and Soklic</source>
          , 1988]
          <string-name>
            <given-names>Matjaz</given-names>
            <surname>Zwitter</surname>
          </string-name>
          and
          <string-name>
            <given-names>Milan</given-names>
            <surname>Soklic</surname>
          </string-name>
          .
          <source>Breast cancer data set</source>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>