<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving Abductive Diagnosis Through Structural Features: A Meta-Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Roxane Koitz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Franz Wotawa</string-name>
          <email>wotawag@ist.tugraz.at</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Authors are listed in alphabetical order</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>While abductive reasoning provides an intuitive approach to diagnosis, its computational complexity remains an obstacle. Even though certain model representations are tractable, computing solutions for instances of reasonable size and complexity continues to pose a challenge. Hence, the discovery of efficient methods to derive abductive explanations presents itself as an appealing research area. In this paper, we investigate the structural properties inherent to formalizations suitable for abductive failure localization. Based on the features extracted we construct a meta-approach exploiting a machine learning classifier to predict the abductive reasoning technique yielding the “best” performance on a specific diagnosis scenario. To assess whether the proposed attributes are in fact sufficient for forecasting the appropriate abduction procedure, and to evaluate the efficiency of our algorithm selection in comparison to traditional abductive reasoning approaches, we conducted an empirical experiment. The results obtained indicate that the trained model is capable of predicting the most efficient algorithm, and further, we can show that the meta-approach is capable of outperforming each single abductive reasoning method investigated.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Being able to accurately identify the source of an unintended system
behavior is an essential objective in various application domains.
Abductive reasoning appears to be a natural approach to diagnosis as it
infers consistent explanations from background knowledge and
observed symptoms based on the notion of entailment. Usually a set
of constraints, such as minimality, is placed on whether a solution
suffices as an abductive explanation or not.</p>
      <p>
        While diagnosis is the most prevalent application area for
abductive reasoning, abduction has been applied to a diverse set of
problems such as planning [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], natural language processing [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], and
image interpretation [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ]. A large body of literature has investigated
approaches to mechanizing abductive reasoning such as consequence
finding [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], proof-tree completion [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], set-covering [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], abductive
model-based diagnosis [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] or abductive logic programming [
        <xref ref-type="bibr" rid="ref15 ref6">15, 6</xref>
        ].
      </p>
      <p>
        Within this paper we concentrate on two methods, namely
parsimonious set covering and abductive model-based diagnosis. The
set covering theory by Peng and Reggia [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] utilizes a causal
associative network recording the relations between disorders and their
manifestations. In their simple model, these cause and effect sets are
strictly disjoint. A diagnosis then is a set of disorders covering, i.e.
explaining, the set of observed symptoms. Later the approach has
been extended to incorporate probability theory and several
refinements to the basic theory have been proposed such as the
improvement of models with additional knowledge or the inclusion of more
complex covering relations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        In model-based diagnosis a formalization of the system under
consideration is exploited to determine causes for anomalies [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. While
the traditional approach utilizes a representation of the correct
system behavior and derives diagnoses via inconsistencies, the abductive
variant operates on a model describing the way faults affect
measurable system variables. Through the notion of entailment abductive
model-based diagnosis reasons about causes of a set of observations
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Considering a restricted problem space, simple set covering and
abductive model-based diagnosis are equivalent [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>
        Even though there exist certain subsets of logics where abduction
is tractable, it generally is an NP-hard problem which grows
exponentially in the size of the model [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. Hence, within this paper we
investigate algorithm selection as a means to efficiently compute
abductive explanations in the context of diagnosis. First formalized by
Rice [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ], algorithm selection aims at identifying the “best
performing” approach for a specific problem instance. The basic building
blocks within this framework are a portfolio of algorithms to choose
from, empirical performance data of the algorithms on representative
problems and a set of features, which are used to get a notion of the
difficulty of a problem instance [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. On grounds of the empirical
data and the feature vector, a predictor can be trained capable of
determining the most suitable approach for a distinct sample from the
problem space [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Machine learning has been identified as a
feasible approach to use as a prediction tool. Leyton-Brown et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
describe their portfolio approach to algorithm selection, where they
train an empirical hardness model for each algorithm within their
portfolio to forecast each approach’s computation time on the
instance and execute the one predicted as most efficient. Algorithm
selection has been applied very successfully in the domains of SAT
[
        <xref ref-type="bibr" rid="ref38">38</xref>
        ], graph coloring [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], and tree-decomposition [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
      <p>In this paper, we restrict the problem space to propositional Horn
clause models which can be automatically generated from failure
assessments available in practice. These analyses hold information
on faults and their effects and thus are suitable knowledge sources
for abductive diagnosis. The resulting logical system descriptions
are characterized by certain structural properties, which we utilize
as features for the algorithm selection process. We extracted these
attributes for a collection of instances and evaluated several
abductive diagnosis algorithms empirically on their computation time for
the entire sample set. On basis of the performance data and the
features we trained a machine learning classifier to forecast the
algorithm most suitable in regard to its runtime for a particular
abductive diagnosis scenario. We embedded the selection process within
a meta-algorithm, which generates the structural metrics for a given
diagnosis problem, categorizes it on the previously trained classifier
and computes the diagnoses using the algorithm chosen by the
predictor.</p>
      <p>We organize our paper as follows. The next section provides a
formal introduction to abductive model-based diagnosis as well as the
parsimonious set covering approach. Further, we show that in their
simplest form they are equivalent. Section 3 discusses the type of
logical formalizations we examine and describes the structural
characteristics forming our features for the algorithm selection.
Subsequently, we discuss the meta-approach in more detail and in Section
4.3 we empirically evaluate it in comparison to the algorithms in our
portfolio. Lastly we summarize the paper and provide an outlook to
future work.
</p>
    </sec>
    <sec id="sec-2">
      <title>Abductive Diagnosis</title>
      <p>
        Within this section we consider two abductive diagnosis approaches,
namely abductive model-based diagnosis [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] operating on
propositional Horn clauses and the simple set covering approach [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ].
Specifically, we show their equivalence and describe the
corresponding elements within each method.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Model-Based Diagnosis</title>
      <p>
        As model-based diagnosis requires a formal description of the
system, its abductive variant utilizes a representation of the connections
between failures and their manifestations. Based on the information
available, the task is to search for a set of consistent causes which
together with the background theory logically entail a set of observed
fault indicators. Since abduction is a hard problem, research has
focused on subsets of logics which allow computing explanations in
polynomial time [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. An important restriction on the underlying
formulas is the Horn property as for this fragment abduction is still
tractable. Within the context of diagnosis a further syntactical
restriction often imposed is a definite Horn theory, as it suffices to
describe causal relations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Therefore, we concentrate on propositional definite Horn
descriptions and define in a similar manner as Friedrich et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] a
knowledge base.
      </p>
      <sec id="sec-3-1">
        <title>Definition 1 (Knowledge base (KB))</title>
        <p>A knowledge base (KB) is a tuple (A, Hyp, Th), where A denotes the set of propositional variables, Hyp ⊆ A the set of hypotheses, and Th a set of Horn clause sentences over A.</p>
        <sec id="sec-3-1-1">
          <title>Propositional Horn Clause Abduction Problem</title>
          <p>The set of hypotheses Hyp comprises the propositional variables
allowed to form a diagnosis, while the theory describes the relations
between the variables. In this context, we further specify the
propositional variables not constituting a hypothesis, i.e. A \ Hyp, as
effects or symptoms, and we will refer to this set of variables as the
effects within this paper. Since we aim at identifying root causes for failure
manifestations, an abduction problem considers a knowledge base KB as
well as a set of symptoms to explain. We therefore define a
Propositional Horn Clause Abduction Problem (PHCAP) as follows:</p>
          <p>Definition 2 (Propositional Horn Clause Abduction Problem
(PHCAP)): Given a knowledge base (A, Hyp, Th) and a set of
observations Obs ⊆ A, the tuple (A, Hyp, Th, Obs) forms a Propositional
Horn Clause Abduction Problem (PHCAP).</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 3 (Diagnosis; Solution of a PHCAP)</title>
        <p>Given a PHCAP
(A, Hyp, Th, Obs), a set ∆ ⊆ Hyp is a solution if and only if ∆ ∪ Th
⊨ Obs and ∆ ∪ Th ⊭ ⊥. A solution ∆ is parsimonious or minimal
if and only if no set ∆′ ⊂ ∆ is a solution.</p>
        <p>A solution to a PHCAP is an abductive diagnosis, as it provides
hypotheses consistently explaining the occurrence of a set of
observations. As in practice minimal solutions are preferred, we require the
diagnoses to be subset minimal.</p>
        <p>Example 1: Consider the following KB:</p>
      <p>A = {h1, h2, h3, o1, o2, o3}, Hyp = {h1, h2, h3},
Th = {h1 → o1, h2 → o1, h2 → o2, h3 → o2, h3 → o3}.</p>
      <p>Assume we can observe o1 and o3, i.e. Obs = {o1, o3}. The
solutions to the PHCAP, i.e. the minimal abductive diagnoses, are
∆1 = {h1, h3} and ∆2 = {h2, h3}.</p>
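<p>To make Example 1 concrete, the following sketch enumerates the minimal abductive diagnoses by brute force; the `effects` map and helper names are illustrative (not the authors' implementation), and consistency is trivially satisfied here since the theory is definite Horn.</p>

```python
from itertools import combinations

# Example 1 encoded as a cause -> effects map (bijunctive definite Horn theory).
effects = {"h1": {"o1"}, "h2": {"o1", "o2"}, "h3": {"o2", "o3"}}
obs = {"o1", "o3"}

def minimal_diagnoses(effects, obs):
    """Enumerate subset-minimal hypothesis sets whose joint effects entail Obs."""
    hyps = sorted(effects)
    solutions = []
    for size in range(1, len(hyps) + 1):
        for cand in combinations(hyps, size):
            covered = set().union(*(effects[h] for h in cand))
            cand_set = set(cand)
            # keep candidates that cover Obs and are subset minimal
            if obs <= covered and not any(s < cand_set for s in solutions):
                solutions.append(cand_set)
    return solutions

print([sorted(s) for s in minimal_diagnoses(effects, obs)])
# → [['h1', 'h3'], ['h2', 'h3']]
```

<p>Such exhaustive enumeration is only viable for tiny theories; it merely illustrates the definition, not an efficient procedure.</p>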
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Parsimonious Set Covering</title>
      <p>
        Abduction by parsimonious set covering is based in its simplest form
on an associative network encompassing the causal links between
possible disorders and their manifestations [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. A diagnosis
problem is a 4-tuple P = ⟨D, M, C, M+⟩, where D is the set of
disorders, M comprises the manifestations, C defines the causal
connections, and M+ represents the current set of symptoms
observed. The knowledge about the causal relations is defined by two
sets: effects(di) and causes(mj). For each disorder di we can
define effects(di) = {mj | ⟨di, mj⟩ ∈ C} as the set of
manifestations caused by the disorder. Similarly, the set causes(mj) = {di |
⟨di, mj⟩ ∈ C} holds the disorders which directly trigger
manifestation mj [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. Thus, for any subset of disorders DI, we can determine
the objects directly caused by it as
effects(DI) = ⋃_{di ∈ DI} effects(di).
Along similar lines, we can observe that
causes(MJ) = ⋃_{mj ∈ MJ} causes(mj).
      </p>
      <p>
        As mentioned within this approach abductive explanations are
defined as the causes covering the observed symptoms. A set of
disorders DI is said to cover a set of manifestations MJ ⊆ M if
MJ ⊆ effects(DI), i.e. the former causally infers the latter. While
minimality is not a necessary condition for a cover in the original
definition of Peng and Reggia [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], we introduce the further
requirement that the cover is subset minimal.
      </p>
      <sec id="sec-4-1">
        <title>Definition 4 (Cover)</title>
        <p>A set DI ⊆ D is said to cover MJ ⊆ M if MJ ⊆ effects(DI)
and there exists no DI′ ⊂ DI with MJ ⊆ effects(DI′).</p>
        <p>Thus, we can define a solution to a set covering problem as a subset
DI ⊆ D covering M+.</p>
        <sec id="sec-4-1-1">
          <title>Definition 5 (Set Cover Diagnosis)</title>
          <p>Given a diagnosis problem P, a set ∆ ⊆ D is said to be a diagnosis iff ∆ covers M+.</p>
          <p>
            In regard to the logic-based definitions discussed for model-based
diagnosis, the disorders refer to the set Hyp in the PHCAP
framework. Their manifestations constitute the effects, M+ corresponds
to Obs, and the network represents the domain theory. A causal
relation ⟨di, mj⟩ is recorded in C whenever there is a logical
implication of the form di → mj, where di ∈ Hyp and mj is an effect,
within the theory Th. Thus, it is apparent that the simple set
covering framework is equivalent to logic-based abduction with a theory
restricted to definite Horn clauses [
            <xref ref-type="bibr" rid="ref23">23</xref>
            ]. As both methods generate the
explanatory causes based on the relationships between disorders and
effects, they compute abductive explanations. That is a set covering
diagnosis, as defined previously, corresponds to a minimal diagnosis
in the PHCAP framework.
          </p>
          <p>Example 1 (cont): Considering our previous example, the diagnosis
problem P can be reformalized in set covering:</p>
          <p>D = {h1, h2, h3}, M = {o1, o2, o3}, M+ = {o1, o3},
C = {⟨h1, o1⟩, ⟨h2, o1⟩, ⟨h2, o2⟩, ⟨h3, o2⟩, ⟨h3, o3⟩}.</p>
          <p>We can obtain the set covering diagnoses by determining
the disorder sets DI where effects(DI) covers M+, which
are effects({h1, h3}) = {o1, o2, o3} and effects({h2, h3}) =
{o1, o2, o3}. Hence, the diagnoses are ∆1 = {h1, h3} and ∆2 =
{h2, h3}.</p>
          <p>
            The equivalence between set covering and the hitting set
problem has been established [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ], thus we can exploit the notion of
hitting sets in order to define a diagnosis within the parsimonious
set covering theory. In particular, we stated previously that a cover
implies a causal dependency between disorders and manifestations
and can be expressed through the effects relation. Along similar lines
causes(mj ) comprises information on all disorders responsible for
mj . Hence, by computing the hitting set of causes(mj ) for a
single manifestation, we derive a disjunction of all disorders possibly
leading to mj , i.e. each disorder constitutes a diagnosis. For
multiple observations, i.e. m1, m2, …, mn ∈ M+, the hitting sets of all
causes-sets of the current manifestations form the diagnoses. This
is apparent since, in order to represent a solution, one disorder
explaining each observation has to be present within each diagnosis.
To impose the parsimonious criterion, we restrict solutions to subset
minimal hitting sets [
            <xref ref-type="bibr" rid="ref30">30</xref>
            ].
          </p>
        </sec>
        <sec id="sec-4-1-2">
          <title>Definition 6 (Abductive Hitting Set Diagnosis)</title>
          <p>Given a diagnosis
problem P, a set ∆ ⊆ D is said to be a minimal diagnosis iff ∆ is a
minimal hitting set of S, where ∀mj ∈ M+ : causes(mj) ∈ S.</p>
          <p>Example 1 (cont): The causes sets for the current manifestations are
causes(o1) = {h1, h2} and causes(o3) = {h3}, thus causes(o1) ∈
S and causes(o3) ∈ S. The minimal hitting sets of S correspond to
∆1 = {h1, h3} and ∆2 = {h2, h3}.</p>
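<p>Definition 6 can be sketched directly: given the causes-sets of the observed manifestations, a brute-force enumeration of subset-minimal hitting sets reproduces the diagnoses of Example 1. This is illustrative code only; the portfolio algorithms discussed later compute hitting sets far more efficiently.</p>

```python
from itertools import combinations

# Causes-sets of the observed manifestations o1 and o3 from Example 1.
S = [{"h1", "h2"}, {"h3"}]

def minimal_hitting_sets(sets):
    """Enumerate subset-minimal sets intersecting every member of `sets`."""
    universe = sorted(set().union(*sets))
    hits = []
    for size in range(1, len(universe) + 1):
        for cand in combinations(universe, size):
            cand_set = set(cand)
            # a hitting set meets every causes-set; keep only minimal ones
            if all(cand_set & s for s in sets) and not any(h < cand_set for h in hits):
                hits.append(cand_set)
    return hits

print([sorted(h) for h in minimal_hitting_sets(S)])
# → [['h1', 'h3'], ['h2', 'h3']]
```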
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Models</title>
      <p>
        An essential issue in model-based diagnosis has been the
construction of system descriptions suitable for identifying faults. Thus,
numerous techniques to automatically extract models have been
proposed with a recent method taking advantage of Failure Mode Effect
Analysis (FMEA) records [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]. This type of risk evaluation is
becoming increasingly common and collects data on how faults on a
component level influence system variables [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Table 1 depicts a
simplified example of an FMEA where each row contains a
component, a possible fault of said component and its corresponding
effects. Since it captures the causal associations between defects and
their consequences it provides information necessary for abductive
reasoning.
      </p>
      <p>
        In the straightforward mapping presented by Wotawa [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ] each
record of the FMEA is transformed into a set of Horn clause
sentences, where the component-fault mode pair implies an effect. These
formulas form the theory Th of a KB which we can utilize to
compute abductive diagnoses based on a set of fault indicators. The
theory resulting from the FMEA example in Table 1 is:
      </p>
      <p>Th = { mode(Fan, Corrosion) → P turbine,
mode(Fan, TMF) → P turbine, mode(Fan, TMF) → T cabinet,
mode(IGBT, HCF) → T cabinet, mode(IGBT, HCF) → T nacelle }.
As is apparent from the example, the result of the mapping is a model
consisting of bijunctive definite Horn clauses. Thus, we can simply
apply either model-based or set covering abduction to compute
diagnoses. However, since we would like to extend our approach to
different failure assessments in practice, we allow more expressive
formalizations, i.e. conjunctions of causes and disjunction of
manifestations. In order to use these models within the PHCAP and simple
set covering approach, we currently compile them into definite Horn
clauses. It is apparent that this increases the theory’s cardinality,
requires reassembling the original hypotheses at the end of the
diagnosis, and requires ensuring that subset minimality is still present.</p>
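<p>The mapping itself is mechanical; the sketch below derives one definite Horn clause per component-fault-effect combination. The record format is a simplification assumed for illustration here, standing in for the FMEA schema of Table 1.</p>

```python
# Simplified FMEA records: (component, fault mode, observable effects).
fmea = [
    ("Fan", "Corrosion", ["P_turbine"]),
    ("Fan", "TMF", ["P_turbine", "T_cabinet"]),
    ("IGBT", "HCF", ["T_cabinet", "T_nacelle"]),
]

def fmea_to_theory(records):
    """Map each record to clauses of the form mode(component, fault) -> effect."""
    theory = []
    for component, fault, effs in records:
        hypothesis = f"mode({component},{fault})"
        for effect in effs:
            theory.append((hypothesis, effect))  # one Horn clause per effect
    return theory

for hypothesis, effect in fmea_to_theory(fmea):
    print(f"{hypothesis} -> {effect}")
```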
      <p>Now we explain how we can utilize the logical models obtained
on basis of the FMEAs to perform abductive reasoning within the set
cover approach. Given a model of this type we can easily represent
it as a hypergraph H = (V; E), where V is the set of vertices and E
constitutes the set of hyperedges. The nodes of the hypergraph
represent the propositional variables, while the hyperedges are determined
by the theory. For each clause there exists a hyperedge containing all
propositional variables of said clause, i.e. ∀a ∈ A: a ∈ V, and
∀c ∈ Th: {|l| : l ∈ c} ∈ E, where |·| is a function mapping literals
to the underlying propositions ignoring negations, i.e., |¬p| = p and
|p| = p for all p ∈ A. In Figure 1 on the right hand side a hypergraph
representation of Example 1 is shown.</p>
      <p>Following this representation we can assign a label to each vertex
within a hyperedge E, such that:</p>
      <p>label(v) = {v} if v ∈ Hyp, and
label(v) = ⋃_{x ∈ E, x ≠ v} label(x) otherwise.</p>
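<p>One possible reading of this labelling for the bijunctive case, where an effect vertex collects the labels of its causes, is sketched below (helper names are ours; intermediate effects would need cycle handling not shown here):</p>

```python
# Example 1's theory as (cause, effect) clauses.
theory = [("h1", "o1"), ("h2", "o1"), ("h2", "o2"), ("h3", "o2"), ("h3", "o3")]
hyp = {"h1", "h2", "h3"}

def label(v):
    """label(v) = {v} for hypotheses, else the union of its causes' labels."""
    if v in hyp:
        return {v}
    return set().union(*(label(c) for c, e in theory if e == v))

print(sorted(label("o1")))  # → ['h1', 'h2'], i.e. causes(o1)
```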
      <p>
        In case a vertex represents a manifestation, its label corresponds to
its causes-set, as it holds the hypotheses responsible for the effect.
Thus, we can utilize the labels of the nodes representing the
observations to compute the abductive diagnoses as hitting sets. Note that by
relying on this notion we could further handle intermediate effects,
which we do not discuss in more detail in this paper.
As stated the models we are considering are bijunctive definite Horn
clauses. Thus, we can easily define their structural properties based
on various graph representations. In the simplest case the theory is
characterized as a directed acyclic graph (DAG) with two disjoint
node sets, namely the propositional variables constituting the causes
and the effects, respectively. This representation is equivalent to the
associative network as described by Peng and Reggia [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] in their set
covering approach. Furthermore, it is apparent that by constructing
an undirected graph we receive a bipartite graph.
      </p>
      <p>Considering the graph representations of the model, we can
extract certain characteristics of their structure which we subsequently
use within the algorithm selection process. As abductive diagnosis
is possibly exponential in the number of causes to consider, the
cardinality of Hyp is an intuitive measure of complexity. In addition, we
collect the number of effects and connections within the theory.
</p>
      <p>Outdegree and Indegree
Based on the DAG, we can compute for each vertice representing a
hypothesis its outdegree, which specifies the number of
manifestations affected by said cause. Similarly, we measure the indegree of
each effect, i.e. the number of hypotheses inferring the manifestation.
Considering the set covering framework we can define the degrees as
follows:
outdegree(hi) = |effects(hi)|,
indegree(mj) = |causes(mj)|.</p>
      <p>In regard to Example 1 we can observe outdegree(h2) =
|effects(h2)| = 2 and indegree(o3) = |causes(o3)| = 1. Collected
over the entire model these measures provide an intuitive metric of
the basic magnitude of the theory and the connectedness of the graph.</p>
      <p>Covering and Overlap
Several disorders may cover the same effect, i.e. a manifestation can
be explained by multiple causes. On basis of this we can define a
covering metric for each pair of hypotheses as the ratio between the
number of common effects and the total number of symptoms
induced by the hypotheses:
covering(hi, hj) = |effects(hi) ∩ effects(hj)| / |effects(hi) ∪ effects(hj)|.</p>
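<p>The degree and covering metrics can be read off the effects relation directly; the following sketch computes them for Example 1 (the helper names are ours, not from the paper):</p>

```python
# Example 1's effects relation.
effects = {"h1": {"o1"}, "h2": {"o1", "o2"}, "h3": {"o2", "o3"}}

def causes(m):
    """Disorders directly triggering manifestation m."""
    return {h for h, effs in effects.items() if m in effs}

def outdegree(h):
    return len(effects[h])

def indegree(m):
    return len(causes(m))

def covering(hi, hj):
    """Ratio of common effects to all effects induced by the pair."""
    return len(effects[hi] & effects[hj]) / len(effects[hi] | effects[hj])

print(outdegree("h2"), indegree("o3"), covering("h1", "h2"))  # → 2 1 0.5
```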
      <p>
        The overlap of o1 and o2 at h2 is shown as a red oval on the
left side of the DAG in Figure 1. In turn we can compute
overlap(o1, o3) = 0. Peng and Reggia [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] define a pathognomonic
effect as an observation with a single cause. Thus, whenever a
pathognomonic symptom is involved, we do not compute an overlap
relation. By collecting these measures for any pair of hypotheses or
effects, we can compute a value over the entire model.
Whenever there exist several subproblems in our theory we refer to
them as independent diagnosis subproblems. If several subproblems
exist, the graphs representing the model are disconnected and each
independent diagnosis subproblem itself is a connected subgraph.
In case all effects are pathognomonic, then each cause-effect
relation represents its own independent diagnosis subproblem and thus
we can observe that the model is orthogonal. Imagine the clause
h2 → o2 missing from the theory of Example 1. In this case we
would have two independent diagnosis subproblems, namely one
including h1; h2 and o1 and the other one consisting of h3; o2 and o3.
As an additional measure to the number of subproblems we further
compute the average size over all independent diagnosis
subproblems in case several exist.
Another measure of connectedness within the model is the minimal
path length between any two nodes on the hypergraph. In particular,
we measure the length of the minimal path between nodes
representing hypotheses, thus we compute the minimal number of hyperedges
to be traversed between each pair of hypothesis vertices. Note that for
a single model there are possibly several hypergraphs depending on
the number of independent diagnosis subproblems, thus we disregard
paths between nodes belonging to different subproblems.
Considering the hypergraph in Figure 1, we can observe path(h1, h2) = 2.
The clustering coefficient is a known measure of node clusters within
a graph. It is evident that we cannot compute a clustering coefficient
from the graph representations used so far, i.e. the DAG, bipartite
graph and hypergraph, due to the two disjoint node classes.
Therefore, we transform the bipartite graph by projection [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In
particular, we remove all nodes corresponding to manifestations and link
two cause vertices vhi and vhk whenever they imply the same
effect, i.e. effects(hi) ∩ effects(hk) ≠ ∅. Based on the resulting
undirected graph featuring only the nodes corresponding to
hypotheses, we compute for each node the local clustering coefficient as
c = 2ki / (n(n − 1)), where n is the number of neighbors of the node and
ki the number of edges between the neighbors of the node. While in
network analysis the projection of bipartite graphs results in coefficients
differing from those of typical one-mode networks, this does not pose
an issue in our case as we are solely interested in the models in our
problem space. Thus, the clustering coefficient provides for our
models another measure of covering between hypotheses.
A simple encoding-based measure on a graph is its Kolmogorov
complexity, which defines a value equal to the length of the word
necessary to encode the graph. A straightforward approach in this
context is to compute the complexity based on the adjacency matrix of
the undirected graph [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
      </p>
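<p>The bipartite projection and local clustering coefficient can be sketched as follows, assuming a coefficient of the form c = 2ki / (n(n − 1)) with n the number of neighbors and ki the edges among them; the helper names are illustrative. For Example 1 every coefficient happens to be zero, since no two neighbors of a hypothesis are themselves linked.</p>

```python
from itertools import combinations

# Example 1's effects relation.
effects = {"h1": {"o1"}, "h2": {"o1", "o2"}, "h3": {"o2", "o3"}}

def project(effects):
    """One-mode projection: link two hypotheses sharing at least one effect."""
    adj = {h: set() for h in effects}
    for hi, hk in combinations(effects, 2):
        if effects[hi] & effects[hk]:
            adj[hi].add(hk)
            adj[hk].add(hi)
    return adj

def local_clustering(adj, v):
    """c = 2*ki / (n*(n-1)), with n neighbors of v and ki edges among them."""
    neighbors = adj[v]
    n = len(neighbors)
    if n < 2:
        return 0.0
    ki = sum(1 for a, b in combinations(sorted(neighbors), 2) if b in adj[a])
    return 2 * ki / (n * (n - 1))

adj = project(effects)
print([local_clustering(adj, h) for h in sorted(adj)])  # → [0.0, 0.0, 0.0]
```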
      <p>Observation Dependent Metrics
Since not only the topology of the model is of interest, but also the
structure of the current diagnosis problem, we measure the indegree
and the overlap among the elements of Obs as well as the number
of diagnosis subproblems involving variables of Obs, in case several
exist.
</p>
    </sec>
    <sec id="sec-6">
      <title>Meta-Approach</title>
      <p>
        Algorithm selection aims at identifying the most appropriate method
out of a portfolio of techniques for a given problem instance in
regard to its performance [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. Performance in this context is most
commonly associated with the computation time but could also
refer to accuracy or simplicity. The model as described by Rice [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]
advocates for the use of features inherent to the problems within
the problem space in order to accurately map a new sample to its
most effective or efficient algorithm. This mapping is based on
empirical performance data on representative samples of the approaches
present in the algorithm space [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. On basis of the features extracted
and the execution records, a predictor can be trained which can
determine aspects of the problem influencing the performance of an
algorithm. Thereby each problem can be described by a set of attributes
which together with execution data allow a predictor to forecast the
most suitable algorithm for an instance. Machine learning has been
identified as a feasible approach to use as a prediction tool.
      </p>
      <p>
        Generally, there are two possible objectives; either one algorithm
of the portfolio is to be selected based on a single predictor or for
each approach within the portfolio the performance metric should be
determined as a basis of the selection. The latter requires a distinct
empirical hardness model for each method within the portfolio and
thus whenever the algorithm for a new instance is to be selected,
for each approach a prediction has to be made [
        <xref ref-type="bibr" rid="ref13 ref17">13, 17</xref>
        ]. SATzilla
[
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] is an example of such a portfolio approach within the domain of
SAT solvers. For our meta-technique, however, we consider the first
variant, where we train a single classifier for all abductive reasoning
methods to select a single approach for execution.
      </p>
      <p>
        We consider a 1-of-n portfolio [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ], where there are n algorithms
to choose from but only one is selected and executed to solve the
diagnosis problem. Within the context of diagnosis our meta-approach
works the following way: As mentioned, the foundation of model-based
diagnosis is a description of the system to be diagnosed. Thus,
the majority of the features can be computed offline on the diagnosis
models present. Further, within this phase the empirical data on
computation times of the various abductive reasoning approaches can be
collected and on basis of the metrics and the runtime information a
machine learning classifier is trained. Whenever the diagnosis
process is triggered by a detected anomaly, we retrieve our previously
learned machine learning classifier as well as the offline determined
metrics of the diagnosis model. Algorithm 1 describes the online
portion of the meta-approach, which is executed whenever new
diagnoses are to be computed. Online we have to collect the current
PHCAP’s instance-based features such as |Obs| or the number of
independent diagnosis subproblems comprising the current
observations. Based on the online and offline generated attributes we
assemble the feature vector holding the measurements of the current
diagnosis problem. By providing all features to the machine learning
algorithm, we in turn retrieve a predicted best abduction method out of
our portfolio for this specific scenario based on the trained
classifier and the instance’s features. Subsequently, we can instantiate the
diagnosis engine with the corresponding abduction method as well
as diagnosis problem and compute the set of abductive explanations,
i.e. Set.
      </p>
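      <p>The online phase just described can be made concrete with a small sketch. The following Python fragment mirrors the control flow of Algorithm 1; the helper names, the stub classifier and the dummy abduction engine are hypothetical placeholders for illustration, not the implementation evaluated in this paper.

```python
# Sketch of the online phase (Algorithm 1, MetAB). All helper names and the
# stub classifier below are hypothetical placeholders, not the authors' code.

def compute_online_metrics(obs):
    # Instance-specific features, e.g. |Obs| (the real feature set also
    # includes indegree/overlap statistics of the observation nodes).
    return {"num_observations": len(obs)}

def metab(offline_metrics, obs, classifier, algorithms):
    """Pick the predicted-best abduction method for one PHCAP and run it."""
    features = {**offline_metrics, **compute_online_metrics(obs)}  # offline + online
    name = classifier.predict(features)       # forecast the best algorithm
    return algorithms[name](obs)              # compute the diagnoses with it

# Usage with stand-in components:
class MajorityClassifier:                     # stands in for the trained model
    def predict(self, features):
        return "berge"

algorithms = {"berge": lambda obs: [{"h1"}, {"h2"}]}   # dummy abduction engine
delta_set = metab({"num_hypotheses": 4}, ["o1", "o2"],
                  MajorityClassifier(), algorithms)
```

Given a trained classifier and precomputed offline metrics, the online cost reduces to one dictionary merge, one prediction and one call to the selected engine.</p>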
      <p>In the remainder of this section we first describe our portfolio,
which currently includes five abductive diagnosis methods, and
then list the metrics we have used within the meta-approach.</p>
    </sec>
    <sec id="sec-7">
      <title>Portfolio</title>
      <p>
We employ a 1-of-5 portfolio, i.e. we select one approach from the
static algorithm space containing five methods which can be utilized
for abductive reasoning based on a propositional logic model. For
each technique, we give a brief description of the underlying notion.
In particular, we utilize an Assumption-Based Truth Maintenance
System (ATMS) [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] as a general abduction engine for
propositional Horn clauses [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Besides the ATMS our portfolio holds
various hitting set algorithms, which are capable of computing minimal
diagnoses as shown in Section 2.2. Thus, we simply compute for each
oᵢ ∈ Obs the set causes(oᵢ), store these sets in S and derive the
minimal hitting sets of S. The hitting set routines included in our
meta-approach are the following: Binary Hitting Set Tree (BHSTree)
[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], HS-DAG [
        <xref ref-type="bibr" rid="ref10 ref34">34, 10</xref>
        ], HST [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ], and Berge’s algorithm [
        <xref ref-type="bibr" rid="ref2 ref8">2, 8</xref>
        ].
      </p>
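      <p>To illustrate this reduction from abduction to hitting sets, the following brute-force sketch (deliberately none of the portfolio algorithms) collects causes(o) for every observation and enumerates the minimal hitting sets; the dictionary representation of causes is an assumption made for the example.

```python
from itertools import combinations

def abductive_diagnoses(causes, obs):
    """Diagnoses of obs = minimal hitting sets of {causes(o) | o in obs}.
    Brute-force enumeration for illustration only; `causes` maps each
    observation to the set of hypotheses that can explain it."""
    S = [causes[o] for o in obs]
    universe = sorted(set().union(*S))
    # Every candidate set of hypotheses that intersects each causes(o):
    hitting = [set(c) for r in range(1, len(universe) + 1)
               for c in combinations(universe, r)
               if all(set(c) & s for s in S)]
    # Keep only the subset-minimal ones.
    return [h for h in hitting if not any(g < h for g in hitting)]

causes = {"o1": {"h1", "h2"}, "o2": {"h2", "h3"}}
diagnoses = abductive_diagnoses(causes, ["o1", "o2"])
# minimal diagnoses: {'h2'} and {'h1', 'h3'}
```
</p>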
      <p>ATMS
The ATMS operates on a graph representation of the logical theory,
where propositional variables are represented as nodes and the
relations within the theory determine the directed edges. By utilizing
a label for each node, the ATMS determines the subset-minimal sets
of hypotheses implying each vertex and thereby makes it possible to directly
record abductive explanations. Furthermore, by recording
contradictions it maintains consistency. In order to generate the diagnoses for a
given PHCAP, a clause o₁ ∧ o₂ ∧ … ∧ oₙ → obs is added,
where o₁, o₂, …, oₙ ∈ Obs and obs ∉ A. The label of obs then
contains the solution to the PHCAP.</p>
      <p>
HS-DAG
Reiter’s [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] minimal hitting set approach exploits the structure of
a tree. To compute the hitting sets, an initial set from S is assigned to
the dedicated root node. The tree is then iteratively extended in a
breadth-first manner, where each node n is labeled by a set s ∈ S in case s
is disjoint from the set of edge labels of the path to n. If this is the
case, an outgoing edge is generated for each element of s, and after
all s ∈ S have been processed each leaf represents a hitting set. To
ensure minimality, various techniques for pruning the tree have been
developed. Some inadequacies of Reiter’s algorithm were corrected
by Greiner et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], who further devised their HS-DAG as a
version of Reiter’s approach performed on a DAG instead of a tree.
      </p>
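      <p>The breadth-first construction can be sketched as follows; this is a deliberately simplified illustration of the hitting set tree idea that omits Reiter’s pruning rules and the node reuse of HS-DAG.

```python
from collections import deque

def hs_tree(S):
    """Breadth-first sketch of Reiter's hitting set tree (without the
    pruning and node-reuse refinements of HS-DAG): each node carries the
    edge labels of its path; a node hit by no remaining set is a leaf."""
    minimal, frontier = [], deque([set()])
    while frontier:
        path = frontier.popleft()
        unhit = next((s for s in S if not (s & path)), None)
        if unhit is None:                      # leaf: path hits every set in S
            if not any(m <= path for m in minimal):
                minimal.append(path)           # BFS order: subsets come first
            continue
        for elem in unhit:                     # one outgoing edge per element
            frontier.append(path | {elem})
    return minimal

# hs_tree([{1, 2}, {2, 3}]) yields the minimal hitting sets {2} and {1, 3}
```
</p>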
      <p>HST
The HST variant of HS-DAG operates on a tree instead of a graph
and avoids the construction of unnecessary nodes and costly subset
checks [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. Based on an ordered list of the elements within S, the
algorithm limits the outgoing edges of each node to a
specific range within the ordered list. By checking the current path
and the tree already constructed, the algorithm decides whether
a node has to be generated or whether the corresponding hitting set will be
constructed later during the computation.</p>
      <p>Algorithm 1 MetAB
procedure METAB(A, Hyp, Th, Obs)
  m ← retrieveClassifier()  ▷ Retrieves the trained model
  offline ← retrieveMetrics(A, Hyp, Th)  ▷ Retrieves the previously computed model metrics
  online ← computeMetrics(A, Hyp, Th, Obs)  ▷ Computes the instance-based features
  F ← offline ∪ online
  algorithm ← predict(F, m)  ▷ Forecasts the best performing algorithm for the diagnosis problem
  Δ-Set ← diagnose(algorithm, A, Hyp, Th, Obs)  ▷ Computes diagnoses based on the predicted algorithm for the PHCAP
  return Δ-Set
end procedure</p>
      <p>
BHSTree
Lin and Jiang [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] propose the Binary Hitting Set Tree. First, the
tree is constructed by splitting the input sets on particular elements,
recursively adapting the sets while building the tree. During the bottom-up
traversal the hitting sets are constructed by merging the data of
the child nodes. The minimization is performed by a minimization
function, which is not specified in detail in their paper.
      </p>
      <p>
Berge’s Algorithm
This minimal hitting set algorithm, sometimes referred to as Berge’s
algorithm, uses an intuitive notion to compute hitting sets [
        <xref ref-type="bibr" rid="ref28 ref8">8, 28</xref>
        ].
By definition a hitting set intersected with any set of S is not empty,
i.e. each set of S has to contribute to each hitting set. In Berge’s
algorithm the minimal hitting sets are constructed and modified
incrementally. Initially, the set of hitting sets H is empty. Whenever a
new s ∈ S is to be considered, each h ∈ H is checked as to whether s ∩ h = ∅.
In case it is not, i.e. h already hits s, h remains unchanged; otherwise
h is removed from H and for each e ∈ s a new set is created
containing h and e. By checking whether subsets are present within H,
the algorithm ensures that it derives minimal hitting sets.
      </p>
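      <p>This incremental scheme can be sketched in a few lines; the variable names h and e follow the exposition above, and the subset filter stands in for the minimality check.

```python
def berge_hitting_sets(S):
    """Incremental computation of all minimal hitting sets in the spirit of
    Berge's algorithm (illustrative sketch, not the evaluated engine)."""
    H = [set()]                        # the empty set "hits" an empty family
    for s in S:
        hits   = [h for h in H if h & s]         # h already hits s: keep it
        misses = [h for h in H if not (h & s)]   # otherwise extend h by e in s
        candidates = hits + [h | {e} for h in misses for e in s]
        H = []
        for c in candidates:           # retain only subset-minimal, unique sets
            if c not in H and not any(o < c for o in candidates):
                H.append(c)
    return H

# berge_hitting_sets([{1, 2}, {2, 3}]) returns {2} and {1, 3} (in some order)
```
</p>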
    </sec>
    <sec id="sec-8">
      <title>Features</title>
      <p>In Section 3 we discussed the metrics we extract from our logical
problem instances. Here we simply list the properties we compute in
our meta-approach.</p>
      <p>1. Logic model specific
Number of hypotheses
Number of effects
Number of causal relations, i.e. clauses in the theory
Number of independent diagnosis subproblems
Average size of independent diagnosis subproblems</p>
      <p>2. DAG
Outdegree of hypothesis nodes (maximum, average, standard deviation)
Indegree of effect nodes (maximum, average, standard deviation)
Covering (maximum, average, standard deviation)
Overlap (maximum, average, standard deviation)</p>
      <p>3. Undirected graph
Kolmogorov complexity based on the adjacency matrix
Local clustering coefficient (maximum, average, standard deviation)3</p>
      <p>4. Hypergraph
Path length (maximum, average, standard deviation)4</p>
      <p>5. Instance specific/Observation dependent
Number of observations
Indegree of current observation nodes (maximum, average, standard deviation)
Overlap of current observations (maximum, average, standard deviation)
Number of independent diagnosis subproblems including current observations</p>
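      <p>As an illustration of how such metrics can be extracted, the following sketch computes the degree statistics of feature set 2 from a list of cause–effect pairs; the pair representation is an assumption made for the example, not the model format used in the paper.

```python
from statistics import mean, pstdev

def degree_features(theory):
    """Illustrative extraction of DAG degree features (feature set 2) from
    a list of (hypothesis, effect) pairs -- a hypothetical encoding of the
    causal relations, chosen only for this example."""
    outdeg, indeg = {}, {}
    for cause, effect in theory:
        outdeg[cause] = outdeg.get(cause, 0) + 1
        indeg[effect] = indeg.get(effect, 0) + 1
    out_v, in_v = list(outdeg.values()), list(indeg.values())
    return {
        "hyp_outdegree_max": max(out_v),
        "hyp_outdegree_avg": mean(out_v),
        "hyp_outdegree_std": pstdev(out_v),
        "eff_indegree_max": max(in_v),
        "eff_indegree_avg": mean(in_v),
        "eff_indegree_std": pstdev(in_v),
    }

feats = degree_features([("h1", "e1"), ("h1", "e2"), ("h2", "e1")])
# e.g. feats["hyp_outdegree_max"] == 2 and feats["eff_indegree_avg"] == 1.5
```
</p>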
      <p>
        While Hutter et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] state that the feature extraction method
should be efficient, in our framework only the computation of a
subset of these attributes has to be performed online, namely the
computation of the instance specific metrics. The creation of the diagnosis
models, the computation of the basic model features, i.e. feature sets
1 to 4, and the training of the classifier are performed offline. Thus,
online only the observation specific metrics, i.e. feature set 5, have to
be extracted, the algorithm appropriate for the problem instance has to
be predicted, and the diagnoses have to be calculated.
      </p>
    </sec>
    <sec id="sec-9">
      <title>Empirical Evaluation</title>
      <p>In this section we evaluate the feasibility of our meta-approach. On
the one hand we assess the quality of the model properties to train
a machine learning classifier capable of predicting the most efficient
algorithm out of the portfolio in regard to its runtime for a specific
PHCAP instance. On the other hand, we examine the efficiency of
the meta-approach overall in comparison to the abductive reasoning
algorithms in the portfolio.</p>
      <p>
Our meta-approach itself is implemented in Java, as are the
ATMS engine, BHSTree5 and Berge’s algorithm. For the remaining
hitting set algorithms, i.e. HS-DAG and HST, we exploit PyMBD6
[
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], a publicly available Python library of minimal hitting set
algorithms. To create a predictor based on the features we utilize the
Waikato Environment for Knowledge Analysis (WEKA) library [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
which provides a vast variety of classification algorithms. The
experiment was conducted on a Mac Pro (Late 2013) with a 2.7 GHz
12-Core Intel Xeon E5 processor and 64 GB of RAM running OS X
10.10.5.
3 Based on the projection of the undirected graph only containing hypothesis nodes.
4 Path length between hypothesis vertices.
5 http://www.ist.tugraz.at/modremas/index.html
6 http://modiaforted.ist.tugraz.at/downloads/pymbd.zip
      </p>
      <p>Data
In order to evaluate the meta-algorithm we generated a test suite of
artificial samples. Note that while there are real-world samples
based on FMEAs, we do not incorporate them in this evaluation,
as the size of the practical models is rather limited. The synthetic
samples obtained differ in their number of hypotheses, effects, and
connections, as well as in the covering and overlap between causes
and manifestations. Conjunctions of causes and disjunctions of
manifestations were created randomly within the samples, and it was
ensured that instances with various independent diagnosis subproblems
were present in the test suite. A total of 195 samples were created
with a varying number of hypotheses (12 ≤ |Hyp| ≤ 3120),
effects (1 ≤ |M| ≤ 5000), clauses (12 ≤ |Th| ≤ 5100) and
observations (1 ≤ |Obs| ≤ 30). With each experiment run we collected the 31
metrics described in the previous section to build our feature vector7
for the classification. For each problem instance we executed our
abductive reasoning algorithms and recorded the most efficient one.
To evaluate the classification based on the features we randomly split
the test suite into a training set comprising 80% of the data and a test
set holding the remaining 20% of the examples. We standardized our
data before performing any classification.</p>
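      <p>The standardization step can be sketched as a column-wise z-score transformation; in the experiment itself this is done via WEKA rather than by hand-written code.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Column-wise z-score standardization of a feature matrix (sketch;
    the evaluation uses WEKA's built-in filter instead of this code)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]   # guard constant columns
    return [[(v - mu) / sg for v, mu, sg in zip(row, mus, sigmas)]
            for row in rows]

# standardize([[0.0, 10.0], [2.0, 10.0]]) -> [[-1.0, 0.0], [1.0, 0.0]]
```
</p>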
      <p>Results
Before selecting the classification method, we performed cross-validation
with several classification algorithms available in WEKA on
the training data. Based on the accuracy obtained, we decided to use
WEKA’s implementation of a general algorithm for locally weighted
learning (LWL). While within this experiment we did not compare
different classifiers beyond this initial informal evaluation, we do
believe that examining various machine learning algorithms will be of
interest in future research on this matter. LWL performs prediction
by building a local model based on the weighted neighborhood data
of the attribute of interest. In our case this attribute is nominal
and simply corresponds to the algorithm’s name. In regard to the
parameters, we utilized a brute force nearest neighbor search algorithm,
included all neighbors in the linear weighting function, and used an
entropy-based classifier.</p>
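      <p>The prediction scheme can be illustrated with a toy analogue of locally weighted learning: every training point is weighted by a linear kernel of its distance to the query, and the weighted majority label wins. This is a rough sketch of the configuration described above, not WEKA’s actual LWL implementation.

```python
import math

def lwl_predict(train_x, train_y, query):
    """Toy nominal prediction in the spirit of locally weighted learning:
    linear distance weighting over all neighbours, weighted majority vote."""
    dists = [math.dist(query, t) for t in train_x]
    dmax = max(dists) or 1.0
    weights = [1 - d / dmax for d in dists]   # linear kernel, all neighbours
    votes = {}
    for w, label in zip(weights, train_y):
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

pred = lwl_predict([(0, 0), (0, 1), (5, 5)],
                   ["berge", "berge", "hsdag"], (0, 0.5))
# the two nearby "berge" points dominate the vote
```
</p>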
      <p>
The classification utilizing LWL and based on the metrics reaches
a satisfactory success rate of 71.79% correctly predicted instances
(MAE=0.22, RMSE=0.31) on the test set, i.e. in these cases the
selected algorithm was in fact the most efficient one for the problem. The
confusion matrix in Table 2 shows the number of correctly and wrongly
classified instances. From the contingency table it is apparent that within the
test set Berge’s algorithm was the dominant approach; thus, our
predictor classified all but one instance as Berge’s algorithm. A known
limitation of this type of algorithm selection, where simply a single
approach is chosen and executed on the instance, is that in case the
prediction is incorrect the meta-approach might be rather inefficient
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Note that in the cases where our classifier chose a slower approach,
the selected algorithm was the second fastest within the portfolio. On
the test set the meta-approach was on average 1.57% slower than an
optimal algorithm selection (MetAB Opt), i.e. a predictor that
classifies every instance correctly, would have been.
7 Note that the feature vector itself holds 32 values, as one is the nominal
value corresponding to the algorithm’s name. This last feature is to be
predicted.
      </p>
      <p>We explored WEKA’s attribute selection in order to determine
whether we could remove certain features while achieving the same
prediction accuracy. Utilizing the meta-classifier with the LWL
classifier, we examined various selection approaches on the training data
and could reduce the set of features significantly, from 32 to around
four8. The number and composition of the reduced attribute set
depends highly on the selection process performed. For example,
utilizing the OneR evaluator and limiting the number of features of
the reduced set to three, we receive the attribute set consisting of
the number of current diagnosis problems, the number of
observations, and the number of effects of the entire model. Attribute selection on
grounds of the information gain results in the number of current
diagnosis problems, the number of observations, the average path length on
the hypergraph and its standard deviation. Utilizing the SVM-based
reduction, we receive the number of observations, the standard
deviation of the indegree of the nodes representing the observations, the
average current covering relation and its standard deviation as well
as the average of the covering relation over the entire model. As
can be seen, the size of Obs plays an essential role in predicting the
preferable algorithm. The remaining selected properties provide
information on the PHCAP, i.e. the current observations, and various
metrics on how hypotheses are connected through effects. An in-depth
analysis of reduced feature sets would be a topic for further research.</p>
      <p>[Table 2. Confusion matrix of the LWL classifier on the 39-instance
test set; only fragments of the table are recoverable: the HST column
(0, 0, 0, 0, 1, 1) and the totals column (27, 0, 2, 8, 1; overall 39).]</p>
      <p>Figure 2 shows the distribution of the log runtime data for the
test set.
8 The feature of the most efficient algorithm remains of course within the set
after selection and therefore accounts for one feature in the reduced vector.</p>
      <p>In case of our artificial examples the meta-approach MetAB
(M=0.83 ms, SD=1.54 ms) performs well, i.e. it is on average the
most efficient. On average it is 99% faster than HST (M=82.78 ms,
SD=455.65 ms), 94.86% more efficient than the ATMS (M=16.16
ms, SD=76.76 ms), around 85% faster than BHSTree (M=5.57 ms,
SD=30.43 ms) and Berge’s algorithm (M=5.9 ms, SD=32.72 ms),
and still computes diagnoses around 58.27% faster than the HS-DAG
(M=1.99 ms, SD=4.5 ms).</p>
      <p>The overall runtime of the meta-algorithm is determined by (1)
the computation of the online metrics, (2) the time it takes to create
the feature vector, supply it to the classifier, and predict the best
algorithm, and (3) the diagnosis time of the suggested abduction
procedure. In regard to the feasibility of the meta-approach overall in
comparison to other abductive diagnosis methods, we would like to refer
back to a particular characteristic of model-based diagnosis, namely
the availability of a system description offline. As the model has to
be present before the computation of the diagnoses, it allows us to
extract most of the metrics utilized in the algorithm selection offline.
Thus, the online computation of the features which are inherent to the
specific instance of the PHCAP is negligible (&lt; 0.1 ms). The time
to predict the algorithm for a single instance was likewise insignificant
for the diagnosis models we investigated. The third factor, the diagnosis
time, depends strongly on the predictive capabilities of the classifier.</p>
      <p>A cursory analysis of the results on the test data would
suggest that applying Berge’s method to every instance would yield
near-optimal runtime for most problems. However, from the
cumulative log runtimes in Figure 3a we can observe that, based on
the entire set of problems, i.e. test and training, Berge’s algorithm is not
the most efficient approach, as on several instances its computation time
is notably larger than that of other algorithms. On the entire sample
we observe that, considering only the algorithms from the
portfolio, HS-DAG is on average the best performing approach (M=5.5
ms, SD=32.31 ms), followed by Berge’s algorithm (M=15.62 ms,
SD=68.39 ms) and BHSTree (M=16.88 ms, SD=76.6 ms), while
the ATMS (M=101.37 ms, SD=563.85 ms) still outperforms HST
(M=8968.04 ms, SD=70873.65 ms). Furthermore, based on the
prediction accuracy on the test set, we have created a mock
meta-approach (MetAB 71.79%) with the 71.79% accuracy we
experienced on the test data. Thus, in 71.79% of the samples we recorded
for this approach the optimal time and for the remaining 28.21% the
second fastest time; we chose the instances with the slower
runtimes randomly. This mock meta-approach would still outperform
the other algorithms in the portfolio. In Figure 3b we depict
the distribution of the log runtimes of the various approaches.</p>
    </sec>
    <sec id="sec-10">
      <title>Conclusion</title>
      <p>Even though abduction is an intuitive approach to diagnosis, its
computational complexity remains a disadvantage. Since the complexity
is inherent to the underlying diagnosis problem, we investigate
algorithm selection as a method to predict the most efficient abduction
approach for a particular instance. An essential part within algorithm
selection is the exploration of characteristics of the problems which
contribute to the computational effort. Hence, we consider various
metrics characterizing the type of models generated from failure
assessments available in practice. An advantage of this approach in the
context of abductive diagnosis is that the majority of these features
can be collected offline. Based on the attributes we form a feature
vector for a machine learning classifier as part of a meta-approach.
The empirical evaluation showed that the extracted properties of the
instances make it possible to determine the “best” abduction method. Even in
cases where the classification is incorrect, the approach selects the
second most efficient algorithm and thus overall outperforms the
other diagnosis methods. Therefore, we believe that this meta-approach
is a feasible alternative to continuously using a single abduction
procedure. Despite the satisfactory classification results, we plan
to further extend the set of problem metrics, explore the
capabilities of various machine learning classifiers, and determine
different attribute combinations yielding a more accurate classification
result.</p>
    </sec>
    <sec id="sec-11">
      <title>ACKNOWLEDGEMENTS</title>
      <p>The work presented in this paper has been supported by the FFG
project Applied Model Based Reasoning (AMOR) under grant
842407. We would further like to express our gratitude to our
industrial partner, Uptime Engineering GmbH.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Joachim</given-names>
            <surname>Baumeister</surname>
          </string-name>
          and Dietmar Seipel, '
          <article-title>Diagnostic reasoning with multilevel set-covering models'</article-title>
          ,
          <source>Technical report, DTIC Document</source>
          , (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Claude</given-names>
            <surname>Berge</surname>
          </string-name>
          . Hypergraphs, volume
          <volume>45</volume>
          of north-holland mathematical library,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Luca</given-names>
            <surname>Console</surname>
          </string-name>
          , Daniele Theseider Dupre, and Pietro Torasso, '
          <article-title>On the Relationship Between Abduction and Deduction'</article-title>
          ,
          <source>Journal of Logic and Computation</source>
          ,
          <volume>1</volume>
          (
          <issue>5</issue>
          ),
          <fpage>661</fpage>
          -
          <lpage>690</lpage>
          , (
          <year>1991</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Johan de Kleer, '
          <article-title>An assumption-based TMS'</article-title>
          ,
          <source>Artificial Intelligence</source>
          ,
          <volume>28</volume>
          ,
          <fpage>127</fpage>
          -
          <lpage>162</lpage>
          , (
          <year>1986</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] Johan de Kleer, '
          <article-title>Problem solving with the ATMS'</article-title>
          ,
          <source>Artificial Intelligence</source>
          ,
          <volume>28</volume>
          (
          <issue>2</issue>
          ),
          <fpage>197</fpage>
          -
          <lpage>224</lpage>
          , (
          <year>1986</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Marc</given-names>
            <surname>Denecker</surname>
          </string-name>
          and Antonis Kakas, '
          <article-title>Abduction in logic programming'</article-title>
          ,
          <source>in Computational Logic: Logic Programming and Beyond</source>
          ,
          <volume>402</volume>
          -
          <fpage>436</fpage>
          , Springer, (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Eiter</surname>
          </string-name>
          and Georg Gottlob, '
          <article-title>The complexity of logic-based abduction'</article-title>
          ,
          <source>Journal of the ACM (JACM)</source>
          ,
          <volume>42</volume>
          (
          <issue>1</issue>
          ),
          <fpage>3</fpage>
          -
          <lpage>42</lpage>
          , (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Eiter</surname>
          </string-name>
          , Kazuhisa Makino, and Georg Gottlob, '
          <article-title>Computational aspects of monotone dualization: A brief survey'</article-title>
          ,
          <source>Discrete Applied Mathematics</source>
          ,
          <volume>156</volume>
          (
          <issue>11</issue>
          ),
          <fpage>2035</fpage>
          -
          <lpage>2049</lpage>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Gerhard</given-names>
            <surname>Friedrich</surname>
          </string-name>
          , Georg Gottlob, and Wolfgang Nejdl, '
          <article-title>Hypothesis classification, abductive diagnosis and therapy'</article-title>
          ,
          <source>in Expert Systems in Engineering Principles and Applications</source>
          ,
          <volume>69</volume>
          -
          <fpage>78</fpage>
          , Springer, (
          <year>1990</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Russell</given-names>
            <surname>Greiner</surname>
          </string-name>
          , Barbara A. Smith, and Ralph W. Wilkerson, '
          <article-title>A correction to the algorithm in Reiter's theory of diagnosis'</article-title>
          ,
          <source>Artificial Intelligence</source>
          ,
          <volume>41</volume>
          (
          <issue>1</issue>
          ),
          <fpage>79</fpage>
          -
          <lpage>88</lpage>
          , (
          <year>1989</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Mark</given-names>
            <surname>Hall</surname>
          </string-name>
          , Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Reutemann</surname>
          </string-name>
          , and Ian H Witten, '
          <article-title>The weka data mining software: an update'</article-title>
          ,
          <source>ACM SIGKDD explorations newsletter</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          , (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Peter G.</given-names>
            <surname>Hawkins</surname>
          </string-name>
          and Davis J. Woollons
          , '
          <article-title>Failure modes and effects analysis of complex engineering systems using functional models'</article-title>
          ,
          <source>Artificial Intelligence in Engineering</source>
          ,
          <volume>12</volume>
          ,
          <fpage>375</fpage>
          -
          <lpage>397</lpage>
          , (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Frank</surname>
            <given-names>Hutter</given-names>
          </string-name>
          , Youssef Hamadi,
          <string-name>
            <surname>Holger H Hoos</surname>
          </string-name>
          , and Kevin LeytonBrown, '
          <article-title>Performance prediction and automated tuning of randomized and parametric algorithms', in Principles and Practice of Constraint Programming-CP</article-title>
          <year>2006</year>
          ,
          <volume>213</volume>
          -
          <fpage>228</fpage>
          , Springer, (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Frank</surname>
            <given-names>Hutter</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            <given-names>Xu</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holger H Hoos</surname>
          </string-name>
          , and
          <string-name>
            <surname>Kevin</surname>
          </string-name>
          Leyton-Brown, '
          <article-title>Algorithm runtime prediction: Methods &amp; evaluation'</article-title>
          ,
          <source>Artificial Intelligence</source>
          ,
          <volume>206</volume>
          ,
          <fpage>79</fpage>
          -
          <lpage>111</lpage>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Antonis</surname>
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Kakas</surname>
          </string-name>
          , Robert A.
          <string-name>
            <surname>Kowalski</surname>
          </string-name>
          , and Francesca Toni, '
          <article-title>Abductive logic programming'</article-title>
          ,
          <source>Journal of logic and computation</source>
          ,
          <volume>2</volume>
          (
          <issue>6</issue>
          ),
          <fpage>719</fpage>
          -
          <lpage>770</lpage>
          , (
          <year>1992</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Richard</surname>
            <given-names>M Karp</given-names>
          </string-name>
          ,
          <article-title>Reducibility among combinatorial problems</article-title>
          , Springer,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Lars</surname>
            <given-names>Kotthoff</given-names>
          </string-name>
          , '
          <article-title>Algorithm selection for combinatorial search problems: A survey'</article-title>
          ,
          <source>arXiv preprint arXiv:1210.7959</source>
          , (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Matthieu</surname>
            <given-names>Latapy</given-names>
          </string-name>
          ,
          <source>Cle´mence Magnien, and Nathalie Del Vecchio</source>
          , '
          <article-title>Basic notions for the analysis of large two-mode networks', Social networks</article-title>
          ,
          <volume>30</volume>
          (
          <issue>1</issue>
          ),
          <fpage>31</fpage>
          -
          <lpage>48</lpage>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Hector J.</given-names>
            <surname>Levesque</surname>
          </string-name>
          , '
          <article-title>A knowledge-level account of abduction</article-title>
          ', in IJCAI
          , pp.
          <fpage>1061</fpage>
          -
          <lpage>1067</lpage>
          , (
          <year>1989</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Kevin</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          , Eugene Nudelman, Galen Andrew,
          <string-name>
            <surname>Jim McFadden</surname>
          </string-name>
          , and Yoav Shoham, '
          <article-title>A portfolio approach to algorithm selection'</article-title>
          ,
          in IJCAI
          , volume
          <volume>1543</volume>
          , p.
          <year>2003</year>
          , (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Li</given-names>
            <surname>Lin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yunfei</given-names>
            <surname>Jiang</surname>
          </string-name>
          , '
          <article-title>The computation of hitting sets: Review and new algorithms'</article-title>
          ,
          <source>Information Processing Letters</source>
          ,
          <volume>86</volume>
          (
          <issue>4</issue>
          ),
          <fpage>177</fpage>
          -
          <lpage>184</lpage>
          , (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name><given-names>Pierre</given-names> <surname>Marquis</surname></string-name>, '
          <article-title>Consequence finding algorithms'</article-title>,
          <source>in Handbook of Defeasible Reasoning and Uncertainty Management Systems</source>,
          <fpage>41</fpage>-<lpage>145</lpage>, Springer, (<year>2000</year>).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name><given-names>Sheila A</given-names> <surname>McIlraith</surname></string-name>, '
          <article-title>Logic-based abductive inference'</article-title>,
          <source>Knowledge Systems Laboratory, Technical Report KSL-98-19</source>, (<year>1998</year>).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name><given-names>Michael</given-names> <surname>Morak</surname></string-name>,
          <string-name><given-names>Nysret</given-names> <surname>Musliu</surname></string-name>,
          <string-name><given-names>Reinhard</given-names> <surname>Pichler</surname></string-name>,
          <string-name><given-names>Stefan</given-names> <surname>Rümmele</surname></string-name>, and
          <string-name><given-names>Stefan</given-names> <surname>Woltran</surname></string-name>, '
          <article-title>Evaluating tree-decomposition based algorithms for answer set programming'</article-title>,
          <source>in Learning and Intelligent Optimization</source>,
          <fpage>130</fpage>-<lpage>144</lpage>, Springer, (<year>2012</year>).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name><given-names>Abbe</given-names> <surname>Mowshowitz</surname></string-name> and
          <string-name><given-names>Matthias</given-names> <surname>Dehmer</surname></string-name>, '
          <article-title>Entropy and the complexity of graphs revisited'</article-title>,
          <source>Entropy</source>,
          <volume>14</volume>(<issue>3</issue>),
          <fpage>559</fpage>-<lpage>570</lpage>, (<year>2012</year>).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name><given-names>Nysret</given-names> <surname>Musliu</surname></string-name> and
          <string-name><given-names>Martin</given-names> <surname>Schwengerer</surname></string-name>, '
          <article-title>Algorithm selection for the graph coloring problem'</article-title>,
          <source>in Learning and Intelligent Optimization</source>,
          <fpage>389</fpage>-<lpage>403</lpage>, Springer, (<year>2013</year>).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name><given-names>Gustav</given-names> <surname>Nordh</surname></string-name> and
          <string-name><given-names>Bruno</given-names> <surname>Zanuttini</surname></string-name>, '
          <article-title>What makes propositional abduction tractable'</article-title>,
          <source>Artificial Intelligence</source>,
          <volume>172</volume>(<issue>10</issue>),
          <fpage>1245</fpage>-<lpage>1284</lpage>, (<year>2008</year>).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name><given-names>Mattias</given-names> <surname>Nyberg</surname></string-name>, '
          <article-title>A generalized minimal hitting-set algorithm to handle diagnosis with behavioral modes'</article-title>,
          <source>IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans</source>,
          <volume>41</volume>(<issue>1</issue>),
          <fpage>137</fpage>-<lpage>148</lpage>, (<year>2011</year>).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Ekaterina</surname>
            <given-names>Ovchinnikova</given-names>
          </string-name>
          , Niloofar Montazeri, Theodore Alexandrov,
          <string-name>
            <surname>Jerry R Hobbs</surname>
          </string-name>
          ,
          <article-title>Michael C McCord,</article-title>
          and
          <string-name>
            <surname>Rutu</surname>
          </string-name>
          Mulkar-Mehta, '
          <article-title>Abductive reasoning with a large knowledge base for discourse processing'</article-title>
          , in Computing Meaning,
          <fpage>107</fpage>
          -
          <lpage>127</lpage>
          , Springer, (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name><given-names>Yun</given-names> <surname>Peng</surname></string-name> and
          <string-name><given-names>James A</given-names> <surname>Reggia</surname></string-name>, '
          <article-title>Plausibility of diagnostic hypotheses: The nature of simplicity'</article-title>,
          <source>in AAAI</source>, volume
          <volume>86</volume>, pp.
          <fpage>140</fpage>-<lpage>145</lpage>, (<year>1986</year>).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name><given-names>Yun</given-names> <surname>Peng</surname></string-name> and
          <string-name><given-names>James A</given-names> <surname>Reggia</surname></string-name>,
          <source>Abductive Inference Models for Diagnostic Problem-Solving</source>, Springer,
          <year>1990</year>.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name><given-names>David</given-names> <surname>Poole</surname></string-name> and
          <string-name><given-names>Keiji</given-names> <surname>Kanazawa</surname></string-name>, '
          <article-title>A decision-theoretic abductive basis for planning'</article-title>,
          <source>in AAAI Spring Symposium on Decision-Theoretic Planning</source>, (<year>1994</year>).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name><given-names>Thomas</given-names> <surname>Quaritsch</surname></string-name> and
          <string-name><given-names>Ingo</given-names> <surname>Pill</surname></string-name>, '
          <article-title>PyMBD: A library of MBD algorithms and a light-weight evaluation platform'</article-title>,
          <source>Proceedings of DX 2014</source>, (<year>2014</year>).
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name><given-names>Raymond</given-names> <surname>Reiter</surname></string-name>, '
          <article-title>A theory of diagnosis from first principles'</article-title>,
          <source>Artificial Intelligence</source>,
          <volume>32</volume>(<issue>1</issue>),
          <fpage>57</fpage>-<lpage>95</lpage>, (<year>1987</year>).
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name><given-names>John R</given-names> <surname>Rice</surname></string-name>, '
          <article-title>The algorithm selection problem'</article-title>, (<year>1975</year>).
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Franz</surname>
            <given-names>Wotawa</given-names>
          </string-name>
          , '
          <article-title>A variant of reiter's hitting-set algorithm'</article-title>
          ,
          <source>Information Processing Letters</source>
          ,
          <volume>79</volume>
          (
          <issue>1</issue>
          ),
          <fpage>45</fpage>
          -
          <lpage>51</lpage>
          , (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <surname>Franz</surname>
            <given-names>Wotawa</given-names>
          </string-name>
          , '
          <article-title>Failure mode and effect analysis for abductive diagnosis'</article-title>
          ,
          <source>in Proceedings of the International Workshop on Defeasible and Ampliative Reasoning (DARe-14)</source>
          , volume
          <volume>1212</volume>
          .
          <source>CEUR Workshop Proceedings, ISSN 1613-0073</source>
          , (
          <year>2014</year>
          ). http://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>1212</volume>
          /.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <surname>Lin</surname>
            <given-names>Xu</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            <given-names>Hutter</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holger H Hoos</surname>
          </string-name>
          , and
          <string-name>
            <surname>Kevin</surname>
          </string-name>
          Leyton-Brown, '
          <article-title>Satzilla: portfolio-based algorithm selection for sat'</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          ,
          <fpage>565</fpage>
          -
          <lpage>606</lpage>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name><given-names>Yifan</given-names> <surname>Yang</surname></string-name>,
          <string-name><given-names>Jamal</given-names> <surname>Atif</surname></string-name>, and
          <string-name><given-names>Isabelle</given-names> <surname>Bloch</surname></string-name>, '
          <article-title>Abductive reasoning using tableau methods for high-level image interpretation'</article-title>,
          <source>in KI 2015: Advances in Artificial Intelligence</source>,
          <fpage>356</fpage>-<lpage>365</lpage>, Springer, (<year>2015</year>).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>