<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning with safety requirements: state of the art and open questions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Leofante</string-name>
          <email>francesco.leofante@edu.unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Pulina</string-name>
          <email>lpulina@uniss.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Armando Tacchella</string-name>
          <email>armando.tacchella@unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIBRIS - Università di Genova</institution>
          <addr-line>Via all'Opera Pia 13; 16145 Genova;</addr-line>
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>POLCOMING - Università di Sassari - Viale Mancini 5;</institution>
          <addr-line>07100 Sassari;</addr-line>
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>11</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>A standing challenge for the adoption of machine learning (ML) techniques in safety-related applications is that misbehaviors can hardly be ruled out with traditional analytical or probabilistic techniques. In previous contributions of ours, we have shown how to deal with safety requirements considering both Multi-Layer Perceptrons (MLPs) and Support Vector Machines (SVMs), two of the most effective and well-known ML techniques. The key to providing safety guarantees for MLPs and SVMs is to combine Satisfiability Modulo Theory (SMT) solvers with suitable abstraction techniques. The purpose of this paper is to provide an overview of problems and related solution techniques when it comes to guaranteeing that the input-output function of trained computational models will behave according to specifications. We also summarize experimental results showing what can be effectively assessed in practice using state-of-the-art SMT solvers. Our empirical results are the starting point for introducing some open questions that we believe should be considered in future research about learning with safety requirements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Machine learning (ML) techniques are adopted in a wide range of research and
engineering domains. For instance, artificial agents that act in physical domains can be
equipped with learning capabilities in order to acquire the dynamics of interest, thus
saving developers from the burden of devising explicit models — see [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
        ] for
examples. However, in spite of some exceptions, ML techniques are confined to systems that comply with the lowest safety integrity levels, achievable with standard industrial best practices [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The main reason is the absence of effective safety assurance
methods for systems using ML techniques, because traditional analytical and probabilistic
methods can be ineffective in safety-related applications — see, e.g., [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In practice, guaranteeing safety amounts to controlling undesirable behaviors when the agent acts based on a model of the environment. In some cases of interest, a further challenge is the cost of sampling, i.e., acquiring new data for learning, so that the challenge of ensuring safety must also be met under a paucity of samples.
      </p>
      <p>
        In previous contributions of ours we have shown effective approaches to learning
with safety requirements. In particular, in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] we tackled the problem of ensuring that
a trained Multi-Layer Perceptron (MLP) emits outputs that are guaranteed to stay within a range of values considered to be safe. We called this global safety in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], where
we introduced two additional problems, namely local safety and stability. In the former case, we wish to guarantee that an input close to a training input sample will elicit an output close to the corresponding training output sample. In the latter case, we wish to ensure that if the input is contained within some bounded region, the output will also be contained within a corresponding bounded set. Since effective learning with MLPs requires a large number of samples, and their empirical tuning is often fraught with difficulties, in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] we considered safety requirements in conjunction with kernel-based methods. In order to reduce the number of samples we considered Active Learning
(AL) — see, e.g., [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] — and we focused on Support Vector Regression (SVR) — see,
e.g., [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for a tutorial introduction.
      </p>
      <p>
        In all the contributions above, the problem of providing safety guarantees is encoded into formulas that can be solved automatically by Satisfiability Modulo Theory (SMT) solvers [
        <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Intuitively, verification of MLPs/SVRs via SMT involves abstraction of the learned model into a set of constraints expressed in linear arithmetic over real numbers. The abstraction process is guaranteed to be conservative, i.e., it provides an over-approximation of the concrete MLP/SVR. In this way, when the SMT solver proves that the response of the abstract MLP/SVR cannot exceed a stated bound as long as the input values satisfy given preconditions, we can certify that the concrete MLP/SVR has the same property. On the other hand, if a counterexample is found, then it is either an artefact of the abstraction, or a true counterexample proving that the MLP/SVR is not satisfactory in terms of safety. The former case calls for a refinement of the abstraction, whereas the latter may entail further training of the MLP/SVR.
      </p>
      <p>
        The experimental assessment that we present considers SMT encodings obtained
from different verification problems. In the case of MLPs, the encodings are related to
control subsystems in the humanoids James [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and iCub [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In the case of James,
measuring external forces using a single force/torque sensor placed along the kinematic
chain of the arm requires compensation of the manipulator dynamics, i.e., the
contribution of internal forces must be subtracted from sensor readings [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. MLP-based
estimation of internal forces relies on angular positions and velocities of two shoulder
and two elbow joints, and takes into account components from gravity, Coriolis forces,
and manipulator inertia. In the case of iCub, we consider MLP-based extrapolation of
forward kinematics of its arms. In particular, we consider angular positions of three
shoulder, one elbow and three wrist joints as inputs, and end-effector positions as
outputs. Finally, in the case of SVRs, we considered an artificial physical domain simulated
with V-REP [
        <xref ref-type="bibr" rid="ref15">15</xref>
          ], with which we simulate the task of applying the right amount of force to send a ball into a goal in the presence of obstacles. We further assume that acquiring input samples is expensive, so that active learning instead of batch learning is required.
      </p>
      <p>
        The remainder of this paper is organized as follows. Section 2 gives background
notions on MLPs, SVRs, and SMT solvers. Section 3 summarizes the abstractions
introduced in [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6–8</xref>
        ] and introduces global, local and stability requirements. The state of
the art is presented in Section 4. Concluding remarks and open questions are presented
in Section 5.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <sec id="sec-2-1">
        <title>Neural networks for regression</title>
        <p>
          A Multi-Layer Perceptron (MLP) is a function ν : Rn → Rm obtained by feeding
n inputs through a network of neurons, i.e., computational units arranged in multiple
layers ending with m outputs, where each neuron is characterized by an activation
function σ : R → R. We consider networks with only three cascaded layers, namely
input, hidden, and output layer, so that there are always n neurons in the input layer, and
m neurons in the output layer. Let x = (x1, . . . , xn) be the input signal to the network
ν. We assume that input neurons compute the activation function σ(r) = r, i.e., their
task is simply to pass forward the components of the input signal. Let h denote the
number of hidden neurons; then the input of the j-th hidden neuron is
rj = Σ_{i=1}^{n} aji xi + bj ,    j ∈ {1, . . . , h}    (1)
where aji is the weight of the connection from the i-th neuron in the input layer (1 ≤
i ≤ n) to the j-th neuron in the hidden layer, and the constant bj is the bias of the j-th
neuron. We denote with A = {a11, a12, . . . a1n, . . . , ah1, ah2, . . . ahn} and with B =
{b1, . . . , bh} the corresponding sets of weights. The activation function σh computed
by hidden neurons is a non-constant, bounded and continuous function of its input.
According to [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] this guarantees that, in principle, ν will be capable of approximating
any continuous mapping f : Rn → Rm arbitrarily close. The input received by an
output neuron is
sk = Σ_{j=1}^{h} ckj σh(rj) + dk ,    k ∈ {1, . . . , m}    (2)
where ckj denotes the weight of the connection from the j-th neuron in the hidden layer
to the k-th neuron in the output layer, while dk represents the bias of the k-th output
neuron – as before, C and D denote the corresponding sets of weights. The output
signal of the MLP is a vector ν(x) = (σo(s1), . . . , σo(sm)). The activation function of
output neurons can be chosen as σo = σh, so that each output of ν is constrained to the
range of σh, or as σo(r) = r, so that ν(x) = (s1, . . . , sm).
        </p>
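To make the definitions above concrete, the forward pass ν(x) of such a three-layer network can be sketched in a few lines of Python. This is a minimal illustration with hypothetical function and parameter names (A, B, C, D passed as nested lists), not one of the networks trained in the experiments; the hidden activation defaults to tanh, one of the activations considered later, and the output activation is the identity.

```python
import math

def mlp_forward(x, A, B, C, D, sigma_h=math.tanh):
    """Forward pass of a three-layer MLP nu: R^n -> R^m.
    A[j][i]: weight from input i to hidden neuron j; B[j]: hidden biases;
    C[k][j]: weight from hidden neuron j to output k; D[k]: output biases.
    Output activation is the identity, so nu(x) = (s_1, ..., s_m)."""
    # Hidden layer: r_j = sum_i a_ji x_i + b_j, then apply sigma_h (eq. 1)
    h = [sigma_h(sum(A[j][i] * x[i] for i in range(len(x))) + B[j])
         for j in range(len(B))]
    # Output layer: s_k = sum_j c_kj sigma_h(r_j) + d_k (eq. 2)
    return [sum(C[k][j] * h[j] for j in range(len(h))) + D[k]
            for k in range(len(D))]
```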
        <p>The key aspect of MLPs is that the sets of parameters A, B, C, D are not manually
determined but are computed using an iterative optimization procedure known as
training. Let R be a training set comprised of l pairs R = {(x1, y1), . . . , (xl, yl)} where for each 1 ≤ i ≤ l, the vector xi is some input signal (pattern) and yi is the corresponding output signal (label). If we assume that R is generated by some unknown function f : Rn → Rm, then we can see training as a way of extrapolating the unknown function f from the signals in the training set. The goal of training is to determine the parameters A, B, C, D which maximize the similarity between ν(x) and y for each pair (x, y) ∈ R.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Kernel-based methods for regression</title>
        <p>
          Support Vector Regression (SVR) is a supervised learning paradigm, whose
mathematical foundations are derived from Support Vector Machine (SVM) [
          <xref ref-type="bibr" rid="ref17 ref18 ref19">17–19</xref>
          ]. Suppose we are given a set R = {(x1, y1) . . . (xl, yl)} ⊂ X × R, where X is the input space with X = Rd. In the context of ε-SVR [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] for the linear case, our aim is to compute a function ν(x) which takes the form
        </p>
        <p>
          ν(x) = w · x + b  with w ∈ X , b ∈ R    (3)
such that ν(x) deviates at most ε from the targets yi, with ‖w‖ as small as possible. A solution to this problem is presented in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], where slack variables ξi, ξi∗ are introduced to handle otherwise infeasible constraints. The formulation of the optimization problem stated above then becomes:
minimize    (1/2)‖w‖² + C Σ_{i=1}^{l} (ξi + ξi∗)
subject to  yi − w · xi − b ≤ ε + ξi
            w · xi + b − yi ≤ ε + ξi∗
            ξi, ξi∗ ≥ 0    (4)
The constant C &gt; 0 determines the trade-off between the “flatness” of ν, i.e., how small ‖w‖ is, and how much we want to tolerate deviations greater than ε.
        </p>
        <p>In order to solve the above stated optimization problem one can use the standard dualization method based on Lagrange multipliers. The Lagrange function writes
L = (1/2)‖w‖² + C Σ_{i=1}^{l} (ξi + ξi∗) − Σ_{i=1}^{l} (ηi ξi + ηi∗ ξi∗) − Σ_{i=1}^{l} αi (ε + ξi − yi + w · xi + b) − Σ_{i=1}^{l} αi∗ (ε + ξi∗ + yi − w · xi − b)    (5)
where ηi, ηi∗, αi, αi∗ ≥ 0 are Lagrange multipliers. Solving the dual problem allows one to rewrite (3) as</p>
        <p>ν(x) = Σ_{i=1}^{l} (αi − αi∗) xi · x + b    (6)</p>
        <p>
The algorithm can be modified to handle non-linear cases. This could be done by
mapping the input samples xi into some feature space F by means of a map Φ : X → F .
However, this approach can easily become computationally infeasible for high-dimensional input data. A better solution can be obtained by applying the so-called
kernel trick. Observing eq. (6) it is possible to see that the SVR algorithm only
depends on dot products between patterns xi. Hence it suffices to know the function
K(x, x0) = Φ(x) · Φ(x0) rather than Φ explicitly. By introducing this idea the
optimization problem can be rewritten, leading to the non-linear version of (6):
ν(x) = Σ_{i=1}^{l} (αi − αi∗) K(xi, x) + b    (7)</p>
        <p>
where K(·) is called a kernel function as long as it fulfills some additional conditions —
see, e.g., [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] for a full characterization of K.
        </p>
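The kernel expansion of eq. (7) can be transcribed directly. The snippet below is a minimal illustrative sketch with hypothetical argument names, using the RBF kernel that our later analysis focuses on; it is not an SVR trained as in the experiments.

```python
import math

def svr_predict(x, support_vectors, coefs, b, gamma=1.0):
    """Evaluate the kernel expansion of eq. (7):
    nu(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b.
    coefs[i] holds the difference (alpha_i - alpha_i^*); K is the RBF
    kernel K(xi, x) = exp(-gamma * ||xi - x||^2)."""
    def K(xi, z):
        return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(xi, z)))
    return sum(c * K(xi, x) for xi, c in zip(support_vectors, coefs)) + b
```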
      </sec>
      <sec id="sec-2-3">
        <title>Satisfiability Modulo Theory</title>
        <p>
          A Constraint Satisfaction Problem (CSP) [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] is defined over a constraint network, i.e., a finite set of variables, each ranging over a given domain, and a finite set of constraints.
Intuitively, a constraint represents a combination of values that a certain subset of
variables is allowed to take. In this paper we are concerned with CSPs involving linear
arithmetic constraints over real-valued variables. From a syntactical point of view, the
constraint sets that we consider are built according to the following grammar:
set → { constraint }∗ constraint
constraint → ({ atom }∗ atom)
        </p>
        <p>
          atom → bound | equation
bound → value relop value
relop → &lt; | ≤ | = | ≥ | &gt;
value → var | - var | const
equation → var = value - value
| var = value + value
| var = const · value
where const is a real value, and var is the name of a (real-valued) variable. Formally, this
grammar defines a fragment of linear arithmetic over real numbers known as
Quantifier-Free Linear Arithmetic over Reals (QF LRA) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>From a semantical point of view, let X be a set of real-valued variables, and μ be
an assignment of values to the variables in X, i.e., a partial function μ : X → R. For
all μ, we assume that μ(c) = c for every c ∈ R, and that relopμ, +μ, −μ, ·μ denote the
standard interpretations of relational and arithmetic operators over real numbers. Linear
constraints are thus interpreted as follows:
– A constraint (a1 . . . an) is true exactly when at least one of the atoms ai with i ∈
{1, . . . , n} is true.
– Given x, y ∈ X ∪ R, an atom x relop y is true exactly when μ(x) relopμ μ(y)
holds.
– Given x ∈ X, y, z ∈ X ∪ R, and c ∈ R,
• x = y ◦ z with ◦ ∈ {+, −} is true exactly when μ(x) =μ μ(y) ◦μ μ(z),
• x = c · y is true exactly when μ(x) =μ μ(c) ·μ μ(y).
Given an assignment, a constraint is thus a function that returns a Boolean value. A
constraint is satisfied if it is true under a given assignment, and it is violated otherwise. A
constraint network is satisfiable if there exists an assignment that satisfies all the constraints in the network, and unsatisfiable otherwise.</p>
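The semantics above can be turned into a small evaluator. The tuple encoding of atoms below is our own illustrative convention (not a format defined in this paper): bounds are relop tuples, and the three equation forms get the tags "sum", "diff", and "scale".

```python
import operator

# Relational operators with their standard interpretation over the reals;
# "=" uses a tolerance since we work with floating-point values.
RELOPS = {"<": operator.lt, "<=": operator.le,
          "=": lambda a, b: abs(a - b) < 1e-9,
          ">=": operator.ge, ">": operator.gt}

def val(term, mu):
    """A term is a variable name (looked up in the assignment mu) or a constant."""
    return mu[term] if isinstance(term, str) else term

def atom_true(atom, mu):
    kind = atom[0]
    if kind in RELOPS:                            # bound: value relop value
        return RELOPS[kind](val(atom[1], mu), val(atom[2], mu))
    x, y, z = (val(t, mu) for t in atom[1:])      # equation atoms
    if kind == "sum":                             # x = y + z
        return abs(x - (y + z)) < 1e-9
    if kind == "diff":                            # x = y - z
        return abs(x - (y - z)) < 1e-9
    return abs(x - y * z) < 1e-9                  # "scale": x = c * y

def satisfied(network, mu):
    """A constraint (disjunction of atoms) is true when at least one atom is
    true; the network is satisfied when every constraint is."""
    return all(any(atom_true(a, mu) for a in c) for c in network)
```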
        <p>
          While different approaches are available to solve CSPs, this work is confined to
Satisfiability Modulo Theories (SMT) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. SMT is the problem of deciding the
satisfiability of a first-order formula with respect to some decidable theory T. In particular,
SMT generalizes the Boolean satisfiability problem (SAT) by adding background
theories such as the theory of real numbers, the theory of integers, and the theories of data
structures (e.g., lists, arrays and bit vectors). The idea behind SMT is that the
satisfiability of a constraint network can be decided in two steps. The first one is to convert
each arithmetic constraint to a Boolean constraint, while the second one is to check
the consistency of the assignment in the corresponding background theory. Checking
that the Boolean assignment is feasible in the underlying mathematical theory can be
performed by a specialized reasoning procedure (an SMT solver) – any procedure that
computes feasibility of a set of linear inequalities in the case of QF LRA. If the
consistency check fails, then a new Boolean assignment is requested and the SMT solver goes
on until either an arithmetically consistent Boolean assignment is found, or no more
Boolean assignments can be found.
        </p>
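The two-step scheme just described can be sketched with a deliberately naive loop. A real SMT solver enumerates Boolean assignments with CDCL and learns theory lemmas from failed consistency checks, whereas this illustration brute-forces the assignments; all callback and parameter names here are hypothetical.

```python
from itertools import product

def lazy_smt(bool_vars, clauses, theory_consistent, atom_of):
    """Schematic 'lazy' SMT loop: enumerate Boolean assignments of the
    propositional abstraction, and return one whose implied set of
    arithmetic atoms is feasible in the background theory.
    clauses: list of disjunctions, each a list of (var, polarity) literals.
    atom_of: maps each Boolean variable to the arithmetic atom it stands for.
    theory_consistent: callback deciding feasibility of a set of atoms."""
    for bits in product([False, True], repeat=len(bool_vars)):
        assign = dict(zip(bool_vars, bits))
        if not all(any(assign[v] == pol for v, pol in cl) for cl in clauses):
            continue                        # Boolean abstraction violated
        atoms = [atom_of[v] for v in bool_vars if assign[v]]
        if theory_consistent(atoms):
            return assign                   # satisfiable
    return None                             # unsatisfiable
```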
      </sec>
    </sec>
    <sec id="sec-3">
      <title>SMT-based abstraction and verification techniques</title>
      <sec id="sec-3-1">
        <title>Abstraction for MLPs and SVRs</title>
        <p>Research on learning with safety requirements has the objective of verifying trained MLPs/SVRs, i.e., guaranteeing that, given some input preconditions, the mapping ν
respects stated output postconditions. In principle, this check can be automated in a
suitable formal logic L by proving</p>
        <p>
          |=L ∀x (π(x) → ϕ(ν(x)))    (8)
where π and ϕ are formulas in L expressing preconditions on input signals and
postconditions on output signals, respectively. Most often, activation functions for MLPs
are non-linear and transcendental and so are kernels for SVRs, which makes (8)
undecidable if L is chosen to be expressive enough to model such functions – see, e.g., [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ].
In [
          <xref ref-type="bibr" rid="ref6 ref8">6, 8</xref>
          ] this problem is tackled introducing an abstraction framework. Given ν and
properties π, ϕ, a corresponding abstract model ν˜ and abstract properties π˜, ϕ˜ are
defined so that
        </p>
        <p>|=L0 ∀x0 (π˜(x0) → ϕ˜(ν˜(x0)))    (9)
is decidable, and (8) holds whenever (9) holds. If this is not the case, then there is some abstract input x0 such that π˜(x0) is satisfied, but the abstract output ν˜(x0) violates ϕ˜.
There are two cases:
– If a concrete input x can be extracted from x0 such that also ν(x) violates ϕ, then
x0 is a realizable counterexample, and ν fails to uphold (8).
– On the other hand, if no concrete input x can be extracted such that ν(x) violates
ϕ, then x0 is a spurious counterexample.</p>
        <p>
          Spurious counterexamples arise because an abstract MLP/SVR ν˜ is “more permissive”
than its concrete counterpart ν. In this case, a refinement is needed, i.e., a new
abstraction “tighter” than ν˜ must be computed and checked. We call this
abstractionrefinement loop Counter-Example Triggered Abstraction Refinement (CETAR) in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Briefly stated, the main difference between CETAR and classical Counter-Example Guided Abstraction Refinement (CEGAR) is that in the latter counterexamples are used as a basis to refine the abstraction, whereas in the former counterexamples merely trigger the refinement, without being involved in computing the refinement. While
there is no theoretical guarantee that the CETAR loop terminates, in practice it is
expected that a network is declared safe or a realizable counterexample is found within a
small number of refinement steps.
        </p>
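The abstraction-refinement loop can be summarized schematically as follows. The five callbacks are placeholders, not the paper's actual procedures: building ν˜ at coarseness p, checking condition (9), testing whether a counterexample is realizable, and tightening the abstraction.

```python
def cetar(nu, abstract, check, realizable, refine, p0, max_steps=20):
    """Counter-Example Triggered Abstraction Refinement, schematically.
    abstract(nu, p): build the abstract model at coarseness p.
    check(model):    None if the abstract property (9) holds, else a
                     counterexample violating it.
    realizable(nu, cex): does a concrete input reproduce the violation?
    refine(p):       tighten the abstraction, e.g. halve p."""
    p = p0
    for _ in range(max_steps):
        cex = check(abstract(nu, p))
        if cex is None:
            return "safe", None
        if realizable(nu, cex):
            return "unsafe", cex        # realizable counterexample: (8) fails
        p = refine(p)                   # spurious counterexample: tighten
    return "unknown", None              # no termination guarantee in general
```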
        <p>
          Details about the definition of ν˜ for MLPs in such a way that L0 = QF LRA, as well as details of preconditions and postconditions to be checked, with related abstractions, are provided in [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ]. In the following, for the sake of explanation, we briefly summarize
how to encode SVR verification in a corresponding QF LRA problem as shown in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
In particular, our experimental analysis focuses on radial basis function (RBF) kernels
K(xi, x) = e^(−γ‖x−xi‖²)    (10)
with γ = 1/(2σ²). Given a SVR, we consider its input domain to be defined as I = D1 × . . . × Dn, where Di = [ai, bi] with ai, bi ∈ R, and the output domain to be defined as O = [c, d] with c, d ∈ R. A concrete domain D = [a, b] is abstracted to [D] = {[x, y] | a ≤ x ≤ y ≤ b}, i.e., the set of intervals inside D, where [x] denotes a generic element.
        </p>
        <p>To abstract the RBF kernel K, we observe that (10) corresponds to a Gaussian
distribution G(μ, σ) with mean μ = xi — the i-th support vector of ς — and variance
σ2 = 21γ . Given a real-valued abstraction parameter p, let us consider the interval [x, x+
p]. It is possible to show that the maximum and minimum of G(μ, σ) restricted to [x, x+
p] lie in the external points of such interval, unless μ ∈ [x, x + p] in which case the
maximum of G lies in x = μ. The abstraction K˜p is complete once we consider two points x0, x1 with x0 &lt; μ &lt; x1 defined as the points in which G(xi) ≪ G(μ) with i ∈ {0, 1}, where by “a ≪ b” we mean that a is at least one order of magnitude smaller than b³. We define the abstract kernel function for the i-th support vector as
K˜p([x, x+p]) =
  [min(G(x), G(x + p)), G(μ)]                  if μ ∈ [x, x + p] ⊂ [x0, x1]
  [min(G(x), G(x + p)), max(G(x), G(x + p))]   if μ ∈/ [x, x + p] ⊂ [x0, x1]
  [0, G(x0)]                                   otherwise
(11)
According to eq. (11), p controls to what extent K˜p over-approximates K, since large values of p correspond to coarse-grained abstractions and small values of p to fine-grained ones. A pictorial representation of eq. (11) is given in Figure 1. (³ Given a Gaussian distribution, 99.7% of its samples lie within ±3σ. In this case, however, our RBF formulation does not include any normalization factor, and stopping at ±3σ would not be sufficient to include all relevant samples. The heuristic described in the text was therefore adopted to solve the problem.) The abstract SVR ς˜p is defined by rewriting equation (7) as</p>
        <p>ς˜p = Σ_{i=1}^{l} (αi − αi∗) K˜p(xi, x) + b    (12)</p>
        <p>where we abuse notation and use conventional arithmetic symbols to denote their interval arithmetic counterparts.</p>
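Eq. (11) can be transcribed almost literally. The sketch below assumes a one-dimensional input and the unnormalized Gaussian G of the text, with the cut-off points x0, x1 supplied by the caller; names are illustrative.

```python
import math

def abstract_rbf(x, p, mu, gamma, x0, x1):
    """Interval abstraction K~_p of the unnormalized RBF kernel following
    eq. (11), for a one-dimensional input. G is the Gaussian centred at the
    support vector mu; [x0, x1] is the region outside which G is at least
    an order of magnitude below G(mu). Returns a (lower, upper) interval
    enclosing G over [x, x + p]."""
    G = lambda t: math.exp(-gamma * (t - mu) ** 2)
    lo, hi = x, x + p
    if x0 <= lo and hi <= x1:              # [x, x+p] inside [x0, x1]
        if lo <= mu <= hi:                 # peak of G inside the interval
            return (min(G(lo), G(hi)), G(mu))
        return (min(G(lo), G(hi)), max(G(lo), G(hi)))
    return (0.0, G(x0))                    # tail region: coarse but sound
```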
        <p>
          It is easy to see that every abstract SVR ς˜p can be modeled in QF LRA. Let us consider a concrete SVR ς : [0, 1] → (0, 1), i.e., a SVR trained on T = {(x1, y1) . . . (xl, yl)} with xi ∈ [0, 1] ⊂ R and yi ∈ (0, 1) ⊂ R. The abstract input domain is defined by the constraints xi ≥ 0 and xi ≤ 1. The coefficients and the intercept seen in equation (12) carry over to ς˜p and are defined by constraints of the form αi − αi∗ = hαi − αi∗i and b = hbi, where the notation h·i represents the actual values obtained by training ς. To complete the encoding, we must provide constraints that correspond to (11) for all the support
vectors of ς. Assuming p = 0.5, the encoding for the i-th kernel writes</p>
        <p>if (x ≤ x0) then (Ki ≥ 0) (Ki ≤ G(x0))
if (x1 &lt; x) then (Ki ≥ 0) (Ki ≤ G(x0))
. . .
if (0 &lt; x)(x ≤ 0.5) then (Ki ≥ min(G(0), G(0.5))) (Ki ≤ max(G(0), G(0.5)))</p>
        <p>The expression “if t1 . . . tm then a1 . . . an” is an abbreviation for the set of constraints
(¬t1 . . . ¬tm a1), . . . , (¬t1 . . . ¬tm an)
where t1, . . . , tm and a1, . . . , an are atoms expressing bounds on variables, and ¬ is Boolean negation (e.g., ¬(x &lt; y) is (x ≥ y)). Finally, the output of the SVR is just
f (x) = hα1 − α1∗iK1(x1, x) + . . . + hαn − αn∗iKn(xn, x) + hbi    (16)
with x1, . . . , xn being the support vectors of ς.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Problems in safety related learning</title>
        <p>
          As mentioned previously, verification of MLPs and SVRs amounts to answer a
logical query. In this subsection, we summarize the three different verification problems –
namely, global safety, local safety, and stability – as introduced in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], as well as the
different ways of defining preconditionsπ and postconditions ϕ.
Global safety This property was considered in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] for the first time, and requires that a
trained model ν : I → O with n inputs and m outputs fulfills the condition
∀x ∈ I, ∀k ∈ {1, . . . , m} : νk(x) ∈ [lk, hk]
(17)
where lk, hk ∈ R are constants chosen to define an interval wherein the k-th component of ν(x) is to range – we call them safety thresholds after [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. It is easy to see how
(17) can be expressed using the notation of (8) by defining
– π(x) as a predicate which is true exactly when x ∈ I, and
– ϕ(y) as a predicate which is true exactly when y ∈ B where B = [l1, h1] × . . . ×
[lm, hm].
        </p>
        <p>We say that an MLP/SVR ν is globally safe if condition (8) holds when π and ϕ are
defined as above. This notion of (global) safety is representative of all the cases in
which an out-of-range response is unacceptable, such as, e.g., minimum and maximum
reach of an industrial manipulator, lowest and highest percentage of a component in
a mixture, and minimum and maximum dose of a drug that can be administered to a
patient.</p>
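As a sanity check complementing (but never replacing) the SMT-based certification, condition (17) can be tested empirically on sampled inputs; sampling can only falsify global safety, not establish it. The helper name and argument layout below are hypothetical.

```python
def globally_safe_on(samples, nu, bounds):
    """Empirically test condition (17) on a finite set of sampled inputs:
    every component nu_k(x) must stay within its safety thresholds
    [l_k, h_k]. bounds is a list of (l_k, h_k) pairs, one per output.
    Establishing the guarantee itself requires the SMT check of (8)."""
    return all(l <= y <= h
               for x in samples
               for y, (l, h) in zip(nu(x), bounds))
```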
        <p>Local safety The need to check for global safety in MLPs can be mitigated using
bounded activation functions in the output neurons to “squash” their response within
an acceptable range, modulo rescaling. In principle, the same trick could be applied to
SVRs as well, even if it is much less commonly adopted. A more stringent, yet
necessary, requirement is thus represented by “local” safety. Informally speaking, we can say
that an MLP/SVR ν trained on a dataset R of t patterns is “locally safe” whenever given
an input pattern x∗ 6= x for all (x, y) ∈ R it turns out that ν(x∗) is “close” to yj as long
as x∗ is “close” to xj and (xj , yj ) ∈ R for some j ∈ {1, . . . , t}. Local safety cannot be
guaranteed by design, because the range of acceptable values varies from point to point.
Moreover, it is relevant in all those applications where the error of an MLP/SVR must
never exceeds a given bound on yet-to-be-seen inputs, and where the response of an
MLP must be stable with respect to small perturbations in its input in the neighborhood
of “known” patterns.</p>
        <p>Stability While local safety allows to analyze the response of a network even in the
presence of bounded-output MLPs/SVRs, it suffers from two important limitations.
– It does not take into account the whole input domain, i.e., if a pattern falls outside the union of the input polytopes, then nothing can be said about the quality of the
response.
– It is tied to a specific training set, i.e., changing the set of patterns used to train the
network can change the response of the safety check.</p>
        <p>These two limitations are important in practice, in particular when the availability of
training data is scarce so the chance of hitting “unseen” patterns is high, or when the
training set is very sparse so the chance of having large regions of uncovered input
space is high. One way to overcome these limitations is to keep the locality of the
check, i.e., small input variations must correspond to small output variations, but drop</p>
        <p>[Table 1: stability results by encoding family – ICUB LOGI, ICUB RAMP, ICUB TANH, JAMES LOGI, JAMES RAMP, JAMES TANH – with number of encodings (N), encodings solved (#), and overall totals.]</p>
        <p>
the dependency from the training set, i.e., check the whole input space. Formally, given a trained model ν : I → O we define stability as follows:
∀x1, x2 ∈ I : ‖x1 − x2‖ ≤ δ → ‖ν(x1) − ν(x2)‖ ≤ ε    (18)
where δ, ε ∈ R+ are two arbitrary constants which define the amount of variation that we allow in the input and the corresponding perturbation of the output. We observe that condition (18) requires non-linear arithmetic to express the scalar product of the difference vectors. In other words, once x1 is fixed, x2 must be contained in a hypersphere of radius δ centered in x1 – and similarly for ν(x1), ν(x2) and ε. An overapproximation of such a hypersphere, and thus a consistent abstraction of (18), is any convex polyhedron in which the hypersphere is contained. The degree of abstraction can be controlled by setting the number of facets of the polyhedron, and such an abstract condition can be expressed in QF LRA.</p>
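For the planar case, the polyhedral overapproximation of a δ-hypersphere can be sketched as follows: each facet is a half-space tangent to the ball, so the resulting polygon contains the ball and the abstraction is sound, while the facet count trades precision for encoding size. Names are illustrative, not the paper's construction.

```python
import math

def ball_facets(center, radius, n_facets):
    """Overapproximate the planar ball ||x - center|| <= radius with a
    regular polygon of n_facets half-spaces a.x <= b. Each half-space is
    tangent to the ball, so their intersection contains it; more facets
    give a tighter (but larger) linear-arithmetic encoding."""
    cx, cy = center
    facets = []
    for k in range(n_facets):
        theta = 2.0 * math.pi * k / n_facets
        ax, ay = math.cos(theta), math.sin(theta)   # unit outward normal
        facets.append(((ax, ay), ax * cx + ay * cy + radius))
    return facets

def in_polytope(point, facets):
    """Membership test for the polytope {x : a.x <= b for all facets}."""
    return all(a[0] * point[0] + a[1] * point[1] <= b + 1e-9
               for a, b in facets)
```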
        <p>
          As reported in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] stability turns out to be the hardest condition to check for SMT
solvers, so in the following we will focus on this one.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>State of the art</title>
      <sec id="sec-4-1">
        <title>Experiments with batch learning and MLPs</title>
        <p>
          Considering the experimental results reported in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], stability turned out to be the
hardest condition to check for SMT solvers, and in Table 1 we report a summary of that
experimental analysis. We consider both James and iCub case studies in the case of
MLPs with different activation functions, namely logistic (logi), bounded ramp (ramp),
and hyperbolic tangent (tanh) functions. In order to avoid confining the results to a
single SMT solver, the number of encodings solved and the cumulative time taken for each family are computed considering the “SOTA solver”, i.e., the ideal SMT solver that always achieves the best time among all the solvers considered. An encoding is thus solved if at least one of the solvers solves it, and the time taken is the best among the CPU times of the solvers that solved the encoding. The solvers involved in the experimentation
were the participants to the QF LRA track of the SMT Competition 2011 [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], namely
CVC [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] (version 4 1.0rc.2026), MATHSAT [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] (version 5), SMTINTERPOL [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]
(version 2.0pre), YICES [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] (v. 1.0.29, with smtlib2 parser), and Z3 [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] (v. 3.0). In Table 1,
the encodings are classified according to their hardness with the following criteria: easy
encodings are those solved by all the solvers, medium encodings are those non-easy encodings that could still be solved by at least two solvers, medium-hard encodings are
those solved by one reasoner only, and hard encodings are those that remained unsolved.
Looking at the results, we observe that 2202 encodings out of 2952 were solved. With
“only” 75% of the encodings solved, we confirm that this suite contains the most
difficult formulas of the whole test set. In particular, 615 encodings are easy, 1418 are medium,
and 169 are medium-hard; the remaining 750 are hard. For more details about the results, we refer the reader
to [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
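        <p>The “SOTA solver” aggregation and the hardness classification described above can be sketched as follows; the per-solver CPU times are illustrative placeholders, not data from the experiments, with None marking an unsolved encoding.</p>

```python
# Virtual best ("SOTA") solver: an encoding counts as solved if at least one
# solver solves it, and its time is the best CPU time among those solvers.
# The results below are illustrative placeholders, not data from the paper.
results = {
    "enc1": {"cvc": 2.1, "mathsat": 0.9, "z3": 1.4},    # solved by all: easy
    "enc2": {"cvc": None, "mathsat": 7.5, "z3": None},  # one solver only: medium-hard
    "enc3": {"cvc": None, "mathsat": None, "z3": None}, # unsolved: hard
}

def sota(results):
    """Number of encodings solved and cumulative best CPU time."""
    solved, total_time = 0, 0.0
    for times in results.values():
        best = [t for t in times.values() if t is not None]
        if best:
            solved += 1
            total_time += min(best)
    return solved, total_time

def classify(times):
    """Hardness class of one encoding, from its per-solver times."""
    n_solved = sum(1 for t in times.values() if t is not None)
    if n_solved == 0:
        return "hard"
    if n_solved == 1:
        return "medium-hard"
    if n_solved == len(times):
        return "easy"
    return "medium"

print(sota(results))
print({enc: classify(times) for enc, times in results.items()})
```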
      </sec>
      <sec id="sec-4-2">
        <title>Experiments with active learning and SVRs</title>
        <p>
<bold>Experimental Setup.</bold> A pictorial representation of the case study we consider is shown
in Figure 2. In this domain setup, a ball has to be thrown in such a way that it goes
through a goal, possibly avoiding obstacles present in the environment. The learning
agent interacting with the domain is allowed to control the force f = (fx, fy ) applied
to the ball when it is thrown. Given that the coordinate origin is placed at the
center of the wall opposing the goal — see Figure 2 — we chose fx ∈ [−20, 20] N and
fy ∈ [10, 20] N, respectively. The range for fx is chosen in such a way that the whole
field, including side walls, can be targeted, and the range for fy is chosen so that the ball
can reach the goal, overcoming the pavement’s friction, both in straight and kick shots. The
collision between the side walls and the ball is completely elastic. In principle, the
obstacle can be placed anywhere in the field, but we consider only three configurations,
namely (a) no obstacle is present, (b) the obstacle occludes straight line trajectories
— the case depicted in Figure 2 — and (c) the obstacle partially occludes the goal, so
that not all straight shots may go through it. The domain of Figure 2 is implemented
using the V-REP simulator [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Concrete SVRs are trained using the Python library
SCIKIT-LEARN, which in turn is based on LIBSVM [29]. Verification of abstract SVRs
is carried out using MATHSAT [30]. The Python library GPy [31] is used to train Gaussian
process regression models.
        </p>
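        <p>Since collisions with the side walls are completely elastic, the horizontal coordinate of a straight-line trajectory can be folded back into the walled field by mirror reflection. A minimal sketch, where half_width is a hypothetical half-width of the field, not a value from the paper:</p>

```python
def reflect(x, half_width):
    """Fold an unconstrained horizontal coordinate into [-half_width, half_width],
    assuming perfectly elastic side-wall bounces (mirror reflection).
    half_width is a hypothetical parameter, not a value from the paper."""
    period = 4.0 * half_width      # full cycle: to right wall, back, to left wall, back
    x = (x + half_width) % period  # shift so the walls sit at 0 and 2*half_width
    if x > 2.0 * half_width:       # second half of the cycle travels backwards
        x = period - x
    return x - half_width          # shift back to wall-centred coordinates

print(reflect(1.5, 1.0))  # one bounce off the right wall: 0.5
```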
<p><bold>Results.</bold> We train SVRs on a dataset of n samples built as T = {(f1, y1), . . . , (fn, yn)},
where fi = (fxi, fyi) is the force vector applied to the ball to throw it, and yi is the
observed distance from the center of the goal, which is also the target of the regression.
Obviously, the regression problem could be extended, e.g., by including the initial
position of the ball, but our focus here is on getting the simplest example that could lead to
evaluation of different active learning paradigms and verification thereof. Preliminary
tests were performed in an obstacle-free environment, using specific subsets of all the
feasible trajectories. Three algorithms were compared, namely a random strategy for
adding new samples to the learning set, the query by committee algorithm and the one
proposed by Demir and Bruzzone [32] — listed in Table 2 as “Random”, “QBC” and
“DB”, respectively. Each test episode consists of adding samples to the model, and it
ends when either the Root Mean Squared Error (RMSE) is smaller than a threshold
(here 0.2), or when at most 100 additional samples are asked for. Table 2 suggests that
when the learning problem involves one single class of shots (e.g., only straight shots),
the algorithms provide better results. Indeed, when both straight and kick shots are
possible, all three algorithms require more than 100 additional samples in at least 50% of
the runs in order to reach the required RMSE. We conducted further experiments, this
time considering the three cases (a), (b), and (c) described before, and the full set of
feasible trajectories in each case. The results obtained, shown in Table 2 (right),
confirm the observation made on Table 2 (left). The best results are indeed obtained for
case (b), wherein the obstacle is placed in such a way that only kick shots are feasible,
thus turning the learning problem into a single-concept one.</p>
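        <p>The episode structure described above — add samples until the RMSE drops below 0.2 or 100 additional samples have been requested — can be sketched as follows. For self-containment, this sketch uses a closed-form least-squares fit and a random query strategy in place of the SVR and of QBC/DB, and a purely illustrative linear stand-in for the simulator; none of these stand-ins come from the paper.</p>

```python
import random

random.seed(0)

def simulate_throw(fx, fy):
    # Illustrative stand-in for the V-REP simulator: a noiseless linear
    # response, NOT the actual ball dynamics of the case study.
    return 0.3 * fx - 0.1 * fy

def random_query():
    # "Random" query strategy: sample a force within the feasible ranges.
    return random.uniform(-20.0, 20.0), random.uniform(10.0, 20.0)

def labelled(n):
    # Query n forces and label them with the simulated outcome.
    out = []
    for _ in range(n):
        fx, fy = random_query()
        out.append(((fx, fy), simulate_throw(fx, fy)))
    return out

def fit_linear(samples):
    # Least-squares fit of y = a*fx + b*fy via the 2x2 normal equations,
    # a stand-in for the SVR trained with SCIKIT-LEARN in the paper.
    sxx = sum(fx * fx for (fx, fy), y in samples)
    syy = sum(fy * fy for (fx, fy), y in samples)
    sxy = sum(fx * fy for (fx, fy), y in samples)
    sxz = sum(fx * y for (fx, fy), y in samples)
    syz = sum(fy * y for (fx, fy), y in samples)
    det = sxx * syy - sxy * sxy
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    return lambda fx, fy: a * fx + b * fy

def rmse(model, probes):
    err = sum((model(fx, fy) - y) ** 2 for (fx, fy), y in probes)
    return (err / len(probes)) ** 0.5

train, probes = labelled(5), labelled(50)  # initial set and held-out probe set
n_added = 0
model = fit_linear(train)
# Episode: stop when RMSE drops below 0.2 or 100 extra samples were requested.
while rmse(model, probes) >= 0.2 and n_added != 100:
    train.append(labelled(1)[0])
    n_added += 1
    model = fit_linear(train)

# With this idealized noiseless linear simulator the fit is exact,
# so no additional samples are needed.
print(n_added, round(rmse(model, probes), 3))
```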
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Open questions</title>
<p>In this work we have summarized the state of the art on learning with safety
requirements. The promise of this research is to make data-driven models fit enough
to be deployed in safety-related contexts, e.g., adaptive agents operating in physical
domains. However, several questions should be answered before the techniques surveyed
in this paper can be applied in practice with confidence in their
effectiveness. Currently, we consider the following issues open:
– SMT solvers could deal with the majority of encodings in a reasonable amount of
time; still, there might be room for improvement, and scalability with respect to the
size of the learning problems must be assessed.
– Safety requirements have been considered for batch learning with MLPs and active
learning with SVRs; it would be worth investigating whether active learning with
MLPs and batch learning with SVRs can give improved results in terms of learning
and verification.
– There are learning techniques which are both very effective and natively provide
(statistical) bounds on their prediction; Gaussian process regression (GPR) [33] is
one such technique, with a fairly considerable record of success stories; it would
be interesting to compare the level of confidence in the prediction given by GPR
on one side, and the strict guarantees provided by MLPs/SVRs checked with SMT
solvers on the other.
– Applications in which learning techniques ought to provide safety requirements are
diverse, and include learning through reinforcement with continuous rather than
discrete state-action-space representations — see [34] for a discussion. One
important issue is thus the feasibility of techniques that we surveyed in such fields,
wherein learning is part of a larger system devoted to program adaptive agents.
Our future work will focus on addressing the above open issues and on assessing whether the
proposed methodologies can scale to real-world problems, possibly posing the verification of
learned models as a challenge problem for the SMT research community.
</p>
      <p>29. Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM Transactions
on Intelligent Systems and Technology (TIST) 2(3) (2011) 27
30. Cimatti, A., Griggio, A., Schaafsma, B., Sebastiani, R.: The MathSAT5 SMT Solver. In
Piterman, N., Smolka, S., eds.: Proceedings of TACAS. Volume 7795 of LNCS, Springer
(2013)
31. GPy: A Gaussian process framework in Python. http://github.com/SheffieldML/GPy (since 2012)
32. Demir, B., Bruzzone, L.: A multiple criteria active learning method for support vector
regression. Pattern Recognition 47(7) (2014) 2558–2567
33. Williams, C.K.I., Rasmussen, C.E.: Gaussian processes for regression. In: Advances in
Neural Information Processing Systems 8, NIPS, Denver, CO, November 27-30, 1995. (1995)
514–520
34. Pathak, S., Pulina, L., Metta, G., Tacchella, A.: How to abstract intelligence? (If verification
is in order). In: Proc. 2013 AAAI Fall Symp. How Should Intelligence Be Abstracted in AI
Research. (2013)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Argall</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chernova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veloso</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Browning</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A survey of robot learning from demonstration</article-title>
          .
          <source>Robotics and autonomous systems 57(5)</source>
          (
          <year>2009</year>
          )
          <fpage>469</fpage>
          -
          <lpage>483</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Nguyen-Tuong</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
          </string-name>
          , J.:
          <article-title>Model learning for robot control: a survey</article-title>
          .
          <source>Cognitive processing 12</source>
          (
          <issue>4</issue>
          ) (
          <year>2011</year>
          )
          <fpage>319</fpage>
          -
          <lpage>340</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kober</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
          </string-name>
          , J.:
          <article-title>Reinforcement learning in robotics: A survey</article-title>
          .
          <source>In: Reinforcement Learning</source>
          . Springer (
          <year>2012</year>
          )
          <fpage>579</fpage>
          -
          <lpage>610</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Functional Safety - A Straightforward Guide to Applying IEC 61508 and Related Standards (2nd Edition)</article-title>
          .
          <source>Elsevier</source>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kurd</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Austin</surname>
          </string-name>
          , J.:
          <article-title>Developing artificial neural networks for safety critical systems</article-title>
          .
          <source>Neural Computing &amp; Applications</source>
          <volume>16</volume>
          (
          <issue>1</issue>
          ) (
          <year>2007</year>
          )
          <fpage>11</fpage>
          -
          <lpage>19</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pulina</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tacchella</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>An Abstraction-Refinement Approach to Verification of Artificial Neural Networks</article-title>
          .
          <source>In: 22nd International Conference on Computer Aided Verification (CAV 2010). Volume 6174 of Lecture Notes in Computer Science</source>
          , Springer (
          <year>2010</year>
          )
          <fpage>243</fpage>
          -
          <lpage>257</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Pulina</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tacchella</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Challenging smt solvers to verify neural networks</article-title>
          .
          <source>AI Communications</source>
          <volume>25</volume>
          (
          <issue>2</issue>
          ) (
          <year>2012</year>
          )
          <fpage>117</fpage>
          -
          <lpage>135</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Leofante</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tacchella</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Learning in Physical Domains: Mating Safety Requirements and Costly Sampling</article-title>
          .
          <source>In: AI*IA 2016, Advances in Artificial Intelligence - 15th International Conference of the Italian Association for Artificial Intelligence</source>
          , Genova, Italy,
          <source>November 29 - December 1</source>
          ,
          <year>2016</year>
          , Proceedings. (
          <year>2016</year>
          ) To appear.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Settles</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Active learning literature survey</article-title>
          . University of Wisconsin,
          <source>Madison</source>
          <volume>52</volume>
          (
          <fpage>55</fpage>
          -
          <lpage>66</lpage>
          ) (
          <year>2010</year>
          )
          <fpage>11</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Smola</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          , Schölkopf, B.:
          <article-title>A tutorial on support vector regression</article-title>
          .
          <source>Statistics and computing 14(3)</source>
          (
          <year>2004</year>
          )
          <fpage>199</fpage>
          -
          <lpage>222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Barrett</surname>
            ,
            <given-names>C.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sebastiani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seshia</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tinelli</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Satisfiability modulo theories</article-title>
          .
          <source>Handbook of Satisfiability 185</source>
          (
          <year>2009</year>
          )
          <fpage>825</fpage>
          -
          <lpage>885</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Jamone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Metta</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nori</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sandini</surname>
          </string-name>
          , G.:
          <article-title>James: A humanoid robot acting over an unstructured world</article-title>
          .
          <source>In: Humanoid Robots</source>
          ,
          <year>2006</year>
          6th IEEE-RAS International Conference on,
          <source>IEEE</source>
          (
          <year>2007</year>
          )
          <fpage>143</fpage>
          -
          <lpage>150</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Metta</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Natale</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nori</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sandini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vernon</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fadiga</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Von Hofsten</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosander</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Santos-Victor</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.:
          <article-title>The icub humanoid robot: An open-systems platform for research in cognitive development</article-title>
          .
          <source>Neural Networks</source>
          <volume>23</volume>
          (
          <issue>8</issue>
          ) (
          <year>2010</year>
          )
          <fpage>1125</fpage>
          -
          <lpage>1134</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Fumagalli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gijsberts</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivaldi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jamone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Metta</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Natale</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nori</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sandini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Learning to Exploit Proximal Force Sensing: a Comparison Approach. From Motor Learning to Interaction Learning in Robots (</article-title>
          <year>2010</year>
          )
          <fpage>149</fpage>
          -
          <lpage>167</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rohmer</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Freese</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>V-rep: A versatile and scalable robot simulation framework</article-title>
          .
          <source>In: Intelligent Robots and Systems (IROS)</source>
          ,
          <year>2013</year>
          IEEE/RSJ International Conference on,
          <source>IEEE</source>
          (
          <year>2013</year>
          )
          <fpage>1321</fpage>
          -
          <lpage>1326</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Hornik</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Approximation capabilities of multilayer feedforward networks</article-title>
          .
          <source>Neural Networks</source>
          <volume>4</volume>
          (
          <issue>2</issue>
          ) (
          <year>1991</year>
          )
          <fpage>251</fpage>
          -
          <lpage>257</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Boser</surname>
            ,
            <given-names>B.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guyon</surname>
            ,
            <given-names>I.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.N.:</given-names>
          </string-name>
          <article-title>A training algorithm for optimal margin classifiers</article-title>
          .
          <source>In: Proceedings of the fifth annual workshop on Computational learning theory, ACM</source>
          (
          <year>1992</year>
          )
          <fpage>144</fpage>
          -
          <lpage>152</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Guyon</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boser</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Automatic capacity tuning of very large vc-dimension classifiers</article-title>
          .
          <source>Advances in neural information processing systems</source>
          (
          <year>1993</year>
          )
          <fpage>147</fpage>
          -
          <lpage>147</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Cortes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Support-vector networks</article-title>
          .
          <source>Machine learning 20(3)</source>
          (
          <year>1995</year>
          )
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.N.:</given-names>
          </string-name>
          <article-title>The Nature of Statistical Learning Theory</article-title>
          . Springer-Verlag New York, Inc., New York, NY, USA (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Mackworth</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          :
          <article-title>Consistency in networks of relations</article-title>
          .
          <source>Artificial Intelligence 8(1)</source>
          (
          <year>1977</year>
          )
          <fpage>99</fpage>
          -
          <lpage>118</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Fränzle</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herde</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teige</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ratschan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schubert</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Efficient solving of large nonlinear arithmetic constraint systems with complex boolean structure</article-title>
          .
          <source>Journal on Satisfiability, Boolean Modeling and Computation</source>
          <volume>1</volume>
          (
          <year>2007</year>
          )
          <fpage>209</fpage>
          -
          <lpage>236</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Bruttomesso</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deters</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Griggio</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The 2011 smt competition (</article-title>
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Barrett</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tinelli</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          : Cvc3. In: International Conference on Computer Aided Verification, Springer (
          <year>2007</year>
          )
          <fpage>298</fpage>
          -
          <lpage>302</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Cimatti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Griggio</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaafsma</surname>
            ,
            <given-names>B.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sebastiani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>The mathsat5 smt solver</article-title>
          .
          <source>In: Tools and Algorithms for the Construction and Analysis of Systems</source>
          . Springer (
          <year>2013</year>
          )
          <fpage>93</fpage>
          -
          <lpage>107</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Christ</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoenicke</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nutz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Smtinterpol: An interpolating smt solver</article-title>
          .
          <source>In: International SPIN Workshop on Model Checking of Software</source>
          , Springer (
          <year>2012</year>
          )
          <fpage>248</fpage>
          -
          <lpage>254</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Dutertre</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Moura</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A fast linear-arithmetic solver for dpll (t)</article-title>
          . In: International Conference on Computer Aided Verification, Springer (
          <year>2006</year>
          )
          <fpage>81</fpage>
          -
          <lpage>94</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>De Moura</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bjørner</surname>
          </string-name>
          , N.:
          <article-title>Z3: An efficient smt solver</article-title>
          .
          <source>In: International conference on Tools and Algorithms for the Construction and Analysis of Systems</source>
          , Springer (
          <year>2008</year>
          )
          <fpage>337</fpage>
          -
          <lpage>340</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>