<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Glass Box Approach: Verifying Contextual Adherence to Values</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Aler Tubella</string-name>
        </contrib>
        <aff>Umeå University</aff>
      </contrib-group>
      <abstract>
        <p>Artificial Intelligence (AI) applications are being used to predict and assess behaviour in multiple domains, such as criminal justice and consumer finance, which directly affect human well-being. However, if AI is to be deployed safely, then people need to understand how the system is interpreting the relevant moral values and whether it is adhering to them. Even though transparency is often seen as the requirement in this case, realistically it might not always be possible or desirable, whereas the need to ensure that the system operates within set moral bounds remains. In this paper, we present an approach to evaluate the moral bounds of an AI system based on the monitoring of its inputs and outputs. We place a ‘Glass Box’ around the system by mapping moral values into contextual verifiable norms that constrain inputs and outputs, in such a way that if these remain within the box we can guarantee that the system adheres to the value(s) in a specific context. The focus on inputs and outputs allows for the verification and comparison of vastly different intelligent systems, from deep neural networks to agent-based systems, whereas by making the context explicit we expose the different perspectives and frameworks that are taken into account when subsuming moral values into specific norms and functionalities. We present a modal logic formalisation of the Glass Box approach which is domain-agnostic, implementable, and expandable.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Artificial Intelligence (AI) has the potential to greatly
improve our autonomy and wellbeing, but to be able to
interact with it effectively and safely, we need to be able to trust
it. Trust in AI is often linked to
algorithmic transparency [Theodorou et al., 2017]. This concept
includes more than just ensuring algorithm visibility: the
different factors that influence the decisions made by algorithms
should be visible to the people who use, regulate, and are
impacted by systems that employ those algorithms [Lepri et al.,
2018]. However, decisions made by predictive algorithms can
be opaque because of many factors, for instance intellectual
property protection, and it may not always be possible or desirable to
eliminate this opacity [Ananny and Crawford, 2018]. Yet, accidents, misuse,
disuse, and malicious use are all bound to happen. Since
human decisions can also be quite opaque, as are the decisions
made by corporations and organisations, mechanisms such as
audits, contracts, and monitoring are in place to regulate and
ensure attribution of accountability. In this paper, we propose
a similar approach to monitor and verify artificial systems.</p>
      <p>On the other hand, the current emphasis on the delivery
of high-level statements on AI ethics may also bring with it
the risk of implicitly setting the ‘moral background’ for
conversation about ethics and technology as being about abstract
principles [Greene et al., 2019]. The high-level values and
principles are dependent on the socio-cultural context [Turiel,
2002]; they are often only implicit in deliberation processes.
The shift from abstract to concrete therefore necessarily
involves careful consideration of the context. In this sense,
the subsumption of each value into functionalities will vary
from context to context the same way it can vary from
system to system. For example, consider the value fairness: it
can have different normative interpretations, e.g. equal
access to resources or equal opportunities, which can lead to
different actions. This decision may be informed by domain
requirements and regulations, e.g. national law. Often, these
choices made by the designer of the system and the contexts
considered are hidden from the end-user, as well as from future
developers and auditors: our aim is to make them explicit.</p>
      <p>This paper presents the Glass Box approach [Aler Tubella
et al., 2019] to evaluating and verifying the contextual
adherence of an intelligent system to moral values. We place
a ‘Glass Box’ around the system by mapping abstract values
into explicit verifiable norms that constrain inputs and
outputs, in such a way that if these remain within the box we
can guarantee that the system adheres to the value in a certain
context. The focus on inputs and outputs allows for the
verification and comparison of vastly different intelligent systems;
from deep neural networks to agent-based systems.
Furthermore, we make context explicit, exposing the different
perspectives and frameworks that are taken into account when
subsuming moral values into specific norms and
functionalities. We present a modal logic formalisation of the Glass
Box approach which is domain-agnostic, implementable, and
expandable.
</p>
    </sec>
    <sec id="sec-2">
      <title>The Glass Box approach</title>
      <p>The Glass Box approach [Aler Tubella et al., 2019], as
depicted in Figure 1, consists of two phases which inform each
other: interpretation and observation. It takes into account
the contextual interpretations of abstract principles by taking
a Design for Values perspective [Van de Poel, 2013].</p>
      <p>The interpretation stage is the explicit and structured
process of translating values into specific design requirements. It
entails a translation from abstract values into concrete norms
comprehensive enough so that fulfilling the norm will be
considered as adhering to the value. Following a Design for
Values approach, the shift from abstract to concrete
necessarily involves careful consideration of the context. For each
context we build an abstract-to-concrete hierarchy of norms
where the highest level is made-up of values and the lowest
level is composed of fine-grained concrete requirements for
the intelligent system only related to its inputs and outputs.
The intermediate levels are composed of progressively more
abstract norms, and the connections between nodes on each
level are contextual. When building an intelligent system,
each requirement is distilled into functionalities implemented
into the system in order to fulfill it. At the end of the
interpretation stage we therefore have an explicit contextual
hierarchy which can be used to provide high-level transparency
for a deployed system: depending on which requirements are
being fulfilled, we can provide explanations for how and
exactly in which context the system adheres to a value. Note
that the interpretation stage is also useful for the evaluation
of a system, as it provides grounding and justification for
system requirements, in terms of the norms and values of which they
are an interpretation. That is, it indicates a ‘for-the-sake-of’
relation between requirements and values.</p>
      <p>The low-level requirements inform the observation stage
of our approach, as they indicate what must be verified and
checked. In the observation stage, the behaviour of the
system is evaluated with respect to each value by studying its
compliance with the requirements identified in the previous
stage. In [Vázquez-Salceda et al., 2007] two properties for
norms to be enforceable are identified: (1) verifiability, i.e.
the low-level norms must allow for being machine-verified
given the time and resources needed, and (2) computational
tractability, i.e. whether the functionalities comply with the
norms can be checked at any moment in a fast, low-cost way.
Note that this is a requirement for the observation stage and
not necessarily for the design stage: some of the norms
chosen for the design stage might be easily implementable, but
hard to monitor. In the observation stage, to each requirement
identified in the interpretation stage, we assign one or
several tests to verify whether it is being fulfilled. Testing may
range through a variety of techniques, from simply checking
whether input/output verify a particular relationship, to
complex methods such as statistical testing, formal verification or
model-checking. These must be performed without
knowledge about the internal workings of the system under
observation, by monitoring input and output streams only. We
insist on this feature as we do not always have access to the
internals of the system, nor do we always have access to
its design.</p>
      <p>Designing the tests is naturally one of the most complex
steps of this process: the main challenge is the computational
tractability of these checks and their correspondence with the
low-level norms and their implementation. Different levels of
granularity in the norms pose different constraints for testing:
the cost of checking that the outcome for a certain input
remains within certain bounds is very different than having to
consider data of a whole database of inputs and outputs. Part
of the challenge is then determining the required granularity
of the Glass Box and testing: an overly coarse approximation may
rule out many potentially compliant behaviours, whereas
an overly specific one may limit the adaptability of the AI system.</p>
      <p>From the observation stage we give feedback to the
interpretation stage: the testing informs us on which
requirements are being fulfilled and which aren’t, which may prompt
changes in the implementation or in the chosen requirements.
The observation stage is therefore fundamental both at a
design stage to verify that the intelligent system is functioning
as desired, and after deployment to explicitly inform
stakeholders of how the system is interpreting the relevant moral
values and whether it is verifiably adhering to them, without having to
reveal its internal workings.</p>
    </sec>
    <sec id="sec-3">
      <title>Running example</title>
      <p>As an example, we will consider an intelligent system used
to filter CVs as a recruitment tool. Note that the ethical
values, norms and functionalities highlighted in what follows are
used purely as an example, and we do not claim that they
are the most appropriate to adhere to, but rather are used to
demonstrate the approach.</p>
      <p>As a starting point, the designers of the system must
identify the relevant ethical values that they wish to adhere to,
depending on the legal framework, the company policies, the
standards they are following, etc. They could, for example,
settle on fairness and privacy. The next step is to unravel what
these values mean in terms of recruitment decisions from
different perspectives.</p>
      <p>In the case of fairness, they could consider several angles.
In the context of the Swedish law, for instance, fairness in
recruitment means, amongst other things, non-discrimination
between male and female applicants. A design requirement
to guarantee fairness in this context can therefore be that the
ratio of acceptances vs rejections has to be the same for both
men and women (which can be calculated purely from the
inputs and outputs of the system). This requirement is then
taken into account for implementation: for example, it can
be decided –rather ineffectively [Reuters, 2018]– to exclude
gender from the inputs of the system. In the same way, each
legal requirement in terms of fairness will be translated into
specific requirements for the system.</p>
      <p>Another perspective for fairness can be provided by
company policy. It can for example be deemed that it is fair to
give preference to those applicants that are already working
for the company. In this case, the requirement for the
system would be that applicants from within the company are
prioritised. Functionality-wise, this can be translated into the
assignment of weights for each variable considered in the
implementation.</p>
      <p>[Figure 1: the Glass Box approach. Interpretation stage: values, norms and requirements; observation stage: monitoring of the input and output of the system.]</p>
      <p>In the same way, other perspectives (e.g. European law,
HR recruitment guidelines) and other values (e.g. privacy,
responsibility) will be taken into account and distilled into
requirements and functionalities, providing a contextual
hierarchy of values, norms, requirements and functionalities as a
result of the interpretation stage.</p>
      <p>At this point, we proceed to the observation stage where
testing procedures are devised for each of the
functionalities identified in the previous stage. To test whether the
ratio of acceptances vs rejections is the same for both men and
women, we can for example check periodically after every
100 decisions whether the two ratios are within 5% of each
other. To test whether applicants from within the company are
prioritised, we can take random samples of applicants from
outside and inside the company, and check that the
acceptance rate for the latter group is higher. With the results of
these tests in hand, we can reason about which values are
being verified (or not) in each context.
</p>
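      <p>As an illustration of how such checks could be automated, the sketch below (in Python, with hypothetical record fields such as gender, internal and accepted) monitors a stream of decisions and periodically evaluates the two tests above using only the inputs and outputs of the recruitment system; it is a minimal sketch of the observation stage, not a prescribed implementation.</p>
      <preformat>
import random

# Each record pairs the system's input (applicant features) with its output
# (accepted or not). Field names are illustrative assumptions.

def gender_ratio_test(records, tolerance=0.05):
    """Acceptance ratios for men and women must differ by at most `tolerance`."""
    def acc_rate(group):
        group = list(group)
        if not group:
            return None
        return sum(1 for r in group if r["accepted"]) / len(group)
    men = acc_rate(r for r in records if r["gender"] == "M")
    women = acc_rate(r for r in records if r["gender"] == "F")
    if men is None or women is None:
        return True  # not enough data yet to falsify the norm
    return abs(men - women) &lt;= tolerance

def internal_priority_test(records, sample_size=20):
    """On random samples, internal applicants must be accepted at a higher rate."""
    internal = [r for r in records if r["internal"]]
    external = [r for r in records if not r["internal"]]
    if len(internal) &lt; sample_size or len(external) &lt; sample_size:
        return True
    sample_in = random.sample(internal, sample_size)
    sample_out = random.sample(external, sample_size)
    acc = lambda s: sum(1 for r in s if r["accepted"]) / len(s)
    return acc(sample_in) &gt;= acc(sample_out)

def observe(decision_stream, window=100):
    """Run both tests after every `window` decisions, yielding pass/fail results."""
    buffer = []
    for record in decision_stream:
        buffer.append(record)
        if len(buffer) % window == 0:
            yield {
                "equal_gender_ratio": gender_ratio_test(buffer[-window:]),
                "internal_priority": internal_priority_test(buffer),
            }
      </preformat>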
    </sec>
    <sec id="sec-4">
      <title>Formalising the Glass Box</title>
      <p>Since AI applications exist in a huge variety of areas, the
formalisation we present is based on predicate logic: it is
domain-agnostic and can be adapted to any application.
Crucially, it is also implementable: the hierarchy of checks,
norms and values can be encoded in logical programming
languages, and the complexity of the system in terms of the
queries that we will pose to it is well within the reach of
current techniques.
</p>
      <sec id="sec-4-1">
        <title>Counts-as</title>
        <p>The interpretation stage entails a translation from abstract
values into concrete norms and requirements comprehensive
enough so that fulfilling the norm will be considered as
adhering to the value, with careful consideration of the context.
Normative systems are often described in deontic-based
languages, which allow for the representation of obligations,
permissions and prohibitions. With this approach, however, we
aim to not only describe the norms themselves, but also the
exact connection between abstract and concrete concepts in
each context.</p>
        <p>Several authors have proposed counts-as statements as a
means to formalise contextual subsumption relations
[Aldewereld et al., 2010]. With this relation, we can build logical
statements of the form: “X counts as Y in context c” [Searle,
1995; Jones and Sergot, 1995]. The semantics of counts-as is
often interpreted in a classificatory light [Grossi et al., 2005],
i.e. “A counts-as B in context c” is interpreted as “A is a
subconcept of B in context c”. Thus, counts-as statements
can be understood as expressing classifications that hold in a
certain context. At the same time, from a different
semantic viewpoint a counts-as operator can be used not only to
express classifications that happen to hold in a context, but
to represent the classifications that define the context itself.
Counts-as can also encode constitutive rules [Grossi et al.,
2008], that is, the rules specifying the ontology that defines
each context.</p>
        <p>To formally represent the hierarchy of functionalities,
requirements, norms and values resulting from the
interpretation stage of the Glass Box approach both outlooks are
necessary. On one hand, contexts are defined by the
connections between more concrete lower level concepts to abstract
values, precisely corresponding to the notion of constitutive
counts-as. On the other hand, once the contexts are
established, we aim to be able to reason about which
combinations of functionalities lead to the fulfillment of each norm
i.e. about the classifications holding in each context. Both
views of counts-as admit compatible representations in modal
logic as shown in [Grossi et al., 2008]: we will use the
formalism and semantics presented there, which we will briefly
introduce in this subsection.</p>
        <p>The logic we will consider is Cxt^{u,¬}. It is a
multimodal homogeneous K45 logic [Blackburn et al., 2007],
extended with a universal context, negations of contexts, and
nominals which denote the states in the semantics.
Definition 1. Language L^{u,¬}_n is given by: a finite set P of
propositional atoms p, an at most countable set N of
nominals denoted by s disjoint from P, and a finite non-empty set
K of n ≥ 2 atomic context indexes denoted by c, including a
distinguished index u representing the universal context. The
set C of context indexes is given by the elements c of K and
their negations ¬c, and its elements are denoted by i, j, ...
Further, the alphabet of L^{u,¬}_n contains the set of boolean
connectives {¬, ∧, ∨, →} and the operators [i] and ⟨i⟩ for i ∈ C.</p>
        <p>The set of well-formed formulae of L^{u,¬}_n is given by the
following BNF:
φ ::= ⊥ | p | s | ¬φ | φ₁ ∧ φ₂ | φ₁ ∨ φ₂ | φ₁ → φ₂ | [i]φ | ⟨i⟩φ .</p>
        <p>Formulae in which no modal operator occurs are called
objective.</p>
        <p>Logic Cxt^{u,¬} is axiomatized via axiom and rule schemata
in the style of [Grossi et al., 2008], extended to deal with nominals.
These include (P), all tautologies of propositional calculus, and
(K_i), the distribution axiom [i](φ₁ → φ₂) → ([i]φ₁ → [i]φ₂),
together with the K45 axioms for each modality, axioms relating
each atomic context and its negation to the universal context, and
the hybrid rules Name, COVERING and PACKING. Here i, j are
metavariables for the elements of C, c denotes
elements of the set of atomic context indexes K, u is the
universal context index, v ranges over nominals, and in rule
Name φ denotes a formula in which the nominal denoted by s
does not occur.</p>
        <p>Logics with nominals are called hybrid logics [Blackburn
et al., 2007]: they blur the lines between syntax and
semantics, allowing us to express possible states (semantics)
through formulae (syntax). In this application, the presence
of nominals allows for the definition of rules COVERING and
PACKING, fundamental to capture the concept of the
complement of a context. This becomes clearer when looking at the
semantics: logic Cxt^{u,¬} enjoys a possible-world semantics
in terms of a particular class of multiframes. In this type of
semantics, we represent the states that are possible in each
context, and consider an interpretation function I which
associates to each propositional atom the set of states which
make it true.</p>
        <p>Definition 2. A CXT^{u,¬}_n frame F is a structure
⟨W, {W_i}_{i∈C}⟩ where:
- there is a set K such that C = K ∪ {¬c | c ∈ K} ;
- W is a finite set of states (possible worlds) ;
- {W_i}_{i∈C} is a family of subsets of W such that: there
exists a distinguished u ∈ C with W_u = W (there is a
universal context), and such that for every atomic
context c ∈ K we have that W_{¬c} = W_u \ W_c .</p>
        <p>A model M for the language L^{u,¬}_n is a pair (F, I) where
F is a CXT^{u,¬}_n frame and I is a function I : P ∪ N → P(W)
such that:
- for all nominals s ∈ N, there is a state w such that
I(s) = {w} (the interpretation of a nominal is a single
state) ;
- for all states w ∈ W there is a nominal s ∈ N such that
I(s) = {w} (every state has a name) .</p>
        <p>Definition 3. Satisfaction for CXT^{u,¬}_n models is defined as
usual for the boolean connectives, with M, w ⊨ p iff w ∈ I(p),
M, w ⊨ s iff I(s) = {w}, and with the context modalities read
globally within each context: M, w ⊨ [i]φ iff M, w′ ⊨ φ for every
w′ ∈ W_i, and dually M, w ⊨ ⟨i⟩φ iff M, w′ ⊨ φ for some w′ ∈ W_i.</p>
        <p>For the full definitions and proofs, the interested reader is invited to refer
to [Grossi et al., 2008]. The intuitive reading of the semantics
is that W contains all the possible worlds (or states)
considered in the model. For each context, Wc contains the states
that are possible with the added restrictions of the context.
Then, the set of possible worlds in the universal context
coincides with all possible worlds, and the set of possible worlds
for the negation of a context is the complement of its set of
possible worlds.</p>
        <p>The specific requirements on nominals capture the idea that
each nominal is only satisfied in a single state, and that in
every state there is at least one nominal that is satisfied:
nominals can therefore simply be interpreted as names for each
state.</p>
        <p>Logic Cxt^{u,¬} provides us with the theoretical machinery
to be able to define both classificatory and constitutive
counts-as operators, which we will use to build and explore
hierarchies of norms and values.</p>
        <p>Definition 4. Let φ₁, φ₂ be objective formulae.</p>
        <p>The classificatory counts-as statement “φ₁ counts as φ₂
in context c” is formalised in Cxt^{u,¬} by
φ₁ ⇒^{cl}_c φ₂ := [c](φ₁ → φ₂) .</p>
        <p>Let Γ be a set of formulae, with φ₁ → φ₂ ∈ Γ. The
constitutive counts-as statement “φ₁ counts as φ₂ by constitution in
the context c defined by Γ” is formalised in Cxt^{u,¬} by
φ₁ ⇒^{co}_{c,Γ} φ₂, following the definition in [Grossi et al., 2008].</p>
        <p>Note that for constitutive counts-as statements, both the
name c of the context and the set Γ of formulae that define it have to
be specified. This corresponds to the notion that constitutive
statements are those statements that we take as a definition for
the context. If c is defined by Γ, an equivalent set of
formulae Γ′ defines the same context c, but the constitutive
statements that hold in (c, Γ′) are different from those that hold in
(c, Γ) and correspond to the formulae of Γ′. On the other hand,
the classificatory statements holding in a context remain the
same no matter which set of equivalent formulae we choose
as its definition, since they do not define the context but rather
correspond to inferences that hold in it.</p>
        <p>The end result of the interpretation stage is a collection of
contexts, each given by a hierarchy of functionalities and
norms fulfilling values. In this sense, contexts are defined by
the hierarchy that holds in them. For this reason, we will
formally define contexts through the sets of implications that
define them, which we can then implement via constitutive
counts-as statements.</p>
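        <p>As an illustration (using hypothetical atom and context names drawn from the running example of Section 3), the classificatory statement “satisfying the equal-acceptance-ratio requirement counts as non-discrimination in the context swe of Swedish law” would be written equal_ratio ⇒^{cl}_{swe} non_discrimination := [swe](equal_ratio → non_discrimination), whereas the same implication, taken as part of the set Γ_swe that defines the context, would be recorded as the constitutive statement equal_ratio ⇒^{co}_{swe,Γ_swe} non_discrimination.</p>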
        <p>Furthermore, our aim is for contexts to be hierarchies of
progressively more concrete terms. For this reason we need to
partition our language, given by the set of propositional atoms
we work with, into levels. Intuitively, given a hierarchy of
norms and values, we will assign a level to each propositional
atom it is composed of, corresponding to its position in the
hierarchy in terms of concreteness. In addition, this allows
for the use of different vocabulary for each level. Contexts are
then formed by defining how the propositional atoms of level
i are related to atoms representing more abstract concepts at
level i−1.</p>
        <p>Definition 5. Let P_I be a set of propositional atoms.
Given a subset S ⊆ P_I we denote by p^S the elements of
S and by φ^S the objective formulae built on the propositional
atoms of S, given by
φ^S ::= p^S | ¬φ^S | φ^S_1 ∧ φ^S_2 | φ^S_1 ∨ φ^S_2 | φ^S_1 → φ^S_2 .</p>
        <p>A hierarchy is a partition P = {P_0, ..., P_N} of P_I, i.e. a
collection of sets such that P_0 ⊔ ... ⊔ P_N = P_I .</p>
        <p>An interpretation context c in a hierarchy P is given by:
- a collection of subsets F^c_i ⊆ P_i, 0 ≤ i ≤ N ;
- a collection Γ_c of objective formulae of the form
φ^{F^c_{i+1}} → p^{F^c_i}, such that for every p^{F^c_i}, 0 ≤ i &lt; N, there is at least one
such formula in Γ_c.</p>
        <p>When referring to an interpretation context, we will often
abuse language and omit the family of subsets of the
partition included in its definition, as it is recoverable from Γ_c.</p>
        <p>Let P be a hierarchy on a set P_I. An interpretation box is
a finite collection K of interpretation contexts in P.</p>
        <p>With this characterisation, we represent the hierarchy of
concepts, from most abstract to most concrete, as a partition.
Elements of P_0 correspond to values and elements of P_N
correspond to functionalities. Each interpretation context c is
given by explicitly stating the relationships from more
concrete to more abstract concepts by specifying them in Γ_c.</p>
        <p>Note that at the interpretation stage the lowest level of the
hierarchy defining the context is given by functionalities and
not by the verification procedures. These are designed and
seamlessly incorporated into the Glass Box in the second stage
of the process, allowing for a modular approach.</p>
        <p>The observation stage consists of checking that the
lower-level norms devised at the interpretation stage are in fact
adhered to. Even if we restrict tests to constraints on the
inputs and outputs of a system, they can encode a number of
complex behaviours, from obliging the input or output to stay
within certain parameters, to imposing a certain relationship
between the input and output as a function of each other, to
comparing the inputs and outputs to other similar cases.
Furthermore, the tests need to be computationally checkable in
a reasonable time. Once devised, these tests will be
translated into propositional variables that will encode whether a
test has failed or has passed. The results of these tests will
be entered into the Glass Box by means of these variables,
which we can then use to reason about whether a value has
been verified in a certain context. In this stage we therefore
need to specify which tests are associated with each low-level
norm in each context, and how.</p>
        <p>Definition 6. Let P_O be a finite set of binary predicates. We
denote by p^{P_O} the elements of P_O and by φ^{P_O} the objective
formulae built on the propositional atoms of P_O and ∧ and ∨,
given by
φ^{P_O} ::= p^{P_O} | ¬φ^{P_O} | φ^{P_O}_1 ∧ φ^{P_O}_2 | φ^{P_O}_1 ∨ φ^{P_O}_2 .</p>
        <p>Let c be an interpretation context on a partition P of set P_I
given by a collection of subsets F^c_i ⊆ P_i, 0 ≤ i ≤ N, and a
set of objective formulae Γ_c.</p>
        <p>A testing context Θ_c for c is a collection of objective
formulae of the form
φ^{P_O} → p^{F^c_{N−1}},
such that for every p^{F^c_{N−1}} ∈ F^c_{N−1} there is at least one such
formula.</p>
        <p>An observation box on P_O associated to an interpretation
box {Γ_c | c ∈ K} is given by a set {Θ_c | c ∈ K} where each Θ_c is
a testing context for c.</p>
        <p>Notice that we don’t consider implication in the vocabulary
of tests, since we will operate with concrete test results that
return either “pass” or “fail”, and it theoretically makes no
semantic sense, given a specific outcome of the testing, to
reason in general about whether a certain test result implies
another test result.
</p>
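        <p>Continuing the same sketch, a testing context can be encoded as a further list of implications from (hypothetical) test-result atoms to atoms of the interpretation hierarchy, in the spirit of Definition 6; an observation box then simply collects one such testing context per interpretation context.</p>
        <preformat>
# Hypothetical test-result atoms: each is True when the corresponding check passes.
test_atoms = {"ratio_check_passed", "internal_sample_check_passed"}

# Theta_c: implications from (conjunctions of) test results to low-level atoms
# of the interpretation context; only conjunction and disjunction of test
# atoms are allowed, never implication between them.
theta_c = [
    (frozenset({"ratio_check_passed"}), "equal_ratio"),
    (frozenset({"internal_sample_check_passed"}), "internal_preference"),
]

# An observation box: one testing context per context of the interpretation box.
observation_box = {"swedish_law_recruitment": theta_c}
        </preformat>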
      </sec>
      <sec id="sec-4-2">
        <title>Reasoning inside the Glass Box</title>
        <p>Interpretation and observation boxes contain all the
implications that define each context. We can now use counts-as
to build a framework that will allow us to reason about the
statements that hold in each context. Given an
interpretation box and an associated observation box, following
Definition 1 we will build a language on the propositional atoms
of P = P_I ⊔ P_O and the context labels in K.</p>
        <p>Additionally, we will need to specify a set N of nominals
denoting every possible world that we consider in our model.
Following the semantics of Definition 2, this set corresponds
to the set of states that are possible within the universal
context. Since all the restrictions in our framework are
contextual and not universal, all the truth value assignments for the
elements of P can hold in the universal context. Thus we
will define N as a set of 2jPj elements, allowing for a
one-toone correspondence between elements of N and all possible
worlds, i.e. truth value assignments, in the semantics.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Definition 7. A Glass Box is given by:</title>
        <p>- A set of propositional atoms P = P_I ⊔ P_O ;
- An interpretation box {Γ_c | c ∈ K} on a hierarchy P on P_I ;
- An associated observation box {Θ_c | c ∈ K} on P_O ;
- A set N of 2^|P| elements.</p>
        <p>Given a Glass Box, we can build language L^{u,¬}_n on P, N
and K′ = K ∪ {u}, where u is an additional context name,
following Definition 1. We consider logic Cxt^{u,¬} on this
language.</p>
        <p>For each c ∈ K, let Δ_c = Γ_c ∪ Θ_c. We define the Glass
Box constitution as the conjunction of formulae
GB := ⋀_{c ∈ K} ⋀_{(φ → p) ∈ Δ_c} (φ ⇒^{co}_{c,Δ_c} p) .</p>
        <p>Having encoded the Glass Box in a logical system (see
Figure 2), we can now reason about the statements that hold in
it. With the implementation in mind, we are particularly
interested in classificatory statements, which allow us to describe
for example which combinations of norms count as satisfying
a value in a context. The following definition illustrates some
of the statements which we will want to hold in the Glass
Box.</p>
        <p>Definition 8. We say that an objective formula φ is
incompatible with context c if
⊢ GB → (φ ⇒^{cl}_c ⊥) .
Incompatible formulae imply both a formula and its negation
in context c, and therefore we wish to remove them from the
set of formulae that verify a certain norm or value.</p>
        <p>We say that a combination of functionalities φ^{F^c_N} counts as
value p^{P_0} in context c if it is not incompatible with c and
⊢ GB → (φ^{F^c_N} ⇒^{cl}_c p^{P_0}) .</p>
        <p>We say that a test result φ^{P_O} verifies value p^{P_0} in context c
if it is not incompatible with c and
⊢ GB → (φ^{P_O} ⇒^{cl}_c p^{P_0}) .</p>
        <p>Formulae incompatible with a certain context correspond
to statements that do not make semantic sense: they may be
contradictory by themselves in this context, or lead to
contradictions within the context by for example, implying that
both a value and its negation are satisfied.</p>
        <p>Crucially for an effective implementation, given a certain
test result, we want to answer the question of whether this
result verifies a certain value in a given context. Thus we
need to find a proof of GB → (φ^{P_O} ⇒^{cl}_c p^{P_0}), or to show
that there is no such proof. We therefore need to address the
issue of the search-complexity of our system.</p>
        <p>(Multi)modal logics with a universal modality have an
EXPTIME-complete K-satisfaction problem
[Hemaspaandra, 1996] and adding nominals maintains this bound
[Areces, 2004]. For our system to be suitable for an
implementation, we need to show that the specific queries we will be
posing are answerable in a reasonable time. Furthermore, it
would be desirable to be able to solve the satisfiability
problems we will pose with existing tools.</p>
        <p>[Figure 2: the Glass Box hierarchy, with values, norms, requirements, functionalities and tests linked by counts-as relations.]</p>
        <p>To address both of these points, we will show that
answering questions of the form “does φ count as φ′ in context c
in the Glass Box?” is equivalent to checking whether the
implication φ → φ′ holds propositionally under the assumptions
of Δ_c. This is in fact a very intuitive result: the only
constraints on a context c in the Glass Box are those set up by its
definition through Δ_c, and therefore any deduction in context
c only needs to consider these constraints. We therefore
reduce our question to a satisfiability problem in propositional
logic with a finite number of propositions. In real-life
applications, our human-made vocabulary for norms and values
should remain reasonably small. Additionally, the number of
tests performed needs to remain relatively small as well for
computational reasons. Thus answering queries in our
propositional language should easily remain well within the reach
of SAT-solvers and answer set programming approaches.</p>
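        <p>To illustrate the propositional reduction, the following self-contained sketch (with the same hypothetical atoms as above, and with conjunctions of atoms standing in for arbitrary formulae) answers queries such as “does this test result verify fairness in context c?” by brute-force enumeration of truth assignments; for the small vocabularies envisaged here this is precisely the kind of check that a SAT-solver or an answer set program would perform more efficiently.</p>
        <preformat>
from itertools import product

# Delta_c = Gamma_c plus Theta_c for one hypothetical context, each implication
# given as (conjunction of antecedent atoms, consequent atom).
delta_c = [
    (frozenset({"ratio_check_passed"}), "equal_ratio"),
    (frozenset({"internal_sample_check_passed"}), "internal_preference"),
    (frozenset({"equal_ratio"}), "non_discrimination"),
    (frozenset({"internal_preference"}), "internal_priority"),
    (frozenset({"non_discrimination", "internal_priority"}), "fairness"),
]

atoms = sorted({a for body, _ in delta_c for a in body} | {head for _, head in delta_c})

def models_of(context):
    """All truth assignments (sets of true atoms) satisfying every implication."""
    for bits in product([False, True], repeat=len(atoms)):
        world = {a for a, b in zip(atoms, bits) if b}
        if all(not body.issubset(world) or head in world for body, head in context):
            yield world

def counts_as(antecedent, consequent, context):
    """Classificatory counts-as: the consequent holds in every model of the
    context in which all antecedent atoms hold."""
    return all(consequent in w for w in models_of(context) if antecedent.issubset(w))

def incompatible(antecedent, context):
    """A formula is incompatible with the context if no model of the context satisfies it."""
    return not any(antecedent.issubset(w) for w in models_of(context))

# Does the test result where both checks pass verify fairness in this context?
result = {"ratio_check_passed", "internal_sample_check_passed"}
print(not incompatible(result, delta_c) and counts_as(result, "fairness", delta_c))  # True
        </preformat>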
      </sec>
      <sec id="sec-4-4">
        <title>Proposition 2</title>
        <p>Let φ be an objective formula. We have that
⊢ GB → [c]φ iff ⊢ ⋀Δ_c → φ .</p>
        <p>Proof. Right to left is easy to see: it is given by a direct
deduction in Cxt^{u,¬}. For left to right, consider the model M whose
set of states W is the set of all truth-value assignments over P, with
W_u = W and, for each c ∈ K, W_c the set of assignments satisfying Δ_c,
and where:
- I : P → P(W) assigns to each propositional atom the
set of states where its assignment is true ;
- I : N → P(W) is a one-to-one assignment between the
elements of N and the elements of W.</p>
        <p>M is a model for the language L^{u,¬}_n.</p>
        <p>If we assume that ⊢ GB → [c]φ holds in logic Cxt^{u,¬},
then by soundness M ⊨ GB → [c]φ. Furthermore, it is easy
to see that M ⊨ GB, from the definition of M. Therefore
M ⊨ [c]φ holds.</p>
        <p>Thus, by definition, ∀w′ ∈ W_c : M, w′ ⊨ φ. Therefore,
in every truth-value assignment where Δ_c holds, φ also holds,
i.e. ⋀Δ_c → φ holds propositionally.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Discussion</title>
      <p>The Glass Box approach is at once an approach to software
development, a verification method, and a source of high-level
transparency for intelligent systems. It provides a modular
way of integrating verification with value-based design.</p>
      <p>Achieving trustworthy AI systems is a multifaceted,
complex process, which requires both technical and socio-legal
initiatives and solutions to ensure that we always align an
intelligent system’s goals with human values. Core values, as
well as the processes used for value elicitation, must be made
explicit, and all stakeholders must be involved in this process.
Furthermore, the methods used for the elicitation processes
and the decisions of who is involved in the value
identification process must be clearly identified and documented. Similarly,
all design decisions and options must also be explicitly
reported, linking system features to the social norms and
values that motivate or are affected by them. This should always
be done in ways that provide inspection capabilities (and,
hence, traceability) for code and data sources to ensure that
data provenance is open and fair.</p>
      <p>The formalisation we presented in this paper allows for
implementation while remaining highly versatile: this approach
is not only useful for black boxes, as more information can
easily be included in the hierarchy and the testing.
Furthermore, by including a universal context, we can easily accommodate
context-independent statements that may hold in particular
applications. We aim to expand it into concrete
implementations in answer set programming. Beyond concrete
implementations, further work will include studying the effects of
this type of value-oriented transparency.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work was partially supported by the Wallenberg AI,
Autonomous Systems and Software Program (WASP) funded by
the Knut and Alice Wallenberg Foundation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Aldewereld et al.,
          <year>2010</year>
          ]
          <string-name>
            <given-names>H.</given-names>
            <surname>Aldewereld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Álvarez-Napagao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.P.M.</given-names>
            <surname>Dignum</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Vázquez-Salceda</surname>
          </string-name>
          .
          <article-title>Making Norms Concrete</article-title>
          .
          <source>In Proc. of 9th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS</source>
          <year>2010</year>
          ), pages
          <fpage>807</fpage>
          -
          <lpage>814</lpage>
          , Toronto, Canada,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Aler Tubella et al.,
          <year>2019</year>
          ]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aler Tubella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Theodorou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.P.M.</given-names>
            <surname>Dignum</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Dignum</surname>
          </string-name>
          .
          <article-title>Governance by GlassBox: Implementing Transparent Moral Bounds for AI Behaviour</article-title>
          .
          <source>In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI'2019)</source>
          ,
          <year>2019</year>
          . To appear.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>[Ananny and Crawford</source>
          , 2018]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ananny</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Crawford</surname>
          </string-name>
          .
          <article-title>Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability</article-title>
          .
          <source>New Media &amp; Society</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ):
          <fpage>973</fpage>
          -
          <lpage>989</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <source>[Areces</source>
          , 2004]
          <string-name>
            <given-names>C</given-names>
            <surname>Areces</surname>
          </string-name>
          .
          <article-title>The computational complexity of hybrid temporal logics</article-title>
          .
          <source>Logic Journal of IGPL</source>
          ,
          <volume>8</volume>
          (
          <issue>5</issue>
          ):
          <fpage>653</fpage>
          -
          <lpage>679</lpage>
          , 9
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Blackburn et al.,
          <year>2007</year>
          ]
          <string-name>
            <given-names>P.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. van Benthem</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Handbook of modal logic</article-title>
          .
          <source>Elsevier</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Greene et al.,
          <year>2019</year>
          ]
          <string-name>
            <given-names>D.</given-names>
            <surname>Greene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Stark</surname>
          </string-name>
          . Better, nicer, clearer, fairer
          <article-title>: A critical assessment of the movement for ethical artificial intelligence and machine learning</article-title>
          .
          <source>In Proceedings of the 52nd Hawaii International Conference on System Sciences</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Grossi et al.,
          <year>2005</year>
          ]
          <string-name>
            <given-names>D.</given-names>
            <surname>Grossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-J. Ch.</given-names>
            <surname>Meyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.P.M.</given-names>
            <surname>Dignum</surname>
          </string-name>
          .
          <article-title>Modal Logic Investigations in the Semantics of Counts-as</article-title>
          .
          <source>Proceedings of ICAIL'05</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Grossi et al.,
          <year>2008</year>
          ]
          <string-name>
            <given-names>D.</given-names>
            <surname>Grossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-J. Ch.</given-names>
            <surname>Meyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.P.M.</given-names>
            <surname>Dignum</surname>
          </string-name>
          .
          <article-title>The many faces of counts-as: A formal analysis of constitutive rules</article-title>
          .
          <source>Journal of Applied Logic</source>
          ,
          <volume>6</volume>
          (
          <issue>2</issue>
          ):
          <fpage>192</fpage>
          -
          <lpage>217</lpage>
          , 6
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>[Hemaspaandra</source>
          , 1996]
          <string-name>
            <given-names>Edith</given-names>
            <surname>Hemaspaandra</surname>
          </string-name>
          .
          <source>The Price of Universality. Notre Dame Journal of Formal Logic</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ),
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>[Jones and Sergot</source>
          , 1995]
          <string-name>
            <given-names>A.J.I.</given-names>
            <surname>Jones</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Sergot</surname>
          </string-name>
          .
          <article-title>A Formal Characterisation of Institutionalised Power</article-title>
          .
          <source>Logic Journal of IGPL</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>427</fpage>
          -
          <lpage>443</lpage>
          , 6
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Lepri et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lepri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Oliver</surname>
          </string-name>
          , E. Letouzé,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pentland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Vinck</surname>
          </string-name>
          . Fair, transparent, and
          <article-title>accountable algorithmic decision-making processes</article-title>
          .
          <source>Philosophy &amp; Technology</source>
          ,
          <volume>31</volume>
          (
          <issue>4</issue>
          ):
          <fpage>611</fpage>
          -
          <lpage>627</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <source>[Reuters</source>
          , 2018] Reuters.
          <article-title>Amazon ditched AI recruiting tool that favored men for technical jobs</article-title>
          . The Guardian,
          <year>Oct 2018</year>
          . Available at https://www.theguardian.com/technology/2018/oct/10/amazon-hiring-ai-gender-bias-recruiting-engine.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>[Searle</source>
          , 1995]
          <string-name>
            <given-names>J</given-names>
            <surname>Searle.</surname>
          </string-name>
          <article-title>The construction of social reality</article-title>
          . Free Press, New York,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Theodorou et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>A.</given-names>
            <surname>Theodorou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.H.</given-names>
            <surname>Wortham</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.J.</given-names>
            <surname>Bryson</surname>
          </string-name>
          .
          <article-title>Designing and implementing transparency for real time inspection of autonomous robots</article-title>
          .
          <source>Connection Science</source>
          ,
          <volume>29</volume>
          (
          <issue>3</issue>
          ):
          <fpage>230</fpage>
          -
          <lpage>241</lpage>
          , 7
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <source>[Turiel</source>
          , 2002]
          <string-name>
            <surname>E. Turiel.</surname>
          </string-name>
          <article-title>The culture of morality: Social development, context, and conflict</article-title>
          . Cambridge University Press,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>[Van de Poel</surname>
          </string-name>
          ,
          <year>2013</year>
          ] I. Van de Poel.
          <article-title>Translating values into design requirements</article-title>
          .
          <source>In Philosophy and engineering: Reflections on practice, principles and process</source>
          , pages
          <fpage>253</fpage>
          -
          <lpage>266</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Vázquez-Salceda et al.,
          <year>2007</year>
          ]
          <string-name>
            <given-names>J.</given-names>
            <surname>Vázquez-Salceda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Aldewereld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Grossi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.P.M.</given-names>
            <surname>Dignum</surname>
          </string-name>
          .
          <article-title>From human regulations to regulated software agents' behavior</article-title>
          .
          <source>Artificial Intelligence and Law</source>
          ,
          <volume>16</volume>
          (
          <issue>1</issue>
          ):
          <fpage>73</fpage>
          -
          <lpage>87</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>