=Paper=
{{Paper
|id=Vol-2052/paper13
|storemode=property
|title=The Advice Taker 2.0
|pdfUrl=https://ceur-ws.org/Vol-2052/paper13.pdf
|volume=Vol-2052
|authors=Loizos Michael
|dblpUrl=https://dblp.org/rec/conf/commonsense/Michael17
}}
==The Advice Taker 2.0==
The Advice Taker 2.0
Loizos Michael
Open University of Cyprus
loizos@ouc.ac.cy
Abstract

Starting from McCarthy's seminal proposal for the construction of an advice-taking machine, we show how tools from abstract argumentation and computational learning can help disambiguate and formalize that goal. Our focus is on establishing provable guarantees on the realizability of the advice taker, and on implementing an actual prototype system.

1 Introduction

However, certain elementary verbal reasoning processes so simple that they can be carried out by any non-feeble minded human have yet to be simulated by machine programs. — McCarthy, 1959

The construction of the advice taker as a means towards the endowment of machines with intelligent behavior was put forward by McCarthy [1959] both as a complement to the use of programming — the then-prevalent mode of human-machine interaction — and as an intermediate milestone to the eventual use of learning from experiences. Programming has certainly stuck around as a key way of interacting with machines, while the recent seemingly uncanny ability of learning-based machines to outperform human experts across tasks and domains might suggest that McCarthy's agenda is now dated. This might have been the case had it not been for a new impasse on the viability of modern machine-learning techniques (such as deep learning) to produce human-like intelligence: their lack of operational transparency¹, and their brittleness — which, somewhat ironically, was the issue plaguing systems based on classical logic (like McCarthy's advice taker) that machine learning was supposed to address — as a result of their subtle dependency on biases in their training data.

We do not intend to debate whether the currently-trending machine-learning paradigm will eventually succeed in overcoming this impasse. We do contend, however, that McCarthy's proposal for the development of an advice taker remains relevant today as one way of thinking about these issues, and perhaps of making progress towards addressing them.

In this work we revisit McCarthy's [1959] proposal, seeking to make three contributions: (i) to show that certain key, but admittedly under-specified, aspects of the original proposal can be made concrete through formal argumentation and learning theory, both of which have dealt with the problem of symbolic knowledge manipulation; (ii) to establish formal results on the realizability of constructing the advice taker; and (iii) to present an explicit example of such a construction.

¹ Randall Munroe's depiction: https://xkcd.com/1838/.

1.1 Overview

We introduce in Sections 2–4 a view of the advice taker as an online learning machine that seeks to approximate a target theory known to an advice giver (e.g., a human) interacting with the advice taker, in a manner that offers PAC-like guarantees [Valiant, 1984]. Both the target theory and the advice taker's hypothesis are formalized in Sections 6–7 through a novel structured argumentation framework. The framework accounts for the representation of arguments that entail not only inferences, but also actions that the advice taker is assumed to execute. Unlike work in reinforcement learning, however, the learning of these action-entailing arguments does not presuppose the execution of the actions and the receipt of a reward indicating their appropriateness in a given context. Instead, a supervised learning approach is used to learn all arguments, both inference-entailing and action-entailing ones.

Facilitated by the underlying structured argumentation framework, and unlike what a straightforward application of the PAC semantics would suggest, the advice taker does not only predict the label of an example — which would correspond to an extension of the advice taker's hypothesis — but also provides an explanation of how the prediction is reached; correspondingly, the advice giver does not return the correct label — which would correspond to an extension of the target theory — but instead offers advice, which generally corresponds to only a part of an argument, and is both smaller than the full correct label and cognitively easier for a human to construct.

Our approach, thus, takes a step back from a pure learning-based view of the advice taker, and a step towards programming, by treating each piece of advice as a "code snippet" that the advice taker integrates into its current hypothesis. We quantify the necessary information content of these pieces of advice in Section 5, and show in Section 8 that PAC-like guarantees can indeed be established for a natural advice giver and a simple integration policy. An advice taker that adopts this particular integration policy is fully implemented in SWI-Prolog, while a visual web-based interface is under continual development, with the current version being accessible online from: http://cognition.ouc.ac.cy/prudens/.
2 The Advice Taker and the Advice Giver

[...] its behavior will be improvable merely by making statements to it [that] will require little if any knowledge of the program or the previous knowledge of the advice taker. — McCarthy, 1959

Unlike most contemporary work on Knowledge Representation and Reasoning, which tacitly assumes that knowledge is available a priori, we consider the involvement of an explicit advice giver in an online knowledge acquisition process.

We start with the supposition that the advice taker follows an elaborate-integrate cycle. Each cycle is initiated by an external or internal stimulus, such as an incoming query by a user, an event occurrence, the lapse of time, etc. The advice taker continues to sense the current context, which it uses as input to the processes of that cycle. By context we mean anything visible to the advice giver that might be relevant for the advice taker to consider during that cycle, including data that the advice taker may produce through its internal processes. As this need not be specified more concretely for the purposes of the current section, we shall simply assume, for now, that a context x ∈ X is a finitely-representable piece of data.

The advice taker proceeds to elaborate the information in context x by drawing inferences that follow from its current knowledge and x. Some of the elaborations may correspond to actions that the advice taker executes, including the action of simply reporting to a user some or all of the inferences that were drawn. Each elaboration is accompanied by an explanation on how it was reached, which is made available to the advice giver. With access to those explanations, the advice giver proceeds to offer advice to the advice taker, and the latter integrates that advice into its current knowledge, which is later used to elaborate contexts in subsequent cycles.

As is the case for context, a high-level abstract view of the data involved during elaboration and integration suffices for now, and a more concrete proposal will be made in the sequel. We shall assume, therefore, that the advice taker and the advice giver share a set D of finitely-representable derivations, which can be thought to be determined by a previously agreed-upon syntax, such as a fragment of first-order logic.

During the elaboration step, the advice taker maps the current context x into a set of derivations, through an elaboration theory t : X → 2^D that it maintains and revises across cycles. The derivations in t(x) determine the inferences and actions that follow from x and their corresponding explanations.

During the integration step, the advice giver computes and offers a finitely-representable advice α ∈ A to the advice taker, by applying a feedback function f : X × 2^D → A on the current context x and the advice taker's elaboration t(x). In turn, the advice taker integrates advice α into its current elaboration theory t to produce a revised version. Although we shall not specify, for the time being, how each piece of advice is integrated, we shall make provisions for A to include the symbol '∅' indicating that the advice giver has "no advice" to give during that cycle, or, equivalently, that it finds the elaboration t(x) of the advice taker acceptable, meaning that the advice taker is not expected to revise its current elaboration theory. We shall further assume that for each context x, there exists at least one elaboration e(x) ∈ 2^D that is acceptable.

Necessarily, the interaction between the advice taker and the advice giver ends up endowing the former with subjective knowledge that reflects the beliefs and biases of the latter. We will, indeed, embrace this subjectivity of the knowledge for most of this paper, and we will briefly revisit at the end the question of how commonsense knowledge can be acquired by the advice taker through some form of autonomous learning.

3 How Should the Advice Taker Elaborate?

One will be able to assume that the advice taker will have available to it a fairly wide class of immediate logical consequences of anything it is told and its previous knowledge. — McCarthy, 1959

The advice taker elaborates a context by producing inferences and actions, supported by a set of derivations that are returned by its elaboration theory. Whether an elaboration is meaningful boils down to how the elaboration theory is revised after a piece of advice is received by the advice taker. We shall require, then, that the advice taker works towards ensuring that its elaboration theory eventually is such that t(x) is acceptable to the advice giver (i.e., f(x, t(x)) = ∅) sufficiently frequently. To make this intuition concrete, we proceed to quantify the terms "frequently", "sufficiently", and "eventually".

We do not interpret "frequently" narrowly in terms of some fraction of contexts that needs to be elaborated in an acceptable fashion. Indeed, it might be the case that certain contexts occur rarely, and hence are practically inconsequential, while a small fraction of contexts might occur frequently enough so that their acceptable elaboration would be critical. To quantify this situation, we shall assume that contexts are drawn independently at random from some fixed, but unknown to the advice taker, probability distribution P over contexts in X. We acknowledge that this independence assumption might not always hold in certain realistic scenarios, where contexts across different cycles of interaction might end up being correlated. Nonetheless, we adopt this simplifying assumption if only for the purposes of formal analysis of the interaction.

Given the above notion of frequency, we say that an elaboration theory t : X → 2^D is (1 − ε)-approximately acceptable under P against an advice giver with feedback function f if f(x, t(x)) = ∅ for a context x drawn from P, except with probability at most ε over the randomness of sampling from P. Instead of fixing "sufficiently" to correspond to a particular value of ε, we shall let the advice taker receive an arbitrary value for ε ∈ (0, 1] as a parameter when it is initialized, and we will expect the advice taker to eventually produce a (1 − ε)-approximately acceptable elaboration theory.

This leads to the question of how long "eventually" is: how long are we willing to allow the advice taker to be in a training mode where its elaborations are not sufficiently frequently accepted, before producing a desirable elaboration theory? We are not suggesting that the advice taker should ever stop accepting advice and integrating it in its elaboration theory; indeed, we view the elaborate-integrate cycle as a perpetual process. Rather, we are looking only to quantify the length of an initial period during which the advice taker's elaborations should still be under scrutiny by the advice giver, much like how the actions of a novice human assistant are closely monitored during an initial training period of their employment.
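The elaborate-integrate cycle and the acceptability criterion f(x, t(x)) = ∅ can be sketched as a small program. The following is a minimal illustration only (in Python rather than the paper's SWI-Prolog; all names are hypothetical, the round-robin sampler merely stands in for i.i.d. draws from P, and the memorizing integration policy is a deliberately naive placeholder, not the paper's):

```python
import itertools

def advice_taker_cycle(draw_context, feedback, integrate, theory, steps):
    """Run the elaborate-integrate cycle for a fixed number of training steps.

    draw_context: samples a context x (standing in for draws from P)
    feedback:     the advice giver's function f(x, t(x)); None plays the role of the symbol '0' (no advice)
    integrate:    revises the elaboration theory using a piece of advice
    theory:       the elaboration theory t, here a mapping from contexts to derivation sets
    """
    for _ in range(steps):
        x = draw_context()                         # cycle initiated by a stimulus
        elaboration = theory.get(x, frozenset())   # t(x): the advice taker elaborates x
        advice = feedback(x, elaboration)          # the advice giver inspects the elaboration
        if advice is not None:                     # None: t(x) was acceptable as-is
            theory = integrate(theory, x, advice)  # otherwise, integrate the advice
    return theory

# Toy instantiation: two contexts, each with a single acceptable elaboration.
target = {frozenset({"call", "busy"}): frozenset({"!decline"}),
          frozenset({"call", "family"}): frozenset({"!answer"})}

def feedback(x, e):
    # Acceptable elaborations elicit "no advice"; otherwise the full answer is
    # returned as advice (far more than a real advice giver would need to offer).
    return None if e == target[x] else target[x]

def integrate(theory, x, advice):
    revised = dict(theory)
    revised[x] = advice                            # naive memorizing integration
    return revised

sampler = itertools.cycle(target)                  # deterministic stand-in for P
learned = advice_taker_cycle(lambda: next(sampler), feedback, integrate, {}, steps=10)
print(learned == target)                           # True
```

In the paper's terms, `feedback` returning `None` corresponds to f(x, t(x)) = ∅, and an elaboration theory for which this happens with probability at least 1 − ε over draws from P is (1 − ε)-approximately acceptable.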
On the one hand, the advice taker should be given sufficient training time to compensate for the burden of its knowledge acquisition task, which grows with certain task parameters, such as: the representation size n of the drawn contexts, the representation size s of the advice giver's feedback function, and the value of 1/ε (since the task becomes harder as ε becomes smaller). We shall assume, therefore, that the training time is some function g(1/ε, n, s). On the other hand, allowing g(1/ε, n, s) to grow arbitrarily (e.g., exponentially) in its parameters would entail a computationally overly-powerful advice taker, and would impose unrealistic demands on the advice giver. A theoretically-reasonable compromise is to let g(1/ε, n, s) grow polynomially in its parameters. Pragmatically, the degree of the polynomial needs to be a small value.

Despite our effort to allocate a fair amount of resources to the advice taker, we have not accounted for the possibility of the advice taker simply being unlucky! Consider, for example, a probability distribution P that assigns probability 1/3 to context x1 and probability 2/3 to context x2. Suppose, further, that we set ε := 1/2, and expect the advice taker to produce a 1/2-approximately acceptable elaboration theory t under P against some advice giver. It follows that t(x2) should eventually be acceptable by the advice giver. But what if the advice taker never gets the opportunity to interact with the advice giver in context x2 to get advice on what might be acceptable in that context? Despite x2 being the most probable context under P, the probability of never drawing it is non-zero, irrespectively of any finitely-long training period.

Consequently, we slightly downgrade our expectations, so that the advice taker may fail to eventually produce a desirable elaboration theory, but only with a "sufficiently small" probability δ. As in our treatment of ε, we let the advice taker receive an arbitrary value for δ ∈ (0, 1] as a parameter when it is initialized, and we allow it to fail with probability at most δ. Correspondingly, we also allow the advice taker's training time to depend on (and grow polynomially in) 1/δ, so that the advice taker's training time is a function g(1/δ, 1/ε, n, s).

4 An Advice Taker with Learning Guarantees

Our ultimate objective is to make programs that learn from their experience as effectively as humans do. — McCarthy, 1959

In the preceding section we quantified the type of guarantees that we expect the advice taker to offer on its operation. As we have already discussed, these guarantees are necessarily tied to the advice giver. In a sense, there is no objectively correct way for the advice taker to draw inferences and act, since, in particular, there is no objectively correct elaboration theory (or knowledge) that the advice taker can have. Instead, the inferences and actions of an elaboration theory (or knowledge) can, at best, conform to the advice giver's feedback.

Even with this shift from objective truth to subjective conformance, the requirements that we have imposed on the advice taker — to probably and tractably produce an approximately acceptable elaboration theory — might seem overly arduous. In this section we seek to bring these requirements together, and establish results on their realizability. Our formalization adopts and extends the celebrated probably approximately correct semantics for concept learning [Valiant, 1984], emphasizing tractable operation (through the restrictions on the training time g), provable guarantees on performance (through the parameters δ, ε that quantify the quality of the returned elaboration theory), and a crisp specification of the interaction between the advice taker and the advice giver.

Definition 4.1 (PAC Advice Takers). An advice taker is probably approximately conformant (PAC) to an advice giver with a feedback function in a class F (with fixed sets X, D, A) if for all real values δ, ε ∈ (0, 1], every probability distribution P over contexts in X that have a finite representation size of at most n, and every feedback function f ∈ F with a finite representation size of s, the advice taker is given access to the parameters δ, ε, n, F, and proceeds as follows: a context x is drawn independently at random from P; the advice taker uses its current (possibly stochastic) elaboration theory t : X → 2^D to return a set t(x) of derivations; the advice giver offers the piece of advice f(x, t(x)); the advice taker integrates the advice to revise the elaboration theory t. After time at most g(1/δ, 1/ε, n, s) the advice taker terminates and returns, except with probability at most δ, an elaboration theory t that is (1 − ε)-approximately acceptable under P against f. If the running-time function g grows polynomially in its parameters, then the advice taker is efficient.

Definition 4.1 captures concisely the requirements on the interaction and on the elaboration theory of the advice taker up until the end of the training phase; as we have already discussed, the interaction and revision of the elaboration theory might continue after that. For the purposes of our subsequent formal analysis, we have allowed the elaboration theory to be a stochastic function, so that the advice taker can choose to produce random sets of derivations during the training phase, if this happens to be beneficial towards its goal of being PAC. In a later section we will offer an explicit construction of a PAC advice taker without appealing to this stochasticity.

Although the advice taker interacts with a single predetermined advice giver, the latter's feedback function is generally unknown to the former; had this not been the case, the advice taker could simply be assumed to have access to all the advice the advice giver could provide. On the flip side of things, the advice taker can, and does, have some information about certain aspects of the advice giver's feedback function (such as the set A of all pieces of advice it can possibly return), depending on the particular scenario in which the interaction takes place. This partial knowledge of the advice giver's feedback function f is captured in Definition 4.1 by assuming that f belongs in a class F that is known to the advice taker, while leaving open the possibility that f could be any member of F, and insisting that the advice taker delivers in all cases.

With the aforementioned clarifications, we can now establish formal results on the realizability of a PAC advice taker. Our first result shows that a (possibly non-efficient) PAC advice taker always exists, demonstrating, on the one hand, the non-vacuousness of Definition 4.1, and highlighting, on the other hand, the importance of insisting on efficiency.

Theorem 4.1 (Existence of a Universal PAC Advice Taker). For every class F, there exists an advice taker that is PAC to an advice giver with any feedback function that belongs in F.
Proof (sketch). Memorize the first m contexts, and map each through t to the next set of derivations in 2^D that have not yet been paired with it, until all m elaborations are acceptable. Do not execute any of the actions in the considered elaborations. The result follows for an appropriate choice of m.

The advice taker constructed in the proof of Theorem 4.1 does not really utilize any advice from the advice giver other than for the purpose of checking whether its elaboration of each given context is acceptable. This minimal reliance on the advice is compensated by the long running time of the advice taker, which effectively performs a blind systematic search of an appropriately selected subspace of all elaboration theories.

5 How Should the Advice Giver Advise?

In order for a program to be capable of learning something it must first be capable of being told it. In fact, in the early versions we shall concentrate entirely on this point [...] — McCarthy, 1959

In view of Theorem 4.1, a natural next step is to examine how the advice offered by the advice giver can be utilized more effectively by the advice taker towards constructing its elaboration theory. Since we are still approaching the interaction of the advice taker and the advice giver at an abstract level, we cannot yet discuss the structure of this advice, but we can, however, examine the amount of information that it conveys.

To quantify the amount of such information that might be necessary, we first introduce an auxiliary definition. Consider a partition u of 2^D of size |u|, and denote by u[i] its i-th element in some order. We shall say that u is shattered by a class F of feedback functions if there exists a context x ∈ X such that for every i ∈ {1, 2, . . . , |u|}, there exists a feedback function f ∈ F for which {e | f(x, e) = ∅} ⊆ u[i]. The (information) sparsity spr(F) of a class F is the maximum value |u| for a chosen partition u of 2^D that is shattered by F.

Intuitively, the sparsity of a class F captures how little information is conveyed to the advice taker, in a worst-case context x, towards identifying what is acceptable to the advice giver, after the advice taker has successfully eliminated a feedback function in F from being the one actually used by the advice giver. Consequently, the higher the sparsity of F, the less information the advice taker receives after each elimination, and the more feedback functions it needs to eliminate before gathering enough information to meet its goal.²

Theorem 5.1 (Necessary Feedback for Efficient PAC Advice Takers). If there exists an advice taker that is PAC to an advice giver with any feedback function f : X × 2^D → A in class F, and runs in time g(·), then |A|^g(·) is Ω(spr(F)).

Proof (sketch). An advice taker receives at most g(·) pieces of advice and, therefore, at most log |A| · g(·) bits of information, each eliminating at most half of the remaining parts of u from including the acceptable elaborations according to f. If |A|^g(·) ≤ (1/2) · spr(F), then at least two parts of u are not eliminated, giving a t with an expected error of at least 1/2. The result follows for appropriate choices of δ and ε.

² This measure of the lack of information in a worst-case example differs from the VC-dimension [Kearns and Vazirani, 1994], which aims to lower-bound the number of distinct examples needed to learn.

Depending, then, on the paucity of initial information that the advice taker has on which elaborations the advice giver might find acceptable (to the extent that such information follows from the advice taker's knowledge of F), the constraint on the efficiency of the advice taker has an effect on the richness of the advice that the advice giver needs to offer.

6 Knowledge Representation and Reasoning

[...] probably have a single rule of inference which will combine substitution for variables with modus ponens. — McCarthy, 1959

Our emphasis up to this point was on formalizing and quantifying certain aspects of the interaction between the advice taker and the advice giver. Before giving an explicit construction of the two actors, we must make concrete the language in which the advice taker represents the knowledge that it reasons with and the advice giver represents the advice it offers.

Recall that the role of the advice taker's reasoning module is not to identify certain objective truths, nor to deduce what is logically entailed by a given context and the available knowledge. Instead, the role of reasoning is to produce elaborations that are acceptable by the advice giver. This subjectivity lessens the burden on the consideration of any particular type of reasoning semantics for the advice taker, since the interaction with the advice giver can finely-tune the produced elaborations by customizing the former's knowledge. This is decidedly demonstrated by Theorem 4.1, which establishes the existence of a PAC advice taker without reliance on the reasoning semantics. The primary role of the reasoning semantics, then, is to support the flexible revision of the knowledge. Our proposal below should, therefore, be read in this spirit, as a reasonable candidate for the representational syntax and the reasoning semantics, but by no means as the only option.

We consider a fixed language L that assigns finite names to the members of the two non-overlapping sets of constants and variables, and defines the following constructs: An atom is a constant or a variable. A predicate is a constant followed by a finite parentheses-surrounded comma-separated list of atoms; these atoms are the predicate's arguments, and their multitude is the predicate's arity. A literal is a predicate or a predicate prefixed by the symbol '¬'. A query is a predicate prefixed by the symbol '?'. An action is a predicate prefixed by the symbol '!'. A term is a literal, a query, or an action. Two variable-free terms are conflicting if they are the literals λ and ¬λ for a predicate λ. Following Prolog convention, we capitalize the first letter of variables but not of constants.

A context in X is a collection of variable-free literals. A rule in R is an expression of the form ϕ ⇝ τ, where the body ϕ of the rule is a finite comma-separated list of literals or queries, and the head τ of the rule is a literal or an action whose variables all appear in some of the terms of ϕ. An inferential rule is a rule with a literal head, while a production rule is a rule with an action head. An instance of a rule is an expression obtained from the rule after all variables in the rule's terms are replaced with constants, so that all appearances of the same variable across the rule map to the same constant. Two (inferential) rule instances are conflicting if their (variable-free) head terms are conflicting. Two (inferential) rules are conflicting if they have conflicting instances.
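The syntactic constructs above map naturally onto simple data structures. The sketch below is one possible encoding (in Python rather than the paper's SWI-Prolog, purely for illustration; the class and function names are our own) of terms, rules, and the conflict relation on variable-free literals:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Term:
    """A predicate with an optional prefix: '' or '¬' (literal), '?' (query), '!' (action)."""
    name: str                 # predicate name: a constant, lowercase by convention
    args: Tuple[str, ...]     # atoms: constants (lowercase) or variables (Capitalized)
    prefix: str = ""

def is_variable(atom: str) -> bool:
    return atom[:1].isupper()          # Prolog convention: variables are capitalized

def is_variable_free(t: Term) -> bool:
    return not any(is_variable(a) for a in t.args)

def conflicting(t1: Term, t2: Term) -> bool:
    """Two variable-free terms conflict iff they are the literals λ and ¬λ."""
    return (is_variable_free(t1) and is_variable_free(t2)
            and t1.name == t2.name and t1.args == t2.args
            and {t1.prefix, t2.prefix} == {"", "¬"})

@dataclass(frozen=True)
class Rule:
    body: Tuple[Term, ...]    # literals or queries
    head: Term                # a literal (inferential rule) or an action (production rule)

    def is_inferential(self) -> bool:
        return self.head.prefix in ("", "¬")

# Example: the variable-free literals important(call7) and ¬important(call7) conflict.
lit = Term("important", ("call7",))
neg = Term("important", ("call7",), "¬")
print(conflicting(lit, neg))   # True
print(conflicting(lit, lit))   # False
```

Checking conflicts between rules (rather than rule instances) would additionally require unifying their head terms, which is omitted here for brevity.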
A knowledge base κ = ⟨ϱ, ≻⟩ comprises a finite collection ϱ ⊆ R of rules, and an irreflexive antisymmetric priority relation ≻ that is a subset of ϱ × ϱ and includes pairs of conflicting (inferential) rules only. For simplicity of exposition, we may occasionally assume that rules in ϱ are named, and define ≻ over their names. The priority relation is lifted to apply on rule instances also, in the natural manner. A knowledge base is totally-prioritized if for every pair of conflicting (inferential) rules r1, r2 ∈ ϱ either r1 ≻ r2 or r2 ≻ r1 holds. For a totally-prioritized knowledge base, the priority relation need not be specified explicitly, as it can be determined by the order in which rules are represented in the knowledge base.

The advice taker reasons with such a knowledge base with

We conclude this section with an example knowledge base.

Example 6.1. A phone assistant implementing a user's policy, with rules in increasing order of priority. Queries refer to built-in or easy-to-implement Prolog predicates, and actions refer to Prolog predicates that control phone functions.

Policy: Decline calls when busy; e.g., when in a meeting at work. Calls from family members are important. Send an SMS when declining important calls. Repeated calls are urgent; unless coming from "serial" repeated callers. Repeated important calls are urgent. Answer urgent calls. Answer calls from colleagues that are expected.

my_pos(P1), work_pos(P2), ?dist(P1,P2,D), ?(D<50) ⇝ at_work
calendar(From,To), time(T), ?(From