=Paper= {{Paper |id=Vol-2052/paper13 |storemode=property |title=The Advice Taker 2.0 |pdfUrl=https://ceur-ws.org/Vol-2052/paper13.pdf |volume=Vol-2052 |authors=Loizos Michael |dblpUrl=https://dblp.org/rec/conf/commonsense/Michael17 }} ==The Advice Taker 2.0== https://ceur-ws.org/Vol-2052/paper13.pdf
                                                    The Advice Taker 2.0

                                                        Loizos Michael
                                                    Open University of Cyprus
                                                       loizos@ouc.ac.cy



                             Abstract                                  learning theory, both of which have dealt with the problem of
                                                                       symbolic knowledge manipulation; (ii) to establish formal re-
         Starting from McCarthy’s seminal proposal for the             sults on the realizability of constructing the advice taker; and
         construction of an advice-taking machine, we show             (iii) to present an explicit example of such a construction.
         how tools from abstract argumentation and compu-
         tational learning can help disambiguate and forma-            1.1   Overview
         lize that goal. Our focus is on establishing provable
         guarantees on the realizability of the advice taker,          We introduce in Sections 2–4 a view of the advice taker as
         and on implementing an actual prototype system.               an online learning machine that seeks to approximate a target
                                                                       theory known to an advice giver (e.g., a human) interacting
                                                                       with the advice taker, in a manner that offers PAC-like gua-
1        Introduction                                                  rantees [Valiant, 1984]. Both the target theory and the advice
    However, certain elementary verbal reasoning processes so simple   taker’s hypothesis are formalized in Sections 6–7 through a
    that they can be carried out by any non-feeble minded human have   novel structured argumentation framework. The framework
    yet to be simulated by machine programs. — McCarthy, 1959          accounts for the representation of arguments that entail not
The construction of the advice taker as a means towards the            only inferences, but also actions that the advice taker is assu-
endowment of machines with intelligent behavior was put fo-            med to execute. Unlike work in reinforcement learning, ho-
rward by McCarthy [1959] both as a complement to the use of            wever, the learning of these action-entailing arguments does
programming — the then-prevalent mode of human-machine                 not presuppose the execution of the actions and the receipt of
interaction — and as an intermediate milestone to the even-            a reward indicating their appropriateness in a given context.
tual use of learning from experiences. Programming has cer-            Instead, a supervised learning approach is used to learn all ar-
tainly stuck around as a key way of interacting with machines,         guments, both inference-entailing and action-entailing ones.
while the recent seemingly uncanny ability of learning-based              Facilitated by the underlying structured argumentation fra-
machines to outperform human experts across tasks and do-              mework, and unlike what a straightforward application of the
mains might suggest that McCarthy’s agenda is now dated.               PAC semantics would suggest: the advice taker does not only
   This might have been the case had it not been for a new im-         predict the label of an example — which would correspond to
petus on the viability of modern machine-learning techniques           an extension of the advice taker’s hypothesis — but also pro-
(such as deep learning) to produce human-like intelligence:            vides an explanation on how the prediction is reached; corre-
their lack of operational transparency1 , and their brittleness        spondingly, the advice giver does not return the correct label
— which, somewhat ironically, was the issue plaguing sy-               — which would correspond to an extension of the target the-
stems based on classical logic (like McCarthy’s advice taker)          ory — but instead offers advice, which generally corresponds
that machine learning was supposed to address — as a result            to only a part of an argument, and is both smaller than the full
of their subtle dependency on biases in their training data.           correct label, and cognitively easier for a human to construct.
   We do not intend to debate whether the currently-trending              Our approach, thus, takes a step back from a pure learning-
machine-learning paradigm will eventually succeed in ove-              based view of the advice taker, and a step towards program-
rcoming this impetus. We do contend, however, that McCar-              ming, by treating each piece of advice as a “code snippet” that
thy’s proposal for the development of an advice taker still re-        the advice taker integrates into its current hypothesis. We qu-
mains, today, relevant as one way of thinking about these is-          antify the necessary information content of these pieces of
sues, and perhaps making progress towards addressing them.             advice in Section 5, and show in Section 8 that PAC-like gu-
   In this work we revisit McCarthy’s [1959] proposal see-             arantees can indeed be established for a natural advice giver
king to make three contributions: (i) to show that certain key,        and a simple integration policy. An advice taker that adopts
but admittedly under-specified, aspects of the original propo-         this particular integration policy is fully implemented in SWI-
sal can be made concrete through formal argumentation and              Prolog, while a visual web-based interface is under continual
                                                                       development, with the current version being accessible online
     1
         Randall Munroe’s depiction: https://xkcd.com/1838/.           from: http://cognition.ouc.ac.cy/prudens/.
2      The Advice Taker and the Advice Giver                               the advice giver ends up endowing the former with subjective
                                                                           knowledge that reflects the beliefs and biases of the latter. We
    [. . . ] its behavior will be improvable merely by making statements
                                                                           will, indeed, embrace this subjectivity of the knowledge for
    to it [that] will require little if any knowledge of the program or
                                                                           most of this paper, and we will briefly revisit at the end the
    the previous knowledge of the advice taker. — McCarthy, 1959
                                                                           question of how commonsense knowledge can be acquired by
Unlike most contemporary work on Knowledge Representa-                     the advice taker through some form of autonomous learning.
tion and Reasoning, which tacitly assumes that knowledge is
available a priori, we consider the involvement of an explicit
advice giver in an online knowledge acquisition process.                   3     How Should the Advice Taker Elaborate?
   We start with the supposition that the advice taker follows                 One will be able to assume that the advice taker will have availa-
an elaborate-integrate cycle. Each cycle is initiated by an ex-                ble to it a fairly wide class of immediate logical consequences of
ternal or internal stimulus, such as an incoming query by a                    anything it is told and its previous knowledge. — McCarthy, 1959
user, an event occurrence, the lapse of time, etc. The advice
taker continues to sense the current context, which it uses as             The advice taker elaborates a context by producing inferences
input to the processes of that cycle. By context we mean any-              and actions, supported by a set of derivations that are returned
thing visible to the advice giver that might be relevant for the           by its elaboration theory. Whether an elaboration is meaning-
advice taker to consider during that cycle, including data that            ful boils down to how the elaboration theory is revised after
the advice taker may produce through its internal processes.               a piece of advice is received by the advice taker. We shall
As this need not be specified more concretely for the purpo-               require, then, that the advice taker works towards ensuring
ses of the current section, we shall simply assume, for now,               that its elaboration theory eventually is such that t(x) is acce-
that a context x ∈ X is a finitely-representable piece of data.            ptable to the advice giver (i.e., f (x, t(x)) = ∅) sufficiently
   The advice taker proceeds to elaborate the information in               frequently. To make this intuition concrete, we proceed to qu-
context x by drawing inferences that follow from its current               antify the terms “frequently”, “sufficiently”, and“eventually”.
knowledge and x. Some of the elaborations may correspond                      We do not interpret “frequently” narrowly in terms of some
to actions that the advice taker executes, including the action            fraction of contexts that needs to be elaborated in an accepta-
of simply reporting to a user some or all of the inferences that           ble fashion. Indeed, it might be the case that certain contexts
were drawn. Each elaboration is accompanied by an expla-                   occur rarely, and hence are practically inconsequential, while
nation on how it was reached, which is made available to the               a small fraction of contexts might occur frequently enough so
advice giver. With access to those explanations, the advice                that their acceptable elaboration would be critical. To quan-
giver proceeds to offer advice to the advice taker, and the lat-           tify this situation, we shall assume that contexts are drawn in-
ter integrates that advice into its current knowledge, which is            dependently at random from some fixed, but unknown to the
later used to elaborate contexts in subsequent cycles.                     advice taker, probability distribution P over contexts in X .
   As is the case for context, a high-level abstract view of               We acknowledge that this independence assumption might
the data involved during elaboration and integration suffices              not always hold in certain realistic scenarios, where contexts
for now, and a more concrete proposal will be made in the                  across different cycles of interaction might end up being cor-
sequel. We shall assume, therefore, that the advice taker and              related. Nonetheless, we adopt this simplifying assumption if
the advice giver share a set D of finitely-representable deriva-           only for the purposes of formal analysis of the interaction.
tions, which can be thought to be determined by a previously                  Given the above notion of frequency, we say that an elabo-
agreed-upon syntax, such as a fragment of first-order logic.               ration theory t : X → 2D is (1 − ε)-approximately accepta-
   During the elaboration step, the advice taker maps the cur-             ble under P against an advice giver with feedback function
rent context x into a set of derivations, through an elaboration           f if f (x, t(x)) = ∅ for a context x drawn from P, except
theory t : X → 2D that it maintains and revises across cycles.             with probability at most ε over the randomness of sampling
The derivations in t(x) determine the inferences and actions               from P. Instead of fixing “sufficiently” to correspond to a
that follow from x and their corresponding explanations.                   particular value of ε, we shall let the advice taker receive an
   During the integration step, the advice giver computes and              arbitrary value for ε ∈ (0, 1] as a parameter when it is initiali-
offers a finitely-representable advice α ∈ A to the advice ta-             zed, and we will expect the advice taker to eventually produce
ker, by applying a feedback function f : X ×2D → A on the                  a (1 − ε)-approximately acceptable elaboration theory.
current context x and the advice taker’s elaboration t(x). In                 This leads to the question of how long “eventually” is; how
turn, the advice taker integrates advice α into its current ela-           long are we willing to allow the advice taker to be in a trai-
boration theory t to produce a revised version. Although we                ning mode where its elaborations are not sufficiently frequen-
shall not specify, for the time being, how each piece of advice            tly accepted, before producing a desirable elaboration theory.
is integrated, we shall make provisions for A to include the               We are not suggesting that the advice taker should ever stop
symbol ‘∅’ indicating that the advice giver has “no advice”                accepting advice and integrating it in its elaboration theory;
to give during that cycle, or, equivalently, that it finds the ela-        indeed, we view the elaborate-integrate cycle as a perpetual
boration t(x) of the advice taker acceptable, meaning that the             process. Rather, we are looking only to quantify the length of
advice taker is not expected to revise its current elaboration             an initial period during which the advice taker’s elaborations
theory. We shall further assume that for each context x, there             should still be under scrutiny by the advice giver, much like
exists at least one elaboration e(x) ∈ 2D that is acceptable.              how the actions of a novice human assistant are closely mo-
   Necessarily, the interaction between the advice taker and               nitored during an initial training period of their employment.
    On the one hand, the advice taker should be given sufficient       1984], emphasizing tractable operation (through the restricti-
training time to compensate for the burden of its knowledge            ons on the training time g), provable guarantees on performa-
acquisition task, which grows with certain task parameters,            nce (through the parameters δ, ε that quantify the quality of
such as: the representation size n of the drawn contexts, the          the returned elaboration theory), and a crisp specification of
representation size s of the advice giver’s feedback function,         the interaction between the advice taker and the advice giver.
the value of 1/ε (since the task becomes harder as ε beco-             Definition 4.1 (PAC Advice Takers). An advice taker is pro-
mes smaller). We shall assume, therefore, that the training            bably approximately conformant (PAC) to an advice giver
time is some function g(1/ε, n, s). On the other hand, allow-          with a feedback function in a class F (with fixed sets X , D,
ing g(1/ε, n, s) to grow arbitrarily (e.g., exponentially) in its      A) if for every real values δ, ε ∈ (0, 1], every probability
parameters would entail a computationally overly-powerful              distribution P over contexts in X that have a finite represen-
advice taker, and would impose unrealistic demands on the              tation size of at most n, and every feedback function f ∈ F
advice giver. A theoretically-reasonable compromise is to let          with a finite representation size of s, the advice taker is given
g(1/ε, n, s) grow polynomially in its parameters. Pragmati-            access to the parameters δ, ε, n, F, and proceeds as follows:
cally, the degree of the polynomial needs to be a small value.
    Despite our effort to allocate a fair amount of resources to       a context x is drawn independently at random from P; the
the advice taker, we have not accounted for the possibility of         advice taker uses its current (possibly stochastic) elaboration
the advice taker simply being unlucky! Consider, for exam-             theory t : X → 2D to return a set t(x) of derivations; the
ple, a probability distribution P that assigns probability 1/3         advice giver offers the piece of advice f (x, t(x)); the advice
to context x1 and probability 2/3 to context x2 . Suppose,             taker integrates the advice to revise the elaboration theory t.
further, that we set ε := 1/2, and expect the advice taker to
                                                                          After time at most g(1/δ, 1/ε, n, s) the advice taker termi-
produce a 1/2-approximately acceptable elaboration theory t
                                                                       nates and returns, except with probability at most δ, an elabo-
under P against some advice giver. It follows that t(x2 ) sh-
                                                                       ration theory t that is (1−ε)-approximately acceptable under
ould eventually be acceptable by the advice giver. But what
                                                                       P against f . If the running-time function g grows polynomi-
if the advice taker never gets the opportunity to interact with
                                                                       ally in its parameters, then the advice taker is efficient.
the advice giver in context x2 to get advice on what might be
acceptable in that context? Despite x2 being the most pro-                 Definition 4.1 captures concisely the requirements on the
bable context under P, the probability of never drawing it is          interaction and on the elaboration theory of the advice taker
non-zero, irrespectively of any finitely-long training period.         up until the end of the training phase; as we have already di-
    Consequently, we slightly downgrade our expectations, so           scussed, the interaction and revision of the elaboration theory
that the advice taker may fail to eventually produce a desi-           might continue after that. For the purposes of our subsequent
rable elaboration theory, but only with a “sufficiently small”         formal analysis, we have allowed the elaboration theory to be
probability δ. As in our treatment of ε, we let the advice taker       a stochastic function, so that the advice taker can choose to
receive an arbitrary value for δ ∈ (0, 1] as a parameter when          produce random sets of derivations during the training phase,
it is initialized, and we allow it to fail with at most probability    if this happens to be beneficial towards its goal of being PAC.
δ. Correspondingly, we also allow the advice taker’s training          In a later section we will offer an explicit construction of a
time to depend on (and grow polynomially in) 1/δ, so that the          PAC advice taker without appealing to this stochasticity.
advice taker’s training time is a function g(1/δ, 1/ε, n, s).              Although the advice taker interacts with a single predeter-
                                                                       mined advice giver, the latter’s feedback function is generally
4     An Advice Taker with Learning Guarantees                         unknown to the former; had this not been the case, the advice
                                                                       taker could simply be assumed to have access to all the advice
    Our ultimate objective is to make programs that learn from their   the advice giver could provide. On the flip side of things, the
    experience as effectively as humans do. — McCarthy, 1959           advice taker can, and does, have some information about cer-
In the preceding section we had quantified the type of guaran-         tain aspects of the advice giver’s feedback function (such as
tees that we expect the advice taker to offer on its operation.        the set A of all pieces of advice it can possibly return), depen-
As we have already discussed, these guarantees are necessa-            ding on the particular scenario in which the interaction takes
rily tied to the advice giver. In a sense, there is no objectively     place. This partial knowledge of the advice giver’s feedback
correct way for the advice taker to draw inferences and act,           function f is captured in Definition 4.1 by assuming that f
since, in particular, there is no objectively correct elaboration      belongs in a class F that is known to the advice taker, while
theory (or knowledge) that the advice taker can have. Instead,         leaving open the possibility that f could be any member of
the inferences and actions of an elaboration theory (or know-          F, and insisting that the advice taker delivers in all cases.
ledge) can, at best, conform to the advice giver’s feedback.               With the aforementioned clarifications, we can now esta-
   Even with this shift from objective truth to subjective con-        blish formal results on the realizability of a PAC advice taker.
formance, the requirements that we have imposed on the                     Our first result shows that a (possibly non-efficient) PAC
advice taker — to probably and tractably produce an appro-             advice taker always exists, demonstrating, on the one hand,
ximately acceptable elaboration theory — might seem overly             the non-vacuousness of Definition 4.1, and highlighting, on
arduous. In this section we seek to bring these requirements           the other hand, the importance of insisting on efficiency.
together, and establish results on their realizability. Our for-       Theorem 4.1 (Existence of a Universal PAC Advice Taker).
malization adopts and extends the celebrated probably ap-              For every class F, there exists an advice taker that is PAC to
proximately correct semantics for concept learning [Valiant,           an advice giver with any feedback function that belongs in F.
Proof (sketch). Memorize the first m contexts, and map each                   Depending, then, on the paucity of initial information that
through t to the next set of derivations in 2D that have not yet           the advice taker has on which elaborations the advice giver
been paired with it, until all m elaborations are acceptable.              might find acceptable (to the extent that such information fol-
Do not execute any of the actions in the considered elaborati-             lows from the advice taker’s knowledge of F), the constraint
ons. The result follows for an appropriate choice of m.                    on the efficiency of the advice taker has an effect on the rich-
                                                                           ness of the advice that the advice giver needs to offer.
   The advice taker constructed in the proof of Theorem 4.1
does not really utilize any advice from the advice giver other
than for the purpose of checking whether its elaboration of                6      Knowledge Representation and Reasoning
each given context is acceptable. This minimal reliance on the                 [. . . ] probably have a single rule of inference which will combine
advice is compensated by the long running time of the advice                   substitution for variables with modus ponens. — McCarthy, 1959
taker, which effectively performs a blind systematic search of
                                                                           Our emphasis up to this point was on formalizing and quan-
an appropriately selected subspace of all elaboration theories.
                                                                           tifying certain aspects of the interaction between the advice
                                                                           taker and the advice giver. Before giving an explicit constru-
5        How Should the Advice Giver Advise?                               ction of the two actors, we must make concrete the language
    In order for a program to be capable of learning something it must     in which the advice taker represents the knowledge that it re-
    first be capable of being told it. In fact, in the early versions we   asons with and the advice giver represents the advice it offers.
    shall concentrate entirely on this point [. . . ] — McCarthy, 1959        Recall that the role of the advice taker’s reasoning module
In view of Theorem 4.1, a natural next step is to examine how              is not to identify certain objective truths, nor to deduce what
the advice offered by the advice giver can be utilized more                is logically entailed by a given context and the available kno-
effectively by the advice taker towards constructing its elabo-            wledge. Instead, the role of reasoning is to produce elaborati-
ration theory. Since we are still approaching the interaction              ons that are acceptable by the advice giver. This subjectivity
of the advice taker and the advice giver at an abstract level,             lessens the burden on the consideration of any particular type
we cannot yet discuss the structure of this advice, but we can,            of reasoning semantics for the advice taker, since the intera-
however, examine the amount of information that it conveys.                ction with the advice giver can finely-tune the produced ela-
   To quantify the amount of such information that might be                borations by customizing the former’s knowledge. This is de-
necessary, we first introduce an auxiliary definition. Consider            cidedly demonstrated by Theorem 4.1, which establishes the
a partition u of 2D of size |u|, and denote by u[i] its i-th               existence of a PAC advice taker without reliance on the reaso-
element in some order. We shall say that u is shattered by a               ning semantics. The primary role of the reasoning semantics,
class F of feedback functions if there exists a context x ∈ X              then, is to support the flexible revision of the knowledge. Our
such that for every i ∈ {1, 2, . . . , |u|}, there exists a feedback       proposal below should, therefore, be read in this spirit, as a
function f ∈ F for which {e | f (x, e) = ∅} ⊆ u[i]. The                    reasonable candidate for the representational syntax and the
(information) sparsity spr(F) of a class F is the maximum                  reasoning semantics, but by no means as the only option.
value |u| for a chosen partition u of 2D that is shattered by F.              We consider a fixed language L that assigns finite names
   Intuitively, the sparsity of a class F captures how little in-          to the members of the two non-overlapping sets of constants
formation is conveyed to the advice taker, in a worst-case con-            and variables, and defines the following constructs: An atom
text x, towards identifying what is acceptable to the advice               is a constant or a variable. A predicate is a constant followed
giver, after the advice taker has successfully eliminated a fe-            by a finite parentheses-surrounded comma-separated list of
edback function in F from being the one actually used by the               atoms; these atoms are the predicate’s arguments, and their
advice giver. Consequently, the higher the sparsity of F, the              multitude is the predicate’s arity. A literal is a predicate or a
less information the advice taker receives after each elimina-             predicate prefixed by the symbol ‘¬’. A query is a predicate
tion, and the more feedback functions it needs to eliminate                prefixed by the symbol ‘?’. An action is a predicate prefixed
before gathering enough information to meet its goal.2                     by the symbol ‘!’. A term is a literal, a query, or an action.
                                                                           Two variable-free terms are conflicting if they are the literals
Theorem 5.1 (Necessary Feedback for Efficient PAC Advice                   λ and ¬λ for a predicate λ. Following Prolog convention, we
Takers). If there exists an advice taker that is PAC to an                 capitalize the first letter of variables but not of constants.
advice giver with any feedback function f : X × 2D → A                        A context in X is a collection of variable-free literals. A
in class F, and runs in time g(·), then |A|g(·) is Ω(spr(F)).              rule in R is an expression of the form ϕ             τ, where the
Proof (sketch). An advice taker receives at most g(·) pieces               body ϕ of the rule is a finite comma-separated list of literals
of advice and, therefore, at most log |A| · g(·) bits of informa-          or queries, and the head τ of the rule is a literal or an action
tion, each eliminating at most half of the remaining parts of u            whose variables all appear in some of the terms of ϕ. An
from including the acceptable elaborations according to f . If             inferential rule is a rule with a literal head, while a produ-
|A|g(·) ≤ (1/2) · spr(F), then at least two parts of u are not             ction rule is a rule with an action head. An instance of a rule
eliminated, giving a t with an expected error of at least 1/2.             is an expression obtained from the rule after all variables in
The result follows for appropriate choices of δ and ε.                     the rule’s terms are replaced with constants, so that all appe-
                                                                           arances of the same variable across the rule map to the same
     2
     This measure of the lack of information in a worst-case example       constant. Two (inferential) rule instances are conflicting if
differs from VC-dimension [Kearns and Vazirani, 1994], which aims          their (variable-free) head terms are conflicting. Two (inferen-
to lower-bound the number of distinct examples needed to learn.            tial) rules are conflicting if they have conflicting instances.
   A knowledge base κ = h%, i comprises a finite collection              We conclude this section with an example knowledge base.
% ⊆ R of rules, and an irreflexive antisymmetric priority re-         Example 6.1. A phone assistant implementing a user’s po-
lation  that is a subset of % × % and includes pairs of con-         licy, with rules in increasing order of priority. Queries refer
flicting (inferential) rules only. For simplicity of exposition,      to built-in or easy-to-implement Prolog predicates, and acti-
we may occasionally assume that rules in % are named, and             ons refer to Prolog predicates that control phone functions.
define  over their names. The priority relation is lifted to
apply on rule instances also, in the natural manner. A know-          Policy: Decline calls when busy; e.g., when in a meeting at work.
ledge base is totally-prioritized if for every pair of conflicting    Calls from family members are important. Send an SMS when decli-
(inferential) rules r1 , r2 ∈ % either r1  r2 or r2  r1 holds.      ning important calls. Repeated calls are urgent; unless coming from
For a totally-prioritized knowledge base, the priority relation       “serial” repeated callers. Repeated important calls are urgent. An-
                                                                      swer urgent calls. Answer calls from colleagues that are expected.
need not be specified explicitly, as it can be determined by the
order in which rules are represented in the knowledge base.           my pos(P1), work pos(P2), ?dist(P1,P2,D), ?(D<50) at work
   The advice taker reasons with such a knowledge base with           calendar(From,To), time(T), ?(From