Temporal Planning in Dynamic Environments for P-CLAIM Agents

     Muhammad Adnan Hashmi                        Amal El Fallah Seghrouchni
Laboratoire d’Informatique de Paris 6        Laboratoire d’Informatique de Paris 6
  Université Pierre et Marie Curie             Université Pierre et Marie Curie
        75016 Paris, France                          75016 Paris, France
   Email: Adnan.Hashmi@lip6.fr                   Email: Amal.Elfallah@lip6.fr



Abstract—Time and uncertainty of the environment are very important aspects in the development of real world applications. Another important issue for real world agents is the balance between deliberation and reactivity. But most agent oriented programming languages ignore some or all of these important aspects. In this paper we try to fill this gap by presenting an extension to the architecture of the CLAIM agent oriented programming language that endows agents with a planning capability. We remove the assumption that agents' actions are instantaneous. We are interested in the temporal planning of on-the-fly goals. A coherent framework is proposed in which agents are able to generate, monitor and repair their temporal plans. Our proposed framework creates a balance between reactivity and deliberation. This work can be considered a first step towards a complete temporal planning solution for an AOP language.

I. INTRODUCTION

Most agent oriented programming languages in the current literature use a PRS-like approach to achieve the goals of the agent. Some examples of these programming languages are Jason[1], 3APL[2], 2APL[3] and JACK[4]. But these languages lack the ability to incorporate planning. Sometimes the execution of actions without planning results in the inability to achieve the goals. There has been some work to incorporate planning within such programming languages [5], [6], [7], but these systems do not take into account the duration of agent actions, nor do they consider the uncertainty of the environment. These systems assume that the agents' actions are instantaneous and that the effects produced on the environment are only those produced by the agent's actions. But these assumptions are unrealistic for the development of real world applications. There are systems like ZENO[8], TGP[9] and SAPA[10] which provide the ability to plan with durative tasks, and there are even systems, like IxTeT[11], which provide this ability in dynamic environments. But these systems are separate planning solutions. They are not programming languages, so they lack the flexibility and control that a programming language offers to its programmers. Moreover, these systems are built on a purely proactive approach, whereas in real world applications it is necessary to create a balance between proactivity and reactivity: the world is dynamic, the goals of an agent are not necessarily all given at the start, new goals arrive and some old goals are dropped during the life cycle of the agent, and some goals require immediate achievement. In this work, we try to fill these gaps by adding a temporal planner, an executor, an execution monitor and a plan repairing component to a CLAIM agent[12]. We call this extension of the language P-CLAIM. The main problems dealt with in this work are 1) modifications and extensions to the CLAIM agent's architecture to include a temporal planning component, 2) execution monitoring and plan repairing, and 3) creating a balance between deliberation and reactivity.

In our proposed framework, we have made use of the Hierarchical Task Network (HTN) planning technique. The main algorithm used to generate a plan for a goal is JSHOP2[13], which is a very efficient HTN planning system and plans for tasks in the same order that they will later be executed. The main motivation behind using the HTN planning technique is the similarity between the BDI model of agency and HTN planning[14]. Due to these similarities, HTN planning is a more suitable and natural candidate for incorporating planning in a BDI-like system.

The remainder of this paper is organized as follows. Section 2 sheds some light on the current architecture of the CLAIM language and the JSHOP2 planner. In Section 3, some important representations are presented which are helpful in understanding the agent architecture in P-CLAIM. Our proposed architecture of a P-CLAIM agent with planning, execution and plan repairing components is presented in Section 4. In Section 5, we give an example to describe the working of the system. Section 6 discusses some related work. Section 7 concludes the paper and discusses some future directions.

II. BACKGROUND

In this section, we briefly discuss the architecture of the CLAIM language and the JSHOP2 algorithm used to generate a plan. A multi-agent system in CLAIM is a set of distributed hierarchies of agents deployed on computers connected via a network. All the computers have a global clock. With respect to the hierarchical representation, an agent is a node in a hierarchy. It is an autonomous, intelligent and mobile entity. It has a parent and contains (optional) sub-agents, running processes and cognitive elements (e.g. knowledge, goals, capabilities). An agent can dynamically create another agent, and the newly created agent becomes the sub-agent of the creator agent. In addition, an agent has three mobility primitives: in (enter another agent), out (leave another agent) and move (move from one hierarchy to another).
In the CLAIM language, an agent can be defined as follows:

   defineAgent agentName {
       parent=null | agentName ;
       knowledge=null | { (knowledge;)* }
       goals=null | { (goal;)* }
       messages=null | { (message;)* }
       capabilities=null | { (capability;)* }
       processes=null | { (process |)* }
       agents=null | { (agentName;)* }
   }

For a more detailed description of the CLAIM language, we refer to [12].
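To fix intuitions, the sketch below renders this agent skeleton as a plain data structure. This is our own illustration in Python, not part of CLAIM or P-CLAIM; the field names simply mirror the grammar above.

    # Hypothetical Python rendering of the defineAgent skeleton above;
    # CLAIM itself is a separate language, so this record is illustrative only.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class AgentDefinition:
        name: str
        parent: Optional[str] = None            # enclosing agent in the hierarchy
        knowledge: List[str] = field(default_factory=list)
        goals: List[str] = field(default_factory=list)
        messages: List[str] = field(default_factory=list)
        capabilities: List[str] = field(default_factory=list)
        processes: List[str] = field(default_factory=list)
        agents: List[str] = field(default_factory=list)   # names of sub-agents

    roco = AgentDefinition(name="ROCO", goals=["Clean_Table"])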
JSHOP2 is an HTN planning algorithm, and it deals with procedural goals. The domain description required by JSHOP2 consists of methods and operators. A method indicates how to decompose a compound task into partially ordered subtasks. A method has three parts: the task for which the method is to be used, the condition which must be true in the current state to apply the method, and the subtasks that need to be accomplished in order to accomplish that task. An operator is similar to the operators in classical planning, and it tells how to perform a primitive task. It has a condition, a list of add effects and a list of delete effects. Planning proceeds by using the methods to decompose tasks recursively into smaller and smaller subtasks, until the planner reaches primitive tasks that can be performed directly using the planning operators.

The rationale behind choosing JSHOP2 for our work is threefold. Firstly, it is an HTN planner, and the domain information from CLAIM can easily be transformed into the domain information needed by the planner, due to the similarities between BDI-like systems and HTN planning systems discussed in [14]. Secondly, JSHOP2 plans for the actions in the same order that they will later be executed, so it knows the current state at every planning step. This property of the planner can be exploited for interleaving planning with execution, and at every step the planner can plan using the current state of the world. Thirdly, it can call external user-defined functions to check the precondition of a method or an operator, and this property is important for a planning component for CLAIM agents, because in the CLAIM language there can be calls to user-defined functions to check the precondition of capabilities.
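The following minimal sketch illustrates the recursive method/operator decomposition scheme just described. It is a toy rendition of the HTN idea, not JSHOP2 itself: states are flat sets of atoms, and methods and operators are simplified dictionaries of our own design.

    def decompose(task, state, operators, methods, plan):
        # Primitive task: check the precondition, record the action, apply effects.
        if task in operators:
            op = operators[task]
            if not op["pre"] <= state:
                return None
            plan.append(task)
            return (state - op["del"]) | op["add"]
        # Compound task: try every applicable method in turn.
        for m in methods.get(task, []):
            if m["pre"] <= state:
                s, mark = state, len(plan)
                for sub in m["subtasks"]:
                    s = decompose(sub, s, operators, methods, plan)
                    if s is None:
                        del plan[mark:]    # undo the partial decomposition
                        break
                else:
                    return s               # every subtask decomposed successfully
        return None

    # Toy domain: cleaning a table decomposes into a single dusting action.
    operators = {"dust": {"pre": {"at_table"}, "add": {"clean"}, "del": set()}}
    methods = {"clean_table": [{"pre": set(), "subtasks": ["dust"]}]}
    plan = []
    decompose("clean_table", {"at_table"}, operators, methods, plan)
    print(plan)   # ['dust']

Like JSHOP2, the loop above simulates the state in the same order the actions will later be executed, which is precisely the property our Planner exploits to interleave planning with execution.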
III. SOME IMPORTANT REPRESENTATIONS

In this section some important representations are presented which are helpful in understanding the architecture of an agent in P-CLAIM.

A. Domain Representation in P-CLAIM

We have modified the domain representation in CLAIM [12] in order to facilitate the translation to the representation needed by a planner. An agent's capabilities are now divided into actions and activities. Actions are the primitive actions that an agent can perform. Some of the actions are programmer defined, while others are already defined in the language, like the mobility primitives in, out and move. The programmer can also override the already defined actions to capture his requirements more accurately. An action consists of a condition, a triggering message, the effects and a duration. TriggerMessage(Act) returns the triggering message of an action Act. Each effect of an action has an offset associated with it. This offset is the time taken by the action to produce the effect after the start of the action; it is zero if the effect is achieved as soon as the action is started, and greater than zero otherwise. Offset(Eff) denotes the offset associated with an effect Eff. Activities are the short plans (recipes) in the plan library of the agent to achieve different composite goals.

B. Goal Representation in P-CLAIM

Goals in P-CLAIM are procedural goals, meaning that the goals of an agent are the tasks that the agent wants to achieve. Some goals are initially given to the agent when the multi-agent system is launched, and some goals are given to the agent during its life, through message passing from other agents or through user interaction. Goals have priorities associated with them. The priority of a goal can be Preemptive High, High or Normal. A goal having Preemptive High priority should be immediately achieved by the agent; we also call such a goal a reactive goal. High priority means that the goal should be achieved before all the Normal priority goals currently present. Normal priority goals are the lowest priority goals. Goals with Preemptive High priority are stored in the Global Reactive Goals (GRG) list, and all other goals of the agent are stored in a priority queue called the Global Proactive Goals (GPG) list.

C. Messages Format

A message received by an agent in P-CLAIM has five parts. The first part is the identity: each message is assigned a unique number as its identity. The second part is the sender, which represents the sender of the message. Thirdly, a message has a priority associated with it; this field has a value among Preemptive High, High and Normal. Fourthly, a message has a proposition, which is the actual content of the message. This proposition can be a new goal to achieve, or information demanded by the agent in an earlier message. Finally, a message has a ResponseTo field which is either blank or contains a number pointing to the identity of an earlier message to which this message is responding.
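As an illustration, the five parts of a message can be captured by a record such as the following (a Python sketch with field names of our choosing):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Message:
        identity: int                       # unique message number
        sender: str                         # name of the sending agent
        priority: str                       # "PreemptiveHigh" | "High" | "Normal"
        proposition: str                    # a goal to achieve, or requested information
        response_to: Optional[int] = None   # identity of the message being answered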
D. Translation of Domain Description

The information needed by the JSHOP2 algorithm to generate the plan includes the initial state information, the goals information and the domain description (methods and operators). In our formalism, this information is automatically extracted from the agent. Initial state information is taken from the knowledge of the agent and from the hierarchical representation of the MAS. The goal for the Planner is a one-to-one mapping from the agent's goal to the Planner's goal. In our framework, only one goal is passed to the JSHOP2 algorithm at a time. The agent's actions are mapped to operators in JSHOP2. A P-CLAIM agent's activities are converted into JSHOP2 methods. For each activity of the agent, an equivalent method is generated with the same name as that of the activity. The activity's condition is mapped to the method's precondition. In JSHOP2, methods have subtasks; subtasks may be primitive tasks or other composite tasks. Equivalently in P-CLAIM, the body of an activity consists of different processes, so we need to convert these processes into JSHOP2 methods and operators. To read about this conversion in detail, we refer to our earlier article [15].

E. Policy File

Each agent maintains a policy file in which it stores the importance, for itself, of all other agents in the MAS. The importance of an agent depends on its position in the hierarchy relative to the position of the agent who is maintaining the policy. Importance also depends on the services provided by that agent during the agent's life cycle. After receiving a message, the agent analyzes the policy file. The importance of an agent can be Normal or High.
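A hedged sketch of the translation mapping is given below. The real translation, detailed in [15], handles full CLAIM processes; here activities and actions are reduced to simple dictionaries of our own design, to show where each piece of information goes.

    def activity_to_method(activity):
        # The method keeps the activity's name; the activity's condition becomes
        # the method's precondition; body processes become the ordered subtasks.
        return {
            "name": activity["name"],
            "pre": activity["condition"],
            "subtasks": activity["body"],
        }

    def action_to_operator(action):
        # Positive effects become the add list, negated effects the delete list;
        # the duration and per-effect offsets are kept for Temporal_Converter.
        return {
            "name": action["name"],
            "pre": action["condition"],
            "add": [e for e in action["effects"] if not e["negated"]],
            "del": [e for e in action["effects"] if e["negated"]],
            "duration": action["duration"],
        }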
                                                                   the policy file is assigned a High priority. It means that agent
                                                                   does not preempt his own goals for the goals assigned by an
                                                                   agent of Normal importance. A goal fetched from a message,
                                                                   sent by an agent having High importance in the policy file is
                                                                   assigned the same priority as of the message. After assigning a
                                                                   priority to the goal, the goal is added to one of the two global
                                                                   goals lists. A goal of priority Preemptive High is added to
                                                                   GRG list and a goal of priority High or Normal is added to
                                                                   GPG list.
                                                                      There is another messages queue maintained inside the
                                                                   agent, called Executor Messages Queue(EMQ). Messages
                                                                   which are sent by the Planner for the execution of actions
                                                                   are put in the EMQ. These are the triggering messages for the
                                                                   actions in the plan generated by the Planner. Number of mes-
                                                                   sages in EMQ are denoted by length(EM Q) and EM Q[i]
                                                                   denotes the ith message in EMQ. Each triggering message in
                                                                   EMQ has a time stamp associated with it. T imeStamp(M sg)
                                                                   denotes the time stamp associated with a triggering message
                                                                   M sg.
                                                                   B. Planner
                                                                      Planner starts when the multi-agent system is launched.
                                                                   Once started, the Planner procedure runs throughout the life
                                                                   cycle of the agent. When there are no goals in either of the
                                                                   goals lists then it sits idle and waits for new goals to arrive
                                                                   and as soon as a new goal arrives, it starts planning. Before
                                                                   starting the Planner, the agent goes through an initialization
                                                                   phase, in which it sets the values of certain global variables.
                                                                      Three states of the world are maintained in the system,
                                                                   SP (Act) which denotes the state of the world anticipated by
                                                                   planner just before the execution of the action Act, secondly
                 Fig. 1.   A Running P-CLAIM Agent                 SW is the current actual state of the world and F inalSP
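The demotion rule described above can be summarized in a few lines. The sketch below is our reading of it in Python; the constants and queue types are illustrative.

    from queue import PriorityQueue

    grg: list = []                        # Global Reactive Goals
    gpg: PriorityQueue = PriorityQueue()  # Global Proactive Goals (High before Normal)

    def assign_priority(msg_priority, sender_importance):
        # A Normal-importance sender can never preempt the agent's own goals:
        # its Preemptive High (and High) requests are demoted to High.
        if sender_importance == "Normal" and msg_priority != "Normal":
            return "High"
        return msg_priority               # High-importance senders keep their priority

    def dispatch_goal(goal, priority):
        if priority == "PreemptiveHigh":
            grg.append(goal)              # reactive goal, treated immediately
        else:
            gpg.put((0 if priority == "High" else 1, goal))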
B. Planner

The Planner starts when the multi-agent system is launched. Once started, the Planner procedure runs throughout the life cycle of the agent. When there are no goals in either of the goals lists, it sits idle and waits for new goals to arrive; as soon as a new goal arrives, it starts planning. Before starting the Planner, the agent goes through an initialization phase, in which it sets the values of certain global variables.

Three states of the world are maintained in the system: SP(Act), which denotes the state of the world anticipated by the planner just before the execution of the action Act; SW, the current actual state of the world; and FinalSP, which denotes the state of the world up to which the Planner has planned so far. More precisely, FinalSP is the state of the world anticipated by the planner after the very last action that the Planner has planned for. In the initialization phase, FinalSP is set equal to SW, Suspension_Signal is set to OFF and Execution_Signal is set to ON.

Algorithm 1 Main_Algorithm
 1: loop
 2:   repeat
 3:     Call Treat_Reactive_Goal
 4:   until GRG = ∅
 5:   if GPG ≠ ∅ then
 6:     Fetch first goal g ∈ GPG
 7:     PPlan ← Compute_Plan(FinalSP, g, D)
 8:     if PPlan ≠ Fail then
 9:       for i = 1 To length(PPlan) do
10:         TimeStamp(ExeMessages[i]) ← TimeStamp(PPlan[i])
11:         ExeMessages[i] ← TriggerMessage(PPlan[i])
12:       end for
13:       Send ExeMessages to EMQ
14:     end if
15:   end if
16: end loop

Algorithm 2 Treat_Reactive_Goal
 1: Fetch first goal g ∈ GRG
 2: Suspension_Signal ← ON
 3: Wait until {Execution_Signal = OFF}
 4: Start_Time ← Current system time
 5: RPlan ← Compute_Plan(SW, g, D)
 6: if RPlan ≠ Fail then
 7:   for i = 1 To length(RPlan) do
 8:     TimeStamp(ExeMessages[i]) ← TimeStamp(RPlan[i])
 9:     ExeMessages[i] ← TriggerMessage(RPlan[i])
10:   end for
11:   End_Time ← Current system time
12:   Duration ← End_Time − Start_Time
13:   for i = 1 To length(EMQ) do
14:     TimeStamp(EMQ[i]) ← TimeStamp(EMQ[i]) + Duration
15:   end for
16:   Send ExeMessages to EMQ
17: end if
18: Suspension_Signal ← OFF

The Main_Algorithm (Algorithm 1) runs in an infinite loop and ensures that reactive goals are immediately planned for and achieved. First it looks at the GRG list, and if it is not empty (Lines 2-4), control moves to the procedure Treat_Reactive_Goal (Algorithm 2). Some of the notations used inside the Treat_Reactive_Goal procedure are as follows: length(RPlan) denotes the number of actions in the plan RPlan; ExeMessages is an array of triggering messages for the actions in the plan; TimeStamp(Act) denotes the time stamp assigned to an action Act for its execution. Treat_Reactive_Goal fetches the first reactive goal g, sets the Suspension_Signal to ON to ask the Executor to suspend the execution, and waits for the Execution_Signal to go OFF, which indicates that the Executor has suspended the execution (Lines 1-3). Then it calls the Compute_Plan procedure to plan for the reactive goal (Line 5). The current state of the world SW, the reactive goal just fetched g, and the domain description D are passed to the Compute_Plan procedure. This procedure call returns a temporal plan RPlan for the reactive goal. Because every action in P-CLAIM is executed using a triggering message, an array ExeMessages is generated containing the triggering messages for all the actions in the temporal plan RPlan, with a TimeStamp associated with every message (Lines 7-10), and this array of messages is sent to the EMQ (Line 17), from where the Executor executes the actions triggered by these messages. But before ExeMessages is sent to the EMQ, the TimeStamp of all the messages currently in the EMQ is updated, because due to the suspension of execution those triggering messages cannot be executed at their intended time. So every message's TimeStamp is increased by the duration of the suspension (Lines 13-15). Suspension_Signal is then set to OFF (Line 18) to allow the Executor to resume execution, and control is passed back to the Main_Algorithm (Algorithm 1), which looks for another goal in the GRG. The Main_Algorithm turns its attention to the proactive goals only when it finds that there is no reactive goal (Line 5). The algorithm fetches the first goal from the GPG (Line 6); High priority goals are always fetched before Normal priority goals. Then the Compute_Plan procedure is called with the parameters FinalSP, g and D. A plan PPlan is returned (Line 7), which is then sent to the EMQ in the form of triggering messages (Lines 9-13). We now elaborate the working of the Compute_Plan procedure (Algorithm 3); many lines of the algorithm are taken from [13].
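The timestamp bookkeeping of Algorithm 2 (Lines 11-15) is worth making concrete: while the Executor is suspended, every triggering message already in the EMQ slips by exactly the time spent handling the reactive goal. A minimal sketch, assuming messages are dictionaries carrying a numeric timestamp field:

    import time

    def shift_timestamps(emq, duration):
        # Every pending triggering message slips by the length of the suspension.
        for msg in emq:
            msg["timestamp"] += duration

    def plan_reactive_goal(emq, compute_plan):
        start = time.monotonic()
        rplan = compute_plan()              # Compute_Plan(SW, g, D) in the paper
        shift_timestamps(emq, time.monotonic() - start)
        return rplan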
Algorithm 3 Compute_Plan(S, G, D)
 1: P ← The Empty Plan
 2: I ← S
 3: LG ← G
 4: LG0 ← {g ∈ LG : no goal is constrained to precede g}
 5: loop
 6:   if LG = ∅ then
 7:     Plan ← Call Temporal_Converter(I, P, D)
 8:     Return Plan
 9:   end if
10:   Nondeterministically choose any g ∈ LG0
11:   if g = Some Primitive Action then
12:     if g = Information Gathering Task then
13:       Generate and send a message with identity x for information retrieval to the other agent
14:       Put all tasks depending on g in the Pending Tasks list and assign them the identity x
15:       Remove g from LG
16:     else
17:       A ← {(a, Θ) : a is a ground instance of an operator in D, Θ is a substitution that unifies {head(a), g}, and S satisfies a’s preconditions}
18:       if A = ∅ then
19:         Return Fail
20:       else
21:         Nondeterministically choose a pair (a, Θ) ∈ A
22:         S ← S + Add(a) − Del(a)
23:         Append a to P
24:         Modify LG by removing g and applying Θ
25:       end if
26:     end if
27:     LG0 ← {g ∈ LG : no other goal is constrained to precede g}
28:   else
29:     M ← {(m, Θ) : m is an instance of a method in D, Θ unifies {head(m), g}, pre(m) is True in S, and m and Θ are as general as possible}
30:     if M = ∅ then
31:       Return Fail
32:     end if
33:     Nondeterministically choose a pair (m, Θ) ∈ M
34:     Modify LG by removing g, adding sub(m), constraining each goal in sub(m) to precede the goals that g preceded, and applying Θ
35:     if sub(m) ≠ ∅ then
36:       LG0 ← {g ∈ sub(m) : no goal in LG precedes g}
37:     else
38:       LG0 ← {g ∈ LG : no goal in LG precedes g}
39:     end if
40:   end if
41:   if New acknowledgement in knowledge then
42:     id ← identity of the message whose acknowledgement has arrived
43:     Fetch all goals associated with message id from the Pending Tasks list and put them in LG
44:   end if
45:   if G is a proactive goal then
46:     repeat
47:       Call Treat_Reactive_Goal
48:     until GRG = ∅
49:   end if
50: end loop

The Compute_Plan procedure is an extension of the JSHOP2[13] algorithm. It takes three parameters S, G and D as input, where S is the initial state, G is a list of goals and D is the agent's domain description. Compute_Plan has an internal goals list called the Local Goals (LG) list. The algorithm chooses a goal g ∈ LG which has no predecessors (Line 4). At this point there are two cases. The first case is that g is a primitive task; then the procedure finds an operator a that matches g and whose preconditions are satisfied in S. It applies the action a to state S and adds it to the plan P (Lines 17, 21-23). If no such operator a exists, then the procedure returns failure (Lines 18-19). In P-CLAIM, a message to another agent is also treated as a primitive action, so g can be a message to another agent for information retrieval. If this is the case, then a message for the retrieval of information is generated with identity x and sent to the other agent, and all the tasks which depend on this information are put in the Pending Tasks list (Lines 11-15). All these tasks are assigned the same identity x as that of the message before being sent to the Pending Tasks list.

The second case is that g is a compound goal, so a method needs to be applied to decompose g into its subtasks. In this case the planner nondeterministically chooses a method instance m matching g that decomposes g into subgoals (Line 29) and applies this method (Lines 33-34). If no such method m exists, then the procedure returns failure (Lines 30-32).

At the end of each planning step, the Compute_Plan procedure looks for any newly arrived acknowledgement for an earlier sent message. If a new acknowledgement for a message with identity id has arrived, then the procedure removes all the tasks depending on id from the Pending Tasks list and puts them in the Local Goals list to process those goals (Lines 41-44).

While planning for a proactive goal, the Compute_Plan procedure checks the GRG for any new goals after each planning step, and whenever it finds a goal in the GRG, it suspends planning for the proactive goal and calls the procedure Treat_Reactive_Goal, which we have explained earlier (Lines 45-49). When the GRG becomes empty, the procedure resumes planning for the proactive goal from the same state at which it had suspended the planning. While planning for a reactive goal, the Compute_Plan procedure does not look at the GRG, because a new reactive goal is treated only when all the previous reactive goals have been treated.

When Compute_Plan finds a plan for one goal, it converts the total order plan into a temporal plan by calling the procedure Temporal_Converter (Algorithm 4). The procedure takes three parameters I, P and D, where I is the initial state, P is the total order plan which is to be converted, and D is the domain description file, which is needed to extract the information about the durations of all the actions and the offsets of all the effects. Some notations used in the procedure are as follows: no_of_literals(I) denotes the number of literals in the initial state, and Literal(I[j]) points to the jth literal in the initial state. Production_Time(Lit) represents the time of achievement of a literal Lit. length(P) returns the number of actions in the plan P. no_of_pre(Act) and no_of_effects(Act) denote the number of preconditions and the number of effects of an action Act respectively, while in the same vein Pre(P[i][j]) and Literal(P[i][j]) denote the jth precondition and the jth effect of the ith action in plan P respectively.

Algorithm 4 Temporal_Converter(I, P, D)
 1: for j = 1 To no_of_literals(I) do
 2:   Production_Time(Literal(I[j])) ← 0
 3: end for
 4: for i = 1 To length(P) do
 5:   TimeStamp(P[i]) ← Max {Production_Time(Pre(P[i][j])) : j = 1 To no_of_pre(P[i])}
 6:   Prereq(P[i]) ← Actions which achieve the preconditions of P[i]
 7:   SP(P[i]) ← World state anticipated before the execution of P[i]
 8:   for j = 1 To no_of_effects(P[i]) do
 9:     Production_Time(Literal(P[i][j])) ← TimeStamp(P[i]) + Offset(Literal(P[i][j]))
10:   end for
11: end for

We have used a simple and efficient technique to convert a plan into a temporal plan. The procedure starts by setting the Production_Time of all the literals in the initial state to 0 (Lines 1-3). Then the procedure loops through all the actions, starting from the first action and going towards the last one, and sets the TimeStamp of each action to the maximum of the Production_Time of all its preconditions, because an action can be executed at the earliest when all of its preconditions have been achieved (Lines 4-5). After setting the TimeStamp of an action, the procedure sets the Production_Time of all the effects of the action. The production time of an effect is the TimeStamp of the action plus the time at which the effect is produced by the action, i.e. the Offset of the effect (Lines 8-10).
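Algorithm 4 translates almost line for line into code. The sketch below is a direct transcription under the assumption that each plan action is a dictionary carrying its preconditions and its (literal, offset) effect pairs; literals never produced are treated as available at time 0.

    def temporal_converter(initial_state, plan):
        # Lines 1-3: literals of the initial state are available at time 0.
        production_time = {lit: 0 for lit in initial_state}
        timestamps = []
        for action in plan:                                   # Lines 4-11
            # An action can start once all its preconditions have been produced.
            start = max((production_time.get(p, 0) for p in action["pre"]), default=0)
            timestamps.append((action["name"], start))
            for literal, offset in action["effects"]:         # Lines 8-10
                production_time[literal] = start + offset
        return timestamps

Because the plan is totally ordered and each action is visited once, the conversion runs in time linear in the total number of preconditions and effects.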
C. Executor

The Executor runs in parallel with the Planner. It waits for triggering messages to come into the EMQ, fetches the messages and executes the actions associated with the messages at their planned time stamps. Every running action sends an acknowledgement just before its termination to the Executor. Algorithm 5 is a simplified version of the Executor.

Algorithm 5 Executor
 1: loop
 2:   if Suspension_Signal = ON then
 3:     Execution_Signal ← OFF
 4:     Wait until {Suspension_Signal = OFF}
 5:     Execution_Signal ← ON
 6:   end if
 7:   if EMQ ≠ ∅ then
 8:     NextActions ← Fetch all next messages from EMQ having the earliest TimeStamp from the current system time
 9:     NextTime ← TimeStamp(NextActions)
10:     Wait for system time to reach NextTime
11:     for i = 1 To length(NextActions) do
12:       if all the actions in Prereq(NextActions[i]) have not sent an acknowledgement for termination then
13:         Wait for all the acknowledgements
14:         Duration ← Time spent waiting for acknowledgements
15:         for i = 1 To length(EMQ) do
16:           TimeStamp(EMQ[i]) ← TimeStamp(EMQ[i]) + Duration
17:         end for
18:       end if
19:       if SP(NextActions[i]) = SW then
20:         Execute NextActions[i] in a separate thread
21:       else
22:         MPlan ← Plan_Mender(SW, SP(NextActions[i]))
23:         Execute MPlan
24:         for i = 1 To length(EMQ) do
25:           TimeStamp(EMQ[i]) ← TimeStamp(EMQ[i]) + TimeSpan(MPlan)
26:         end for
27:         Execute NextActions[i] in a separate thread
28:       end if
29:     end for
30:   end if
31: end loop

The Executor fetches all the next messages from the EMQ that have the closest TimeStamp to the current system time. Then the Executor waits for the system time to reach the TimeStamp of these messages (Lines 10-11). When the system time approaches that time, the Executor checks whether the prerequisite actions of the actions associated with these messages have terminated or not. If they have not terminated, it waits for their termination and increases the TimeStamp of all the messages in the EMQ by the duration of this waiting (Lines 14-20). Then it checks for any discrepancy between the current world state and the one anticipated by the Planner for the execution of these actions. If there is no discrepancy, these actions are executed in a separate thread (Lines 21-22) and the Executor fetches the next messages from the EMQ. But if there is a discrepancy between the two world states, the Executor calls the Plan Mender to generate a plan from the current world state to the intended world state, and executes the returned plan to remove the discrepancy (Lines 24-25). After executing this plan, the Executor is ready to execute the actions which it had suspended due to the discrepancy (Line 29). But before executing these actions, it augments their TimeStamp by the duration of the discrepancy removal.

Moreover, after executing each action, the Executor checks the Suspension_Signal. When Suspension_Signal is set to ON, it turns Execution_Signal to OFF, suspends the execution, and waits for Suspension_Signal to go OFF. The Executor resumes the execution once the Suspension_Signal is turned to OFF. But now the triggering messages for the plan of the reactive goal are at the front of the EMQ, so the Executor first executes the plan for the reactive goal for which it had suspended the execution, and then it resumes the execution of the plan on which it was working before the suspension (Line 29).
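Stripped of threading and signal handling, one Executor step reduces to: wait until the message's TimeStamp, compare the anticipated state with the actual one, and repair before executing if they diverge. A condensed sketch, with all names ours, assuming the EMQ is sorted by timestamp (here taken as absolute clock times), current_state is a callable returning the actual world state, and plan_mender returns (action, start) pairs:

    import time

    def wait_until(t):
        time.sleep(max(0.0, t - time.time()))

    def executor_step(emq, current_state, plan_mender, execute):
        if not emq:
            return
        msg = emq.pop(0)                                  # earliest TimeStamp first
        wait_until(msg["timestamp"])                      # Lines 9-10 of Algorithm 5
        if msg["anticipated_state"] != current_state():   # discrepancy check
            mplan = plan_mender(current_state(), msg["anticipated_state"])
            for name, _start in mplan:                    # repair before continuing
                execute(name)
        execute(msg["action"])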
D. Plan Mender

This procedure is responsible for repairing the plan. It takes as input the current actual world state I and the anticipated world state G. The Plan Mender generates a temporal plan for the agent to reach the anticipated world state starting from the current world state, and returns this plan to the Executor. The Plan Mender uses the classical STRIPS-style planning technique to compute its plan, because now the goal for the planner is a state to be reached (a declarative goal). So the Plan Mender just uses operators from the domain description file to compute the plan; in this case, the activities that were used by the Planner component are not helpful in generating the plan. The basic algorithm used by the Plan Mender is shown in Algorithm 6. The Plan Mender computes a plan without taking into account the durations of the actions, using the SATPLAN planner[16], and then uses the procedure Temporal_Converter to convert the plan into a temporal plan.

Algorithm 6 Plan_Mender(I, G)
 1: Generate a plan P using SATPLAN from I to G, ignoring the durations of actions
 2: TP ← Call Temporal_Converter(I, P, D)
 3: Return TP
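In code form, Algorithm 6 is a two-step composition. The sketch below uses a placeholder classical_plan callable standing in for the SATPLAN call [16] (whose real interface we do not reproduce) and reuses the temporal_converter sketch from Section IV:

    def plan_mender(current_state, anticipated_state, classical_plan):
        # Step 1: a classical, duration-free plan from the actual state to the
        # state the Planner expected (SATPLAN in the paper; interface elided).
        plan = classical_plan(current_state, anticipated_state)
        if plan is None:
            return None
        # Step 2: reuse the converter to attach time stamps to the repair plan.
        return temporal_converter(current_state, plan)

A two-argument version matching the executor_step sketch above can be obtained by fixing classical_plan, e.g. with functools.partial.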
V. EXAMPLE

In this example scenario we have one mobile agent, ROCO, which is a home servant. When the MAS is launched, ROCO has the goal Clean_Table, and ROCO has an activity associated with this goal. All the activities of ROCO are shown in tree form in Figure 2(a) (a rectangle shows a goal, an associated oval is an Activity associated with that goal, and rectangles at the bottom without an associated oval are Actions).

Fig. 2. (a) ROCO Activities. (b) Plan for Clean_Table

The Main_Algorithm fetches the goal Clean_Table from the GPG list and calls Compute_Plan to plan for this goal, which generates a plan consisting of the following actions: Move(Room1, Table), ArrangeBooks, ArrangeCover, Dusting. A short description of the actions in this plan is shown in Figure 3.

Fig. 3. Description of actions in the temporal plan for Clean_Table

The plan is converted into a temporal plan using the procedure Temporal_Converter, and the returned plan is shown in Figure 2(b). In this example, all the effects of all the actions have an offset equal to the duration of the action. Here we explain the conversion of the totally ordered plan into a temporal plan. The procedure starts by assigning the Production_Time of all the literals in the initial state to 0. There is only one literal, At(ROCO, Room1), in the initial state, so Production_Time(At(ROCO, Room1)) is set to 0. Now the procedure takes the first action Move(Room1, Table) and sets its TimeStamp to 0, which is the maximum Production_Time over all of its preconditions. Next, the procedure sets the Production_Time of the effects of Move(Room1, Table). This action has only one effect, At(ROCO, Table). Production_Time(At(ROCO, Table)) is assigned the value TimeStamp(Move(Room1, Table)) plus Offset(At(ROCO, Table)). Putting in the values, we get that Production_Time(At(ROCO, Table)) equals 1 minute, because Offset(At(ROCO, Table)) is equal to the duration of Move(Room1, Table). Now the procedure moves to the second goal, which is ArrangeBooks, and sets its TimeStamp to be the maximum of the Production_Time of all of its preconditions. It has only one precondition, At(ROCO, Table), whose Production_Time has already been calculated as 1 minute, so TimeStamp(ArrangeBooks) is assigned 1 minute. In this way the procedure continues and finds the plan shown in Figure 2(b). The Planner sends the messages for each action of the plan, along with their TimeStamp, to the EMQ for execution, and the Executor starts executing the plan.

When the Executor has executed Move(Room1, Table), ArrangeBooks and ArrangeCover, it finds that Suspension_Signal is set to ON, because the Planner has just fetched a reactive goal Bring_Water from the GRG. The Executor suspends the execution, sets the Execution_Signal to OFF and waits for the Suspension_Signal to go OFF again. It receives the following plan in the EMQ: Move(Table, Kitchen), TakeGlass, FillGlassWithWater, Move(Kitchen, OwnerRoom), Give(Glass, Owner). Now the Executor executes this plan. After the execution of this plan, ROCO is in OwnerRoom. Now the Executor resumes its suspended plan, but before resuming it, it increases the TimeStamp of all the actions in the suspended plan by the TimeSpan of the plan for the goal Bring_Water; then it checks whether the preconditions of the suspended plan hold in the current state. The preconditions of its suspended plan are At(ROCO, Table) ∧ Arranged(Books) ∧ Arranged(Cover), while the current state is Arranged(Books) ∧ Arranged(Cover) ∧ At(ROCO, OwnerRoom). Since At(ROCO, Table) does not hold, the Executor calls the Plan Mender to generate a plan from the current state of the world to the intended state of the world. The Plan Mender returns a plan consisting of only one action, Move(OwnerRoom, Table). The Executor executes this plan, so ROCO moves to the Table. Now again the Executor checks for any discrepancy between the current state and the anticipated state; this time both states are the same, so the Executor executes the suspended plan, i.e. it executes the Dusting action.
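As a self-check, the Clean_Table walkthrough can be replayed through the temporal_converter sketch given earlier. The Move duration of 1 minute comes from the text; the remaining durations are not stated in the paper, so the values below are illustrative only:

    clean_table_plan = [
        {"name": "Move(Room1, Table)", "pre": ["At(ROCO, Room1)"],
         "effects": [("At(ROCO, Table)", 1)]},                    # 1 minute (from text)
        {"name": "ArrangeBooks", "pre": ["At(ROCO, Table)"],
         "effects": [("Arranged(Books)", 2)]},                    # assumed duration
        {"name": "ArrangeCover", "pre": ["At(ROCO, Table)"],
         "effects": [("Arranged(Cover)", 2)]},                    # assumed duration
        {"name": "Dusting",
         "pre": ["At(ROCO, Table)", "Arranged(Books)", "Arranged(Cover)"],
         "effects": [("Clean(Table)", 3)]},                       # assumed duration
    ]
    initial = ["At(ROCO, Room1)"]
    print(temporal_converter(initial, clean_table_plan))
    # [('Move(Room1, Table)', 0), ('ArrangeBooks', 1), ('ArrangeCover', 1),
    #  ('Dusting', 3)]

As in the text, ArrangeBooks is stamped at 1 minute, since its only precondition At(ROCO, Table) is produced at time 1.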
VI. RELATED WORK

In this section, we briefly review some work from the existing literature which is related to ours. Research related to ours includes CYPRESS[17], RETSINA[18], DECAF[19] and the systems proposed in [5] and [6].

In our opinion, the system closest to our research is the CYPRESS system, which also integrates a planning system, SIPE-2[20], with an execution system, PRS[21]. It has the ability to react to unanticipated changes in the environment by replanning, and it also deals with probabilistic planning. Our approach has the added advantage of handling temporal knowledge. Another aspect differentiating P-CLAIM from CYPRESS is the mobility of the agents. In P-CLAIM, the agents are mobile, so the context of an agent changes while moving from one machine to another. The planner component must be able to deal with the changing context because planning is interleaved with execution. An advantage of the CYPRESS system over our proposed system is the way CYPRESS performs replanning. We suspend the execution while computing a plan to remove any discrepancies, while the CYPRESS system uses asynchronous replanning, in which the system continues to execute the unaffected portion of the plan while a planning module computes a new plan.

Our system has many similarities with RETSINA. Like our system, RETSINA also interleaves planning with execution and supports planning for dynamic and real environments. But one main difference between the RETSINA system and ours is that RETSINA plans only by reduction of the top level tasks and does not plan among the top level tasks, whereas our system uses an HTN planner which also plans among the top level tasks, so the plan generated by our system is closer to optimal than in the RETSINA system. Another main difference is that the RETSINA system does not use the existing information from the BDI system, whereas our system proposes a method to use the existing agent's and world's information.

Another framework, DECAF[19], which can be seen as inspired by RETSINA, also relates to our system. But in DECAF the planner only estimates preconditions, selects task templates and instantiates them. It lacks the ability to anticipate future actions.

Like our system, [5] also provides a way to translate the information from a JACK[4] agent into the information needed by the JSHOP[22] planner. The main differences between this approach and ours are that in [5] it is the responsibility of the programmer to specify the points where the planner should be called, while our system plans for each goal, and that our system has the ability to deal with unanticipated changes in the environment, while [5] has no such ability.

Another framework incorporating planning in a BDI language is presented in [6]. It incorporates classical planning into the BDI framework. More precisely, it extends the X-BDI[23] model to use propositional planning algorithms for performing means-end reasoning. Our hypothesis is that our proposed system has the advantage of being more efficient, as the HTN planning technique can find plans more rapidly with the help of the additional domain knowledge provided by the programmer. Another important aspect is the loss of the domain knowledge provided by the programmer in [6]. The advantage of using HTN planning is that the plans can be synthesized according to the intentions of the programmer without losing the domain knowledge.

VII. CONCLUSION AND FUTURE WORK

In this paper, we have presented an extension to the CLAIM language to endow the agents with the capability to plan ahead. This modified and extended language is called P-CLAIM. Agents are able to create temporal plans. Execution monitoring and plan repairing components are added. A balance between deliberation and reactivity has been established, and the agents are able to turn their attention while planning to newly arrived reactive goals. This work can be considered a first step towards a comprehensive temporal planning solution for an Agent Oriented Programming language.

After creating the temporal plan for an agent but before its execution, the plan of the agent should be coordinated with the plans of those agents with which the plan could be in conflict, or whose plans could be helpful for this agent. Our next task is to propose a coordination mechanism to coordinate the temporal plans of different agents. Coordinating the plan of an agent with every other agent in the MAS is very costly, so another important task is to intelligently compute the set of those agents whose plans could be in conflict or could be helpful for the agent, and then coordinate the plan with only those agents.

REFERENCES

[1] R. Bordini, J. Hubner, and R. Vieira, “Jason and the Golden Fleece of agent-oriented programming,” Multiagent Systems, Artificial Societies and Simulated Organizations, vol. 15, p. 3, 2005.
[2] M. Dastani, M. van Riemsdijk, and J. Meyer, “Programming multi-agent systems in 3APL,” Multiagent Systems, Artificial Societies and Simulated Organizations, vol. 15, p. 39, 2005.
[3] M. Dastani and J. Meyer, “A practical agent programming language,” Lecture Notes in Computer Science, vol. 4908, p. 107, 2008.
[4] P. Busetta, R. Ronnquist, A. Hodgson, and A. Lucas, “Jack intelligent agents - components for intelligent agents in Java,” AgentLink News Letter, vol. 2, pp. 2-5, 1999.
[5] L. de Silva and L. Padgham, “Planning on demand in BDI systems,” Proc. of ICAPS-05 (Poster), 2005.
[6] F. Meneguzzi, A. Zorzo, and M. da Costa Mora, “Propositional planning in BDI agents,” in Proceedings of the 2004 ACM Symposium on Applied Computing. ACM, 2004, pp. 58-63.
[7] L. de Silva et al., “First principles planning in BDI systems,” in Proceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems, 2009, pp. 1105-1112.
[8] J. Penberthy and D. Weld, “Temporal planning with continuous change,” in Proceedings of the National Conference on Artificial Intelligence. John Wiley & Sons Ltd., 1995, pp. 1010-1010.
[9] D. Smith and D. Weld, “Temporal planning with mutual exclusion reasoning,” in International Joint Conference on Artificial Intelligence, vol. 16. Lawrence Erlbaum Associates Ltd., 1999, pp. 326-337.
[10] M. Do and S. Kambhampati, “Sapa: A domain-independent heuristic metric temporal planner,” in Proceedings of ECP-01, 2001, pp. 109-120.
[11] M. Ghallab and H. Laruelle, “Representation and control in IxTeT, a temporal planner,” in Proc. 2nd Int. Conf. on AI Planning Systems, 1994, pp. 61-67.
[12] A. El Fallah-Seghrouchni and A. Suna, “An unified framework for programming autonomous, intelligent and mobile agents,” Lecture Notes in Computer Science, Springer, pp. 353-362, 2003.
[13] D. Nau, T. Au, O. Ilghami, U. Kuter, J. Murdock, D. Wu, and F. Yaman, “SHOP2: An HTN planning system,” Journal of Artificial Intelligence Research, vol. 20, no. 1, pp. 379-404, 2003.
[14] S. Sardina and L. Padgham, “Hierarchical planning in BDI agent programming languages: A formal approach,” in Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems. ACM, 2006, pp. 1001-1008.
[15] M. A. Hashmi, “A planning component for CLAIM agents,” in Proceedings of the International Workshop on Multi-Agent Systems Technology and Semantics. IEEE Romania, 2009, to appear.
[16] H. Kautz, B. Selman, and J. Hoffmann, “SatPlan: Planning as satisfiability,” in 5th International Planning Competition, 2006.
[17] D. Myers and L. Wesley, “CYPRESS: Reacting and planning under uncertainty,” in DARPA Proceedings: Rome Laboratory Planning Initiative. Morgan Kaufmann, 1994, p. 111.
[18] M. Paolucci, O. Shehory, K. Sycara, D. Kalp, and A. Pannu, “A planning component for RETSINA agents,” Lecture Notes in Computer Science, Springer, pp. 147-161, 2000.
[19] J. Graham and K. Decker, “Towards a distributed, environment-centered agent framework,” Lecture Notes in Computer Science, Springer, pp. 290-304, 2000.
[20] D. Wilkins, “Can AI planners solve practical problems?” Computational Intelligence, vol. 6, no. 4, pp. 232-246, 1990.
[21] M. Georgeff and A. Lansky, “Procedural knowledge,” Proceedings of the IEEE, Special Issue on Knowledge Representation, vol. 74, no. 10, pp. 1383-1398, 1986.
[22] D. Nau, Y. Cao, A. Lotem, and H. Muñoz-Avila, “SHOP: Simple hierarchical ordered planner,” in Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1999, pp. 968-975.
[23] M. Mora, J. Lopes, R. Viccari, and H. Coelho, “BDI models and systems: Reducing the gap,” in Proc. of ATAL-98, LNCS, vol. 1555. Springer.