Temporal Planning in Dynamic Environments for P-CLAIM Agents

Muhammad Adnan Hashmi and Amal El Fallah Seghrouchni
Laboratoire d'Informatique de Paris 6, Université Pierre et Marie Curie, 75016 Paris, France
Email: Adnan.Hashmi@lip6.fr, Amal.Elfallah@lip6.fr

Abstract—Time and the uncertainty of the environment are very important aspects in the development of real-world applications. Another important issue for real-world agents is the balance between deliberation and reactivity. Yet most agent-oriented programming languages ignore some or all of these aspects. In this paper we try to fill this gap by presenting an extension to the architecture of the CLAIM agent-oriented programming language that endows agents with a planning capability. We remove the assumption that agents' actions are instantaneous, and we address the temporal planning of goals that arrive on the fly. A coherent framework is proposed in which agents are able to generate, monitor and repair their temporal plans, while maintaining a balance between reactivity and deliberation. This work can be considered a first step towards a complete temporal planning solution for an AOP language.

I. INTRODUCTION

Most agent-oriented programming languages in the current literature use a PRS-like approach to achieve the goals of an agent. Some examples of these languages are Jason[1], 3APL[2], 2APL[3] and JACK[4]. But these languages lack the ability to incorporate planning, and executing actions without planning sometimes results in the inability to achieve the goals. There has been some work to incorporate planning within such programming languages [5], [6], [7], but these systems take into account neither the duration of agent actions nor the uncertainty of the environment. They assume that the agents' actions are instantaneous and that the only effects produced on the environment are those produced by the agent's own actions. These assumptions are unrealistic for the development of real-world applications. There are systems like ZENO[8], TGP[9] and SAPA[10] which provide the ability to plan with durative tasks, and systems like IxTeT[11] which provide this ability in dynamic environments. But these are stand-alone planning solutions, not programming languages, so they lack the flexibility and control that a programming language offers to its programmers. Moreover, these systems are built on a purely proactive approach, whereas real-world applications require a balance between proactivity and reactivity: the world is dynamic, the goals of an agent are not necessarily all given at the start, new goals arrive, some old goals are dropped during the life cycle of the agent, and some goals require immediate achievement.

In this work, we try to fill these gaps by incorporating a temporal planner, an executor, an execution monitor and a plan repairing component into a CLAIM agent[12]. We call this extension of the language P-CLAIM. The main problems dealt with in this work are 1) modifications and extensions to the CLAIM agent's architecture to include a temporal planning component, 2) execution monitoring and plan repairing, and 3) creating a balance between deliberation and reactivity.

In our proposed framework, we make use of the Hierarchical Task Network (HTN) planning technique. The main algorithm used to generate a plan for a goal is JSHOP2[13], a very efficient HTN planning system that plans for tasks in the same order that they will later be executed. The main motivation behind using HTN planning is the similarity between the BDI model of agency and the HTN planning technique[14]. Due to these similarities, HTN planning is a natural candidate for incorporating planning into a BDI-like system.

The remainder of this paper is organized as follows. Section 2 sheds light on the current architecture of the CLAIM language and the JSHOP2 planner. Section 3 presents some important representations that are helpful in understanding the agent architecture in P-CLAIM. Our proposed architecture of a P-CLAIM agent with planning, execution and plan repairing components is presented in Section 4. In Section 5, we give an example to describe the working of the system. Section 6 discusses some related work. Section 7 concludes the paper and discusses future directions.

II. BACKGROUND

In this section, we briefly discuss the architecture of the CLAIM language and the JSHOP2 algorithm used to generate a plan. A multi-agent system in CLAIM is a set of distributed hierarchies of agents deployed on computers connected via a network. All the computers have a global clock. With respect to the hierarchical representation, an agent is a node in a hierarchy. It is an autonomous, intelligent and mobile entity. It has a parent and contains (optional) sub-agents, running processes and cognitive elements (e.g. knowledge, goals, capabilities). An agent can dynamically create another agent, and the newly created agent becomes a sub-agent of the creator. In addition, an agent has three mobility primitives: in (enter another agent), out (leave another agent) and move (move from one hierarchy to another).

In the CLAIM language, an agent is defined as follows:

    defineAgent agentName {
        parent=null | agentName ;
        knowledge=null | { (knowledge;)* }
        goals=null | { (goal;)* }
        messages=null | { (message;)* }
        capabilities=null | { (capability;)* }
        processes=null | { (process | )* }
        agents=null | { (agentName;)* }
    }

For a more detailed description of the CLAIM language, we refer to [12].

JSHOP2 is an HTN planning algorithm, and it deals with procedural goals. The domain description required by JSHOP2 consists of methods and operators. A method indicates how to decompose a compound task into partially ordered subtasks; it has three parts: the task for which the method is to be used, the condition which must be true in the current state to apply the method, and the subtasks that need to be accomplished in order to accomplish that task. An operator is similar to the operators in classical planning and tells how to perform a primitive task; it has a condition, a list of add effects and a list of delete effects. Planning proceeds by using the methods to decompose tasks recursively into smaller and smaller subtasks, until the planner reaches primitive tasks that can be performed directly using the planning operators.
The rationale behind choosing JSHOP2 for our work is threefold. Firstly, it is an HTN planner, and the domain information from CLAIM can easily be transformed into the domain information needed by the planner, thanks to the similarities between BDI-like systems and HTN planning systems discussed in [14]. Secondly, JSHOP2 plans for actions in the same order that they will later be executed, so it knows the current state at every planning step. This property can be exploited to interleave planning with execution: at every step the planner can plan using the current state of the world. Thirdly, it can call external user-defined functions to check the precondition of a method or an operator. This property is important for a planning component for CLAIM agents because in the CLAIM language there may be calls to user-defined functions to check the preconditions of capabilities.

III. SOME IMPORTANT REPRESENTATIONS

In this section some important representations are presented which are helpful in understanding the architecture of an agent in P-CLAIM.

A. Domain Representation in P-CLAIM

We have modified the domain representation in CLAIM [12] in order to facilitate the translation to the representation needed by a planner. An agent's capabilities are now divided into actions and activities. Actions are the primitive actions that an agent can perform. Some actions are programmer-defined, while others are already defined in the language, like the mobility primitives in, out and move. The programmer can also override the predefined actions to express his requirements more accurately. An action consists of a condition, a triggering message, the effects and a duration. TriggerMessage(Act) returns the triggering message of an action Act. Each effect of an action has an offset associated with it. This offset is the time taken by the action to produce the effect after the start of the action; it is zero if the effect is achieved as soon as the action starts, and greater than zero otherwise. Offset(Eff) denotes the offset associated with an effect Eff. Activities are the short plans (recipes) in the plan library of the agent for achieving different composite goals.

B. Goal Representation in P-CLAIM

Goals in P-CLAIM are procedural goals, i.e. the goals of an agent are the tasks that the agent wants to achieve. Some goals are given to the agent initially, when the multi-agent system is launched, and some are given to the agent during its life, through message passing by other agents or by user interaction. Goals have priorities associated with them. The priority of a goal can be Preemptive High, High or Normal. A goal having Preemptive High priority must be achieved immediately by the agent; we also call such a goal a reactive goal. High priority means that the goal should be achieved before all Normal priority goals currently present. Normal priority goals are the lowest priority goals. Goals with Preemptive High priority are stored in the Global Reactive Goals (GRG) list, and all other goals of the agent are stored in a priority queue called the Global Proactive Goals (GPG) list.

C. Messages Format

A message received by an agent in P-CLAIM has five parts. The first part is the identity: each message is assigned a unique number as its identity. The second part is the sender, which represents the sender of the message. Thirdly, a message has a priority associated with it; this field has a value among Preemptive High, High and Normal. Fourthly, a message has a proposition, which is the actual content of the message. This proposition can be a new goal to achieve, or information that the agent demanded in an earlier message. Finally, a message has a ResponseTo field, which is either blank or contains a number pointing to the identity of the earlier message to which this message is responding.
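The representations above can be pictured as simple record types. The following Python dataclasses are an illustrative encoding of Section III (the field names are ours, not P-CLAIM syntax); they are reused in the later sketches.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Effect:
        literal: str          # e.g. "At(ROCO, Table)"
        offset: float         # Offset(Eff): delay after the action starts, >= 0

    @dataclass
    class Action:
        name: str
        condition: str                 # precondition formula
        trigger_message: str           # TriggerMessage(Act)
        duration: float
        effects: List[Effect] = field(default_factory=list)

    @dataclass
    class Message:
        identity: int                      # unique message number
        sender: str
        priority: str                      # 'PreemptiveHigh' | 'High' | 'Normal'
        proposition: str                   # a goal, or requested information
        response_to: Optional[int] = None  # identity of the answered message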
D. Translation of Domain Description

The information needed by the JSHOP2 algorithm to generate a plan includes the initial state, the goals and the domain description (methods and operators). In our formalism, this information is automatically extracted from the agent. The initial state information is taken from the knowledge of the agent and from the hierarchical representation of the MAS. The goal for the Planner is a one-to-one mapping from the agent's goal to the Planner's goal; in our framework, only one goal is passed to the JSHOP2 algorithm at a time. The agent's actions are mapped to operators in JSHOP2, and a P-CLAIM agent's activities are converted into JSHOP2 methods. For each activity of the agent, an equivalent method is generated with the same name as the activity, and the activity's condition is mapped to the method's precondition. In JSHOP2, methods have subtasks, which may be primitive tasks or other composite tasks. Equivalently in P-CLAIM, the body of an activity consists of different processes, so we need to convert these processes into JSHOP2 methods and operators. For the details of this conversion, we refer to our earlier article[15].

E. Policy File

Each agent maintains a policy file in which it stores the importance, for itself, of all other agents in the MAS. The importance of an agent depends on its position in the hierarchy relative to the position of the agent maintaining the policy; it also depends on the services provided by that agent during the life cycle of the agent. The importance of an agent can be Normal or High. After receiving a message, the agent consults the policy file.

IV. AGENT ARCHITECTURE

There are four threads running concurrently inside the agent at all times. In the following subsections, we explain these threads in detail. Figure 1 shows the architecture of an agent.

[Fig. 1. A running P-CLAIM agent.]

A. Messages Handler

This thread is always waiting for messages from other agents or from the user. It puts the messages into the Planner Messages Queue (PMQ). These messages are either a request to achieve some goal or responses to some earlier sent message. After being put in the PMQ, these messages are fetched and analyzed. If a message contains information demanded in an earlier message, then this information is added to the knowledge of the agent along with an acknowledgement carrying the identity of the message in which this information was demanded. The agent's treatment of a message that is a request to achieve some goal depends on the priority associated with the message and the importance of the sender.

The Messages Handler fetches the goal attached to a message and assigns a priority to the goal based on the priority associated with the message and the importance of the sender. A goal fetched from a message of priority Preemptive High or High sent by an agent having Normal importance in the policy file is assigned High priority; that is, the agent does not preempt its own goals for goals assigned by an agent of Normal importance. A goal fetched from a message sent by an agent having High importance in the policy file is assigned the same priority as the message. After assigning a priority to the goal, the goal is added to one of the two global goals lists, as sketched below: a goal of priority Preemptive High is added to the GRG list, and a goal of priority High or Normal is added to the GPG list.
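The priority rules of the Messages Handler can be summarised in a few lines. The sketch below is a simplification under assumed data structures: a deque for the GRG, a heap for the GPG (High before Normal), and a policy dictionary mapping each sender to its importance.

    import heapq
    from collections import deque

    GRG = deque()                  # Global Reactive Goals (Preemptive High)
    GPG = []                       # Global Proactive Goals, as a priority heap
    RANK = {'High': 0, 'Normal': 1}

    def handle_goal_message(msg, policy):
        """Assign a priority to the goal in msg and route it (sketch)."""
        if policy.get(msg.sender, 'Normal') == 'High':
            goal_priority = msg.priority   # important sender: keep the priority
        else:
            # a Normal-importance sender never preempts the agent's own goals
            goal_priority = ('High' if msg.priority in ('PreemptiveHigh', 'High')
                             else 'Normal')
        if goal_priority == 'PreemptiveHigh':
            GRG.append(msg.proposition)    # reactive goal
        else:
            heapq.heappush(GPG, (RANK[goal_priority], msg.identity,
                                 msg.proposition))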
There is another message queue maintained inside the agent, called the Executor Messages Queue (EMQ). Messages sent by the Planner for the execution of actions are put in the EMQ; these are the triggering messages for the actions in the plan generated by the Planner. The number of messages in the EMQ is denoted by length(EMQ), and EMQ[i] denotes the ith message in the EMQ. Each triggering message in the EMQ has a time stamp associated with it; TimeStamp(Msg) denotes the time stamp associated with a triggering message Msg.

B. Planner

The Planner starts when the multi-agent system is launched. Once started, the Planner procedure runs throughout the life cycle of the agent. When there are no goals in either of the goals lists, it sits idle and waits for new goals to arrive; as soon as a new goal arrives, it starts planning. Before starting the Planner, the agent goes through an initialization phase in which it sets the values of certain global variables. Three states of the world are maintained in the system: SP(Act), which denotes the state of the world anticipated by the Planner just before the execution of action Act; SW, the current actual state of the world; and FinalSP, the state of the world up to which the Planner has planned so far. More precisely, FinalSP is the state of the world anticipated by the Planner after the very last action it has planned for. In the initialization phase, FinalSP is set equal to SW, Suspension_Signal is set to OFF, and Execution_Signal is set to ON.

Algorithm 1 Main_Algorithm
1: loop
2:   repeat
3:     Call Treat_Reactive_Goal
4:   until GRG = ∅
5:   if GPG ≠ ∅ then
6:     Fetch first goal g ∈ GPG
7:     PPlan ← Compute_Plan(FinalSP, g, D)
8:     if PPlan ≠ Fail then
9:       for i = 1 To length(PPlan) do
10:        TimeStamp(ExeMessages[i]) ← TimeStamp(PPlan[i])
11:        ExeMessages[i] ← TriggerMessage(PPlan[i])
12:      end for
13:      Send ExeMessages to EMQ
14:    end if
15:  end if
16: end loop

Algorithm 2 Treat_Reactive_Goal
1: Fetch first goal g ∈ GRG
2: Suspension_Signal ← ON
3: Wait until {Execution_Signal = OFF}
4: Start_Time ← Current system time
5: RPlan ← Compute_Plan(SW, g, D)
6: if RPlan ≠ Fail then
7:   for i = 1 To length(RPlan) do
8:     TimeStamp(ExeMessages[i]) ← TimeStamp(RPlan[i])
9:     ExeMessages[i] ← TriggerMessage(RPlan[i])
10:  end for
11:  End_Time ← Current system time
12:  Duration ← End_Time − Start_Time
13:  for i = 1 To length(EMQ) do
14:    TimeStamp(EMQ[i]) ← TimeStamp(EMQ[i]) + Duration
15:  end for
16:  Send ExeMessages to EMQ
17: end if
18: Suspension_Signal ← OFF

The Main_Algorithm (Algorithm 1) runs in an infinite loop and ensures that reactive goals are immediately planned for and achieved. First it looks at the GRG list, and if it is not empty (Lines 2-4), control moves to the procedure Treat_Reactive_Goal (Algorithm 2). Some of the notation used inside the Treat_Reactive_Goal procedure is as follows: length(RPlan) denotes the number of actions in the plan RPlan; ExeMessages is an array of triggering messages for the actions in the plan; TimeStamp(Act) denotes the time stamp assigned to an action Act for its execution.

Treat_Reactive_Goal fetches the first reactive goal g, sets the Suspension_Signal to ON to ask the Executor to suspend execution, and waits for the Execution_Signal to go OFF, which indicates that the Executor has suspended execution (Lines 1-3). It then calls the Compute_Plan procedure to plan for the reactive goal (Line 5): the current state of the world SW, the reactive goal g just fetched, and the domain description D are passed to Compute_Plan, which returns a temporal plan RPlan for the reactive goal. Because every action in P-CLAIM is executed using a triggering message, an array ExeMessages is generated containing the triggering messages for all the actions in the temporal plan RPlan, with a TimeStamp associated with every message (Lines 7-10), and this array of messages is sent to the EMQ (Line 16), from where the Executor executes the actions triggered by these messages. Before sending ExeMessages to the EMQ, however, the TimeStamp of all the messages currently in the EMQ is updated: due to the suspension of execution, those triggering messages cannot be executed at their intended time, so every message's TimeStamp is increased by the duration of the suspension (Lines 13-15). The Suspension_Signal is then set to OFF (Line 18) to allow the Executor to resume execution, and control passes back to the Main_Algorithm, which looks for another goal in the GRG.
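The timestamp shift of Lines 13-15 is the essential bookkeeping: while the Executor is suspended, every trigger message already queued in the EMQ slips by exactly the time spent planning. A minimal sketch, assuming EMQ entries are mutable [timestamp, message] pairs:

    import time

    def plan_reactive_goal(goal, emq, compute_plan, sw, domain):
        """Plan a reactive goal from the actual state SW and delay the EMQ
        (sketch of Algorithm 2, lines 4-15)."""
        start = time.time()
        rplan = compute_plan(sw, goal, domain)
        duration = time.time() - start
        if rplan is None:
            return None
        for entry in emq:            # execution was suspended while planning,
            entry[0] += duration     # so delay every queued trigger message
        return rplan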
The Main_Algorithm turns its attention to the proactive goals only when it finds that there is no reactive goal (Line 5). It fetches the first goal from the GPG (Line 6); High priority goals are always fetched before Normal priority goals. The Compute_Plan procedure is then called with the parameters FinalSP, g and D. A plan PPlan is returned (Line 7), which is then sent to the EMQ in the form of triggering messages (Lines 9-13). We now elaborate the working of the Compute_Plan procedure (Algorithm 3); many lines of the algorithm are taken from [13].

Algorithm 3 Compute_Plan(S, G, D)
1: P ← The Empty Plan
2: I ← S
3: LG ← G
4: LG₀ ← {g ∈ LG : no goal is constrained to precede g}
5: loop
6:   if LG = ∅ then
7:     Plan ← Call Temporal_Converter(I, P, D)
8:     Return Plan
9:   end if
10:  Nondeterministically choose any g ∈ LG₀
11:  if g = Some Primitive Action then
12:    if g = Information Gathering Task then
13:      Generate and send message with identity x, for information retrieval, to the other agent
14:      Put all tasks depending on g in the Pending Tasks list and assign them the identity x
15:      Remove g from LG
16:    else
17:      A ← {(a, Θ) : a is a ground instance of an operator in D, Θ is a substitution that unifies {head(a), g}, and S satisfies a's preconditions}
18:      if A = ∅ then
19:        Return Fail
20:      else
21:        Nondeterministically choose a pair (a, Θ) ∈ A
22:        S ← S + Add(a) − Del(a)
23:        Append a to P
24:        Modify LG by removing g and applying Θ
25:      end if
26:    end if
27:    LG₀ ← {g ∈ LG : no other goal is constrained to precede g}
28:  else
29:    M ← {(m, Θ) : m is an instance of a method in D, Θ unifies {head(m), g}, pre(m) is True in S, and m and Θ are as general as possible}
30:    if M = ∅ then
31:      Return Fail
32:    end if
33:    Nondeterministically choose a pair (m, Θ) ∈ M
34:    Modify LG by removing g, adding sub(m), constraining each goal in sub(m) to precede the goals that g preceded, and applying Θ
35:    if sub(m) ≠ ∅ then
36:      LG₀ ← {g ∈ sub(m) : no goal in LG precedes g}
37:    else
38:      LG₀ ← {g ∈ LG : no goal in LG precedes g}
39:    end if
40:  end if
41:  if New acknowledgement in knowledge then
42:    id ← identity of the message whose acknowledgement has arrived
43:    Fetch all goals associated with message id from the Pending Tasks list and put them in LG
44:  end if
45:  if G is a proactive goal then
46:    repeat
47:      Call Treat_Reactive_Goal
48:    until GRG = ∅
49:  end if
50: end loop

Compute_Plan is an extension of the JSHOP2[13] algorithm. It takes three parameters S, G and D as input, where S is the initial state, G is a list of goals and D is the agent's domain description. Compute_Plan has an internal goals list called the Local Goals (LG) list. The algorithm chooses a goal g ∈ LG which has no predecessors (Line 4). At this point there are two cases. The first case is that g is a primitive task; the procedure then finds an operator a that matches g and whose preconditions are satisfied in S, applies the action a to the state S, and adds it to the plan P (Lines 17, 21-23). If no such operator a exists, the procedure returns failure (Lines 18-19). In P-CLAIM, a message to another agent is also treated as a primitive action, so g can be a message to another agent for information retrieval. In this case, a message for the retrieval of the information is generated with identity x and sent to the other agent, and all the tasks which depend on this information are put in the Pending Tasks list (Lines 11-15); these tasks are assigned the same identity x as the message before being sent to the Pending Tasks list. The second case is that g is a compound goal, so a method needs to be applied to decompose g into its subtasks. In this case the planner nondeterministically chooses a method instance m matching g, which decomposes g into sub-goals (Line 29), and applies this method (Lines 33-34). If no such method m exists, the procedure returns failure (Lines 30-32).

At the end of each planning step, Compute_Plan looks for any newly arrived acknowledgement of an earlier sent message. If a new acknowledgement for a message with identity id has arrived, the procedure removes all the tasks depending on id from the Pending Tasks list and puts them in the Local Goals list for processing (Lines 41-44).

While planning for a proactive goal, Compute_Plan checks the GRG for new goals after each planning step, and whenever it finds a goal in the GRG, it suspends planning for the proactive goal and calls the procedure Treat_Reactive_Goal, explained earlier (Lines 45-49). When the GRG becomes empty, the procedure resumes planning for the proactive goal from the same state at which it suspended planning. While planning for a reactive goal, Compute_Plan does not look at the GRG, because a new reactive goal is treated only when all previous reactive goals have been treated.
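The two per-step checks of Algorithm 3 (Lines 41-44 and 45-49) amount to a small hook executed after every decomposition step. The sketch below assumes a hypothetical planner object whose attribute names (local_goals, pending_tasks, grg, and so on) are ours, not part of P-CLAIM:

    def planning_step_hook(planner):
        """Run after each decomposition step of Compute_Plan (sketch)."""
        # Lines 41-44: release tasks whose requested information has arrived.
        for ack_id in planner.new_acknowledgements():
            planner.local_goals.extend(planner.pending_tasks.pop(ack_id, []))
        # Lines 45-49: only proactive planning yields to new reactive goals;
        # each call plans and dispatches one goal from the GRG.
        if planner.current_goal_is_proactive:
            while planner.grg:
                planner.treat_reactive_goal()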
When Compute_Plan finds a plan for a goal, it converts the total-order plan into a temporal plan by calling the procedure Temporal_Converter (Algorithm 4). The procedure takes three parameters I, P and D, where I is the initial state, P is the total-order plan to be converted, and D is the domain description file, needed to extract the durations of all the actions and the offsets of all the effects. Some notation used in the procedure is as follows: no_of_literals(I) denotes the number of literals in the initial state and Literal(I[j]) points to the jth literal in the initial state; Production_Time(Lit) represents the time of achievement of a literal Lit; length(P) returns the number of actions in the plan P; no_of_pre(Act) and no_of_effects(Act) denote the number of preconditions and the number of effects of an action Act respectively, while in the same vein Pre(P[i][j]) and Literal(P[i][j]) denote the jth precondition and the jth effect of the ith action in plan P respectively.

We use a simple and efficient technique to convert a plan into a temporal plan. The procedure starts by setting the Production_Time of all the literals in the initial state to 0 (Lines 1-3). Then the procedure loops through all the actions, from the first towards the last, and sets the TimeStamp of each action to the maximum of the Production_Time of all its preconditions, because an action can be executed at the earliest once all of its preconditions have been achieved (Lines 4-5). After setting the TimeStamp of an action, the procedure sets the Production_Time of all the effects of the action: the production time of an effect is the TimeStamp of the action plus the time at which the effect is produced by the action, i.e. the Offset of the effect (Lines 8-10).

Algorithm 4 Temporal_Converter(I, P, D)
1: for j = 1 To no_of_literals(I) do
2:   Production_Time(Literal(I[j])) ← 0
3: end for
4: for i = 1 To length(P) do
5:   TimeStamp(P[i]) ← Max {Production_Time(Pre(P[i][j])) : j = 1 To no_of_pre(P[i])}
6:   Prereq(P[i]) ← Actions which achieve the preconditions of P[i]
7:   SP(P[i]) ← World state anticipated before the execution of P[i]
8:   for j = 1 To no_of_effects(P[i]) do
9:     Production_Time(Literal(P[i][j])) ← TimeStamp(P[i]) + Offset(Literal(P[i][j]))
10:  end for
11: end for
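Algorithm 4 translates almost directly into code. The sketch below reuses the Action/Effect encoding given in Section III and assumes a helper preconditions(action) returning the literals an action requires; it also assumes that, in a consistent plan, every precondition is either in the initial state or produced by an earlier action.

    def temporal_converter(initial_state, plan, preconditions):
        """Turn a total-order plan into (start_time, action) pairs (sketch)."""
        production_time = {lit: 0.0 for lit in initial_state}   # lines 1-3
        schedule = []
        for action in plan:                                     # lines 4-10
            # an action can start at the earliest once all of its
            # preconditions have been produced (lines 4-5)
            start = max((production_time[p] for p in preconditions(action)),
                        default=0.0)
            schedule.append((start, action))
            for eff in action.effects:                          # lines 8-10
                production_time[eff.literal] = start + eff.offset
        return schedule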
C. Executor

The Executor runs in parallel with the Planner. It waits for triggering messages to arrive in the EMQ, fetches the messages, and executes the actions associated with them at their planned time stamps. Every running action sends an acknowledgement to the Executor just before its termination. Algorithm 5 is a simplified version of the Executor.

Algorithm 5 Executor
1: loop
2:   if Suspension_Signal = ON then
3:     Execution_Signal ← OFF
4:     Wait until {Suspension_Signal = OFF}
5:     Execution_Signal ← ON
6:   end if
7:   if EMQ ≠ ∅ then
8:     NextActions ← Fetch all next messages from EMQ having the earliest TimeStamp from current system time
9:     NextTime ← TimeStamp(NextActions)
10:    Wait for system time to reach NextTime
11:    for i = 1 To length(NextActions) do
12:      if all the actions in Prereq(NextActions[i]) have not sent acknowledgement of termination then
13:        Wait for all the acknowledgements
14:        Duration ← Time spent waiting for acknowledgements
15:        for j = 1 To length(EMQ) do
16:          TimeStamp(EMQ[j]) ← TimeStamp(EMQ[j]) + Duration
17:        end for
18:      end if
19:      if SP(NextActions[i]) = SW then
20:        Execute NextActions[i] in a separate thread
21:      else
22:        MPlan ← Plan_Mender(SW, SP(NextActions[i]))
23:        Execute MPlan
24:        for j = 1 To length(EMQ) do
25:          TimeStamp(EMQ[j]) ← TimeStamp(EMQ[j]) + TimeSpan(MPlan)
26:        end for
27:        Execute NextActions[i] in a separate thread
28:      end if
29:    end for
30:  end if
31: end loop

The Executor fetches from the EMQ all the next messages whose TimeStamp is closest to the current system time, and waits for the system time to reach the TimeStamp of these messages (Lines 8-10). When the system time reaches that time, the Executor checks whether the prerequisite actions of the actions associated with these messages have terminated. If they have not, it waits for their termination and increases the TimeStamp of all the messages in the EMQ by the duration of this waiting (Lines 12-18). It then checks for any discrepancy between the current world state and the one anticipated by the Planner for the execution of these actions. If there is no discrepancy, the actions are executed in a separate thread (Lines 19-20) and the Executor fetches the next messages from the EMQ. But if there is a discrepancy between the two world states, the Executor calls the Plan Mender to generate a plan from the current world state to the intended world state and executes the returned plan to remove the discrepancy (Lines 22-23). After executing this plan, the Executor is ready to execute the actions it had suspended due to the discrepancy (Line 27); but before executing them, it augments their TimeStamp by the duration of the discrepancy removal (Lines 24-26).

Moreover, after executing each action, the Executor checks the Suspension_Signal. When the Suspension_Signal is set to ON, it turns the Execution_Signal to OFF, suspends execution, and waits for the Suspension_Signal to go OFF; it resumes execution once the Suspension_Signal is turned OFF (Lines 2-6). But at that point the triggering messages for the plan of the reactive goal are at the front of the EMQ, so the Executor first executes the plan for the reactive goal for which it suspended execution, and then resumes the execution of the plan on which it was working before the suspension.

D. Plan Mender

This procedure is responsible for repairing the plan. It takes as input the current actual world state I and the anticipated world state G. The Plan Mender generates a temporal plan for the agent to reach the anticipated world state starting from the current world state and returns this plan to the Executor. The Plan Mender uses a classical STRIPS-style planning technique to compute its plan, because here the goal for the planner is a state to be reached (a declarative goal). The Plan Mender therefore uses only the operators from the domain description file to compute the plan; the activities used by the Planner component are not helpful in generating this plan. The basic algorithm used by the Plan Mender is shown in Algorithm 6: it computes a plan without taking into account the durations of the actions, using the SATPLAN planner[16], and then uses the procedure Temporal_Converter to convert the plan into a temporal plan.

Algorithm 6 Plan_Mender(I, G)
1: Generate a plan P using SATPLAN from I to G, ignoring the durations of actions
2: TP ← Call Temporal_Converter(I, P, D)
3: Return TP
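The interplay between the Executor's discrepancy check and the Plan Mender can be condensed as follows. The function plan_from_states stands in for the SATPLAN call of Algorithm 6 (state-to-state, durations ignored) and is assumed, as is a set-of-literals representation of states; temporal_converter is the sketch given earlier.

    def execute_with_repair(next_action, anticipated_state, current_state,
                            plan_from_states, preconditions, execute):
        """Sketch of Algorithm 5, lines 19-28, together with Algorithm 6."""
        if anticipated_state != current_state:
            # repair: the declarative goal is to reach the anticipated state
            mplan = plan_from_states(current_state, anticipated_state)
            mschedule = temporal_converter(current_state, mplan, preconditions)
            for _, action in mschedule:
                execute(action)
            # (the full Executor also shifts every EMQ timestamp by the
            #  time span of this mending plan, lines 24-26)
        execute(next_action)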
V. EXAMPLE

In this example scenario we have one mobile agent, ROCO, which is a home servant. When the MAS is launched, ROCO has the goal Clean_Table and an activity associated with this goal. All the activities of ROCO are shown in tree form in Figure 2(a), where a rectangle shows a goal, an associated oval shows the activity associated with that goal, and rectangles at the bottom without an associated oval are actions. The Main_Algorithm fetches the goal Clean_Table from the GPG list and calls Compute_Plan to plan for this goal, which generates a plan consisting of the following actions: Move(Room1, Table), ArrangeBooks, ArrangeCover, Dusting. A short description of the actions in this plan is given in Figure 3.

[Fig. 2. (a) ROCO activities. (b) Plan for Clean_Table.]

[Fig. 3. Description of the actions in the temporal plan for Clean_Table.]

The plan is converted into a temporal plan using the procedure Temporal_Converter; the resulting plan is shown in Figure 2(b). In this example, every effect of every action has an offset equal to the duration of the action. We now walk through the conversion of the totally ordered plan into a temporal plan. The procedure starts by assigning a Production_Time of 0 to all the literals in the initial state. There is only one literal, At(ROCO, Room1), in the initial state, so Production_Time(At(ROCO, Room1)) is set to 0. The procedure then takes the first action, Move(Room1, Table), and sets its TimeStamp to 0, the maximum Production_Time over all of its preconditions. Next, the procedure sets the Production_Time of the effects of Move(Room1, Table). This action has only one effect, At(ROCO, Table), whose Production_Time is assigned the value TimeStamp(Move(Room1, Table)) plus Offset(At(ROCO, Table)). Putting in the values, Production_Time(At(ROCO, Table)) equals 1 minute, because Offset(At(ROCO, Table)) is equal to the duration of Move(Room1, Table). The procedure then moves to the second goal, Arrange_Books, and sets its TimeStamp to the maximum Production_Time over all of its preconditions. It has only one precondition, At(ROCO, Table), whose Production_Time has already been calculated as 1 minute, so TimeStamp(Arrange_Books) is assigned 1 minute. In this way the procedure continues and finds the plan shown in Figure 2(b).

The Planner sends the messages for each action of the plan, along with their TimeStamp, to the EMQ for execution, and the Executor starts executing the plan. When the Executor has executed Move(Room1, Table), ArrangeBooks and ArrangeCover, it finds that the Suspension_Signal is set to ON, because the Planner has just fetched a reactive goal, Bring_Water, from the GRG. The Executor suspends execution, sets the Execution_Signal to OFF, and waits for the Suspension_Signal to go OFF again. It then receives the following plan in the EMQ: Move(Table, Kitchen), TakeGlass, FillGlassWithWater, Move(Kitchen, OwnerRoom), Give(Glass, Owner). The Executor executes this plan, after which ROCO is in OwnerRoom.

Now the Executor resumes its suspended plan. Before resuming it, it increases the TimeStamp of all the actions in the suspended plan by the TimeSpan of the plan for the goal Bring_Water, and then checks whether the preconditions of the suspended plan hold in the current state. The preconditions of the suspended plan are At(ROCO, Table) ∧ Arranged(Books) ∧ Arranged(Cover), while the current state is Arranged(Books) ∧ Arranged(Cover) ∧ At(ROCO, OwnerRoom). The Executor therefore calls the Plan Mender to generate a plan from the current state of the world to the intended state. The Plan Mender returns a plan consisting of a single action, Move(OwnerRoom, Table); the Executor executes this plan, so ROCO moves to the Table. The Executor again checks for any discrepancy between the current state and the anticipated state; the two states are now the same, so the Executor executes the suspended plan, i.e. it executes the Dusting action.
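The temporal conversion walked through at the start of this example can be reproduced with the temporal_converter sketch from Section IV-B. The 1-minute duration of Move comes from the example; the remaining durations are invented stand-ins for the Figure 3 values, so only the shape of the schedule, not the exact numbers, should be read into the output.

    move  = Action('Move(Room1,Table)', 'At(ROCO,Room1)', 'trig-move', 1.0,
                   [Effect('At(ROCO,Table)', 1.0)])
    books = Action('ArrangeBooks', 'At(ROCO,Table)', 'trig-books', 2.0,
                   [Effect('Arranged(Books)', 2.0)])
    cover = Action('ArrangeCover', 'At(ROCO,Table)', 'trig-cover', 2.0,
                   [Effect('Arranged(Cover)', 2.0)])
    dust  = Action('Dusting',
                   'At(ROCO,Table) & Arranged(Books) & Arranged(Cover)',
                   'trig-dust', 1.0, [Effect('Clean(Table)', 1.0)])

    PRE = {move.name:  ['At(ROCO,Room1)'],
           books.name: ['At(ROCO,Table)'],
           cover.name: ['At(ROCO,Table)'],
           dust.name:  ['At(ROCO,Table)', 'Arranged(Books)',
                        'Arranged(Cover)']}

    schedule = temporal_converter(['At(ROCO,Room1)'],
                                  [move, books, cover, dust],
                                  lambda a: PRE[a.name])
    # -> Move starts at t=0; ArrangeBooks and ArrangeCover both start at t=1
    #    (they can overlap); Dusting starts at t=3, after both effects hold.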
VI. RELATED WORK

In this section, we briefly review work from the existing literature that is related to ours: CYPRESS[17], RETSINA[18], DECAF[19], and the systems proposed in [5] and [6].

In our opinion, the system closest to our research is the CYPRESS system, which also integrates a planning system, SIPE-2[20], with an execution system, PRS[21]. It has the ability to react to unanticipated changes in the environment by replanning, and it also deals with probabilistic planning. Our approach has the added advantage of handling temporal knowledge. Another aspect differentiating P-CLAIM from CYPRESS is the mobility of the agents. In P-CLAIM, the agents are mobile, so the context of an agent changes while moving from one machine to another; the planner component must be able to deal with this changing context because planning is interleaved with execution. An advantage of the CYPRESS system over ours lies in the way CYPRESS performs replanning: we suspend execution while computing a plan to remove any discrepancies, whereas CYPRESS uses asynchronous replanning, in which the system continues to execute the unaffected portion of the plan while a planning module computes a new plan.

Our system has many similarities with RETSINA. Like our system, RETSINA also interleaves planning with execution and supports planning for dynamic, real environments. But one main difference is that RETSINA plans only by reduction of the top-level task and does not plan among the top-level tasks, whereas our system uses an HTN planner which also plans among the top-level tasks, so the plan generated by our system is closer to optimal than RETSINA's. Another main difference is that RETSINA does not use the existing information from the BDI system, whereas our system proposes a method to use the existing agent's and world's information.

Another framework, DECAF[19], which can be seen as an inspiration of RETSINA, also relates to our system. But in DECAF, the planner only estimates preconditions, selects task templates and instantiates them; it lacks the ability to anticipate future actions.
Like our system, [5] also provides a way to translate the information from a JACK[4] agent into the information needed by the JSHOP[22] planner. The main differences between that approach and ours are that in [5] it is the responsibility of the programmer to specify the points where the planner should be called, whereas our system plans for each goal, and that our system has the ability to deal with unanticipated changes in the environment, while [5] has no such ability.

Another framework incorporating planning into a BDI language is presented in [6]. It incorporates classical planning into the BDI framework; more precisely, it extends the X-BDI[23] model to use propositional planning algorithms for performing means-end reasoning. Our hypothesis is that our proposed system has the advantage of being more efficient, as the HTN planning technique can find plans more rapidly with the help of the additional domain knowledge provided by the programmer. Another important aspect is the loss of the programmer-provided domain knowledge in [6]: the advantage of using HTN planning is that plans can be synthesized according to the intentions of the programmer without losing the domain knowledge.

VII. CONCLUSION AND FUTURE WORK

In this paper, we have presented an extension to the CLAIM language that endows agents with the capability to plan ahead. This modified and extended language is called P-CLAIM. Agents are able to create temporal plans, and execution monitoring and plan repairing components have been added. A balance between deliberation and reactivity has been established, and agents are able to turn their attention while planning to newly arrived reactive goals. This work can be considered a first step towards a comprehensive temporal planning solution for an Agent Oriented Programming language.

After creating the temporal plan for an agent, but before its execution, the plan should be coordinated with the plans of those agents with which it could be in conflict or whose plans could be helpful to this agent. Our next task is to propose a coordination mechanism to coordinate the temporal plans of different agents.
Coordinating the plan of an agent with every other agent in the MAS is very costly, so another important task is to intelligently compute the set of those agents whose plans could be in conflict with, or helpful to, the agent's plan, and then coordinate the plan with only those agents.

REFERENCES

[1] R. Bordini, J. Hubner, and R. Vieira, "Jason and the Golden Fleece of agent-oriented programming," Multiagent Systems, Artificial Societies, and Simulated Organizations, vol. 15, p. 3, 2005.
[2] M. Dastani, M. van Riemsdijk, and J. Meyer, "Programming multi-agent systems in 3APL," Multiagent Systems, Artificial Societies, and Simulated Organizations, vol. 15, p. 39, 2005.
[3] M. Dastani and J. Meyer, "A practical agent programming language," Lecture Notes in Computer Science, vol. 4908, p. 107, 2008.
[4] P. Busetta, R. Ronnquist, A. Hodgson, and A. Lucas, "JACK intelligent agents—components for intelligent agents in Java," AgentLink News Letter, vol. 2, pp. 2–5, 1999.
[5] L. de Silva and L. Padgham, "Planning on demand in BDI systems," in Proc. of ICAPS-05 (Poster), 2005.
[6] F. Meneguzzi, A. Zorzo, and M. da Costa Mora, "Propositional planning in BDI agents," in Proceedings of the 2004 ACM Symposium on Applied Computing. ACM, 2004, pp. 58–63.
[7] L. de Silva et al., "First principles planning in BDI systems," in Proceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems, 2009, pp. 1105–1112.
[8] J. Penberthy and D. Weld, "Temporal planning with continuous change," in Proceedings of the National Conference on Artificial Intelligence. John Wiley & Sons Ltd., 1995, p. 1010.
[9] D. Smith and D. Weld, "Temporal planning with mutual exclusion reasoning," in International Joint Conference on Artificial Intelligence, vol. 16. Lawrence Erlbaum Associates Ltd., 1999, pp. 326–337.
[10] M. Do and S. Kambhampati, "Sapa: A domain-independent heuristic metric temporal planner," in Proceedings of ECP-01, 2001, pp. 109–120.
[11] M. Ghallab and H. Laruelle, "Representation and control in IxTeT, a temporal planner," in Proc. 2nd Int. Conf. on AI Planning Systems, 1994, pp. 61–67.
[12] A. El Fallah-Seghrouchni and A. Suna, "An unified framework for programming autonomous, intelligent and mobile agents," Lecture Notes in Computer Science. Springer, 2003, pp. 353–362.
[13] D. Nau, T. Au, O. Ilghami, U. Kuter, J. Murdock, D. Wu, and F. Yaman, "SHOP2: An HTN planning system," Journal of Artificial Intelligence Research, vol. 20, no. 1, pp. 379–404, 2003.
[14] S. Sardina and L. Padgham, "Hierarchical planning in BDI agent programming languages: A formal approach," in Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems. ACM, 2006, pp. 1001–1008.
[15] M. A. Hashmi, "A planning component for CLAIM agents," in Proceedings of the International Workshop on Multi-Agent Systems Technology and Semantics. IEEE Romania, 2009, to appear.
[16] H. Kautz, B. Selman, and J. Hoffmann, "SatPlan: Planning as satisfiability," in 5th International Planning Competition, 2006.
[17] D. Myers and L. Wesley, "CYPRESS: Reacting and planning under uncertainty," in DARPA Proceedings: Rome Laboratory Planning Initiative. Morgan Kaufmann, 1994, p. 111.
[18] M. Paolucci, O. Shehory, K. Sycara, D. Kalp, and A. Pannu, "A planning component for RETSINA agents," Lecture Notes in Computer Science. Springer, 2000, pp. 147–161.
[19] J. Graham and K. Decker, "Towards a distributed, environment-centered agent framework," Lecture Notes in Computer Science. Springer, 2000, pp. 290–304.
[20] D. Wilkins, "Can AI planners solve practical problems?" Computational Intelligence, vol. 6, no. 4, pp. 232–246, 1990.
[21] M. Georgeff and A. Lansky, "Procedural knowledge," Proceedings of the IEEE, Special Issue on Knowledge Representation, vol. 74, no. 10, pp. 1383–1398, 1986.
[22] D. Nau, Y. Cao, A. Lotem, and H. Muñoz-Avila, "SHOP: Simple hierarchical ordered planner," in Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1999, pp. 968–975.
[23] M. Mora, J. Lopes, R. Viccari, and H. Coelho, "BDI models and systems: Reducing the gap," in Proc. of ATAL-98, LNCS, vol. 1555. Springer.