=Paper=
{{Paper
|id=Vol-2970/aspocppaper5
|storemode=property
|title=Modelling Human Mental-States in an Action Language following the Theory of Planned Behavior
|pdfUrl=https://ceur-ws.org/Vol-2970/aspocppaper5.pdf
|volume=Vol-2970
|authors=Andreas Brännström,Juan Carlos Nieves
|dblpUrl=https://dblp.org/rec/conf/iclp/BrannstromN21
}}
==Modelling Human Mental-States in an Action Language following the Theory of Planned Behavior==
Andreas Brännström, Juan Carlos Nieves

Umeå University, Department of Computing Science, SE-901 87, Umeå, Sweden

===Abstract===
This paper introduces an action language for modelling the causality between a human’s motivational beliefs and behavior in human activities. The action language is based on a psychological theory, the theory of planned behavior (TPB), which centers on three sets of beliefs that shape an individual’s behavioral intentions: attitude, subjective norms, and perceived behavioral control. The language is modelled as a transition system in which states correspond to different mental states of an individual, and actions correspond to ways of transitioning between mental states in order to influence human intentions. We introduce its syntax and semantics. The semantics is characterized in terms of answer sets.

Keywords: Human-aware planning, Action reasoning, Answer set programming, Theory of mind, Theory of planned behavior

===1. Introduction===
Human-aware planning is a way to improve the ability of autonomous systems to plan their actions in an environment populated and affected by humans [1]. A general problem in human-aware planning is to explain and model human behavior in order to predict future actions [2]. When an intelligent system’s task is, in addition, to promote, encourage and change human behavior, it is of particular relevance that the underlying motivations and causes of behavior are explained. Explaining human behavior is a difficult task due to its causal complexity [3], which can be analysed at multiple levels of abstraction. Computational models of human behavior range from, e.g., integrative physiology [4] at a low level to formalizations of social practices [5], i.e., typically performed behavior, at a high level.
The present work aims to model human behavior at a personal, intermediate level by looking at a human’s motivation for a behavior and the underlying beliefs that shape this motivation. The theory of planned behavior (TPB) [6] is a psychological theory for explaining an individual’s motivation to engage in a behavior in a specific context. TPB focuses on three core motivational components: (1) an individual’s attitude, e.g., expectations of the outcomes of a behavior, (2) subjective norms, e.g., an individual’s perceived social pressure from others to behave in a certain manner, and (3) perceived behavioral control, e.g., an individual’s self-efficacy in a behavior. TPB bases these three sources of motivation on the individual’s particular beliefs in a moment, which together promote or demote engagement in the behavior.

One way of reasoning about human behavior is to look at it from the perspective of states and transitions between states. Action languages, such as <math>\mathcal{A}</math> [7] and <math>\mathcal{C}_{TAID}</math> [8], are declarative ways of specifying the behavior of transition systems, representing actions that result in transitions between states, i.e., configurations of environmental variables, in a domain. Such a domain can be a human mind, comprised of mental states including perceptions of the context/situation.

ASPOCP’21: 14th Workshop on Answer Set Programming and Other Computing Paradigms, September 20–27, 2021, Virtual. Email: andreasb@cs.umu.se (A. Brännström); jcnieves@cs.umu.se (J. C. Nieves). Web: https://www.umu.se/en/staff/andreas-brannstrom/ (A. Brännström); https://www.umu.se/en/staff/juan-carlos-nieves/ (J. C. Nieves). ORCID: 0000-0001-9379-4281 (A. Brännström); 0000-0003-4072-8795 (J. C. Nieves). © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, http://ceur-ws.org), ISSN 1613-0073.

By following the concepts introduced in TPB, this work aims to formalize the causality between human beliefs and behavior in an action language. Such a formalization can be utilized by a rational agent to reason about the behavioral intentions of a human agent and, in addition, to reason about the rational agent’s own actions for changing a human agent’s beliefs in order to influence motivations and intentions. Modelling the causality between human beliefs and behavior involves identifying which beliefs and actions, out of many, cause a behavior. This is a chain of direct and indirect effects that branch out to change mental states, which makes it a difficult task that involves the ramification problem [9]. Furthermore, the priority of motivation must be taken into consideration, i.e., a human’s current mental state determines which source of motivation is most effective. The human agent’s behavior must be determined through an indirect reasoning process in which indicators of behavior are inferred through models of human reasoning. Given this action reasoning problem, and by considering components introduced in TPB, the following research questions arise:

• How can the causality between human motivation and behavior be expressed by an action reasoning language?
• How can an agent prioritise actions that cause change in the motivational beliefs of a human agent?

We approach these questions by introducing the action language <math>\mathcal{C}_{TPB}</math>. It is modelled as a transition system in which states correspond to different mental states of an individual and an environment, and actions correspond to ways of transitioning between states in order to influence human intentions. In order to characterize the semantics of the language, we establish a link between <math>\mathcal{C}_{TPB}</math> and Answer Set Programming (ASP) [10]. This is done by presenting encodings of the language into logic programs following the answer set semantics.
An ASP formalization provides, in addition to a formal characterization, a route to correct implementations through solvers such as clasp [11].

To capture human behavior, the action language <math>\mathcal{C}_{TPB}</math> has been developed in a knowledge elicitation process related to a case study [12] that deals with social behavior-change in children with autism. Together with experts in, e.g., psychology, we aim to develop principles of design to model human behavior.

The rest of this paper is organized as follows. First, we briefly present the theoretical framework: the theory of planned behavior. The syntax and semantics of the proposed action language are then presented, introducing an action reasoning approach to human-aware planning that utilizes the theory of planned behavior. The state of the art in human-aware planning and in agents modelling other agents is then presented and discussed. The paper is concluded by a discussion of the action language’s potential, limitations, possible use-cases, and directions for future work.

===2. Theoretical Framework and Background===
This section explains the cognitive theory that has influenced the modelling of the proposed action reasoning framework for human-aware planning: the Theory of Planned Behavior [6]. The results of a qualitative knowledge elicitation process are then presented; these results are captured in the semantics of the action language <math>\mathcal{C}_{TPB}</math>.

====2.1. Theory of Planned Behavior====
The Theory of Planned Behavior (TPB) [6] is a cognitive theory for explaining and predicting an individual’s intention to engage in a human activity at a specific time and place. The general idea is that the individual’s beliefs about an activity shape the individual’s attitude, subjective norm, and perceived behavioral control in the activity, which in turn promote or inhibit engagement in the activity.

Attitude (A) refers to the degree to which an individual has a positive or negative evaluation of performing the activity. This entails a consideration of the outcomes of the activity.
The overall attitude towards the activity is a consideration of each expected outcome of the activity, weighted by the individual’s valuation of that outcome.

Subjective norm (SN) refers to the individual’s belief about whether people approve or disapprove of the activity. The overall subjective norm towards the activity is a consideration of each normative belief about the activity, weighted by the individual’s motivation to comply with that norm.

Perceived behavioral control (PBC) refers to an individual’s perception of the ease or difficulty of performing the activity. The overall perceived behavioral control towards the activity is a consideration of each performance aspect of the activity, weighted by the individual’s perceived controllability of that performance aspect.

According to TPB, behavioral intention (BI) is the motivational factor that influences a given behavior, i.e., the likelihood that the human will engage in an activity. Behavioral intention is the aggregation of the overall attitude, subjective norm, and perceived behavioral control. Specific activities can be more or less affected by these three predictors. Thus, an activity-specific, empirically derived weight is added to each predictor.

The following subsection presents a qualitative data collection process centered on the components of motivation introduced in TPB. Based on these components, mental states are specified, which are captured in a particular transition system.

====2.2. Knowledge elicitation results: Motivation decision-graph====
This section presents the results of a knowledge elicitation process of a related use-case [12]. A qualitative study with 15 domain experts (psychologists, physiotherapists, occupational therapists, and special education teachers) was conducted. The study found that the most influential sources of motivation in a moment depend on an individual’s current motivational beliefs in terms of TPB.
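Since the elicitation is phrased in terms of TPB, it may help to restate the aggregation described in Section 2.1 as a formula. The following is only a sketch in the commonly used expectancy-value notation for TPB; the symbols are assumptions introduced here for illustration and do not appear in the paper: <math>b_i</math> is the strength of the belief in outcome <math>i</math> and <math>e_i</math> its valuation, <math>n_j</math> is a normative belief and <math>m_j</math> the motivation to comply with it, <math>c_k</math> is a belief about performance aspect <math>k</math> and <math>p_k</math> its perceived controllability, and <math>w_A, w_{SN}, w_{PBC}</math> are the activity-specific weights.

```latex
\begin{align}
A   &\propto \sum_i b_i \, e_i \\
SN  &\propto \sum_j n_j \, m_j \\
PBC &\propto \sum_k c_k \, p_k \\
BI  &= w_A \, A + w_{SN} \, SN + w_{PBC} \, PBC
\end{align}
```

The action language introduced later does not compute these sums; it works with a discretized value (negative, medium, positive) for each of A, SN and PBC.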
By assuming that the current mental state of an individual, i.e., the attitude (A), subjective norm (SN), and perceived behavioral control (PBC), can be sufficiently recognized and rated on the scale Negative (N) (inhibiting behavior), Medium (M) (indifferent to behavior) and Positive (P) (promoting behavior), the participating experts were asked how a person’s motivation for a behavior should be boosted depending on the person’s current mental state. For example, if the person currently has a mental state where A, SN and PBC are all negative, which aspect should be prioritised? And what about a similar case where A is medium? By following this approach, 27 states are specified, each state consisting of the variables A, SN, and PBC, where each variable has a value of negative, medium, or positive.

Following the experts’ suggestions, we can specify a state space with transition relations between states that represent a prioritised change of A, SN or PBC depending on the recognized mental state of the human. A mental state is denoted as A:SN:PBC, labeled by the value of each variable (e.g., PPP if A is Positive, SN is Positive and PBC is Positive). We can model these relations in terms of a weighted graph, called a motivation decision-graph (see Figure 1 [12], in which the red/bold path is the experts’ prioritized path from state NNN to state PPP).

Figure 1: Motivation decision-graph.

By following the experts’ suggestions of motivational focus in terms of the 27 mental states, a set of trends in the behavior of the system was found:

1. Prioritize pushing attitude away from a negative state whenever possible, i.e., to a medium or positive state.
2. Prioritize pushing subjective norm away from a negative state if attitude is at least medium and control is positive.
3. Prioritize pushing control away from a negative state if an internal or external motivator is already present, i.e., if attitude or subjective norm is at least medium.
4. If all three aspects are medium, prioritize pushing attitude to a positive state.
5. The aim is to reach the state PPP, where all aspects are positive. If this state is reached, then keep the variables steady or make small challenges in attitude, subjective norm, or control by pushing the state to any of the lower states MPP, PMP, or PPM.

These behaviors of the motivation decision-graph are formalized by the semantics of the action language <math>\mathcal{C}_{TPB}</math>. The language’s syntax and semantics are presented in the following section.

===3. Human Behavior in the Action Language <math>\mathcal{C}_{TPB}</math>===
This section presents the syntax and semantics of the proposed action language, <math>\mathcal{C}_{TPB}</math>, following the results of the knowledge elicitation process presented in Section 2.2, in the light of the theory of planned behavior. Action descriptions in answer set semantics are then presented together with an implementation of the motivation decision-graph illustrated in Figure 1, in order to characterize the semantics of the <math>\mathcal{C}_{TPB}</math> action language.

====3.1. Human-Aware Transition Systems and Action Reasoning====
The alphabet of <math>\mathcal{C}_{TPB}</math> consists of two non-empty disjoint sets of symbols: the set of fluents <math>F</math> and the set of actions <math>A</math>. A fluent expresses a property of an object in a world and forms part of the description of states of this world. A fluent literal is a fluent or a fluent preceded by <math>\neg</math>. A state <math>\sigma</math> is a collection of fluents (informal approximation, see Definition 3). We say a fluent <math>f</math> holds in a state <math>\sigma</math> if <math>f \in \sigma</math>. We say a fluent literal <math>\neg f</math> holds in <math>\sigma</math> if <math>f \notin \sigma</math>.

Definition 1 (Human-aware alphabet). Let <math>A</math> be a non-empty set of actions and <math>F</math> be a non-empty set of fluents.

• <math>F = F_E \cup F_H</math> such that <math>F_E</math> is a non-empty set of fluent literals describing observable items in an environment and <math>F_H</math> is a non-empty set of fluent literals describing the mental states of humans. <math>F_E</math> and <math>F_H</math> are disjoint.
• <math>F_H = F_A \cup F_N \cup F_C</math> such that <math>F_A</math>, <math>F_N</math> and <math>F_C</math> are non-empty pairwise disjoint sets of fluent literals describing a human agent’s attitude, subjective norm and perceived behavioral control, respectively.
• <math>A = A_E \cup A_H</math> such that <math>A_E</math> is a non-empty set of actions that can be performed by a software agent and <math>A_H</math> is a non-empty set of actions that can be performed by a human agent. <math>A_E</math> and <math>A_H</math> are disjoint.

<math>\mathcal{C}_{TPB}</math> is defined by three sub-languages: an action description language, an action observation language and an action query language.

Definition 2. A human-aware domain description language <math>D_h(A, F)</math> in <math>\mathcal{C}_{TPB}</math> consists of static and dynamic causal laws of the following form:

<math>(a \;\mathbf{causes}\; f_1, \ldots, f_n \;\mathbf{if}\; g_1, \ldots, g_n)</math> (1)
<math>(a \;\mathbf{influences\ attitude}\; f \;\mathbf{if}\; f_1, \ldots, f_n)</math> (2)
<math>(a \;\mathbf{influences\ subjective\ norm}\; f \;\mathbf{if}\; f_1, \ldots, f_n)</math> (3)
<math>(a \;\mathbf{influences\ control}\; f \;\mathbf{if}\; f_1, \ldots, f_n)</math> (4)
<math>(f_1, \ldots, f_n \;\mathbf{influences\ attitude}\; f)</math> (5)
<math>(f_1, \ldots, f_n \;\mathbf{influences\ subjective\ norm}\; f)</math> (6)
<math>(f_1, \ldots, f_n \;\mathbf{influences\ control}\; f)</math> (7)
<math>(f_1, \ldots, f_n \;\mathbf{if}\; g_1, \ldots, g_m)</math> (8)
<math>(f_1, \ldots, f_n \;\mathbf{triggers}\; a)</math> (9)
<math>(f_1, \ldots, f_n \;\mathbf{allows}\; a)</math> (10)
<math>(f_1, \ldots, f_n \;\mathbf{inhibits}\; a)</math> (11)
<math>(f_1, \ldots, f_n \;\mathbf{promotes}\; a)</math> (12)
<math>(f_1, \ldots, f_n \;\mathbf{demotes}\; a)</math> (13)
<math>(\mathbf{noconcurrency}\; a_1, \ldots, a_n)</math> (14)
<math>(\mathbf{default}\; g)</math> (15)

where <math>a \in A</math>, <math>a_i \in A</math> <math>(0 \le i \le n)</math>, <math>f_j \in F</math> <math>(0 \le j \le n)</math> and <math>g_j \in F</math> <math>(0 \le j \le n)</math>. If <math>f \in F_H</math>, then <math>f</math> is a ternary first-order predicate in which the first parameter represents a mental-state fluent <math>MS \in \{Attitude, Subjective\_norm, Control\}</math>, the second parameter represents the value <math>V \in \{Negative, Medium, Positive\}</math> of <math>MS</math>, and the third parameter represents a point in time <math>T</math>. This is denoted <math>f(MS, V, T)</math>, or <math>f(\_, V, \_)</math> when <math>MS</math> and <math>T</math> are clear from the context.

The semantics of a domain description <math>D_h(A, F)</math> is defined in terms of transition systems. An interpretation <math>I</math> over <math>F</math> is a complete and consistent set of fluents.

Definition 3.
A state <math>s \in S</math> of the domain description <math>D_h(A, F)</math> is an interpretation over <math>F</math> such that

1. for every static causal law <math>(f_1, \ldots, f_n \;\mathbf{if}\; g_1, \ldots, g_n) \in D_h(A, F)</math>, we have <math>\{f_1, \ldots, f_n\} \subseteq s</math> whenever <math>\{g_1, \ldots, g_n\} \subseteq s</math>.
2. for every static causal law <math>(f_1, \ldots, f_n \;\mathbf{influences\ attitude}\; f) \in D_h(A, F)</math>, we have <math>\{f\} \subset s</math> whenever <math>\{f_1, \ldots, f_n\} \subseteq s</math>, and <math>f \in F_A \wedge (f(\_, Positive, \_) \in s \vee f(\_, Medium, \_) \in s)</math>, and (<math>\exists f_i \in F_A \,(1 \le i \le n) \wedge f_i(\_, Negative, \_) \in s</math>, or <math>\exists f_i \in F_A \,(1 \le i \le n) \wedge \exists f_j \in F_N \,(1 \le j \le n) \wedge \exists f_k \in F_C \,(1 \le k \le n) \wedge f_i(\_, Medium, \_) \in s \wedge f_j(\_, Medium, \_) \in s \wedge f_k(\_, Medium, \_) \in s</math>).
3. for every static causal law <math>(f_1, \ldots, f_n \;\mathbf{influences\ subjective\ norm}\; f) \in D_h(A, F)</math>, we have <math>\{f\} \subset s</math> whenever <math>\{f_1, \ldots, f_n\} \subseteq s</math>, and <math>f \in F_N \wedge (f(\_, Positive, \_) \in s \vee f(\_, Medium, \_) \in s)</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n) \wedge (f_i(\_, Medium, \_) \in s \vee f_i(\_, Positive, \_) \in s)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n) \wedge f_k(\_, Positive, \_) \in s</math>.
4. for every static causal law <math>(f_1, \ldots, f_n \;\mathbf{influences\ control}\; f) \in D_h(A, F)</math>, we have <math>\{f\} \subset s</math> whenever <math>\{f_1, \ldots, f_n\} \subseteq s</math>, and <math>f \in F_C \wedge (f(\_, Positive, \_) \in s \vee f(\_, Medium, \_) \in s)</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n) \wedge (f_i(\_, Medium, \_) \in s \vee f_i(\_, Positive, \_) \in s)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n) \wedge (f_k(\_, Medium, \_) \in s \vee f_k(\_, Positive, \_) \in s)</math>.

<math>S</math> denotes all the possible states of <math>D_h(A, F)</math>.

Definition 4. Let <math>D_h(A, F)</math> be a domain description and <math>s</math> a state of <math>D_h(A, F)</math>.

1. An inhibition rule <math>(f_1, \ldots, f_n \;\mathbf{inhibits}\; a)</math> is active in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>; otherwise the inhibition rule is passive. The set <math>A_I(s)</math> is the set of actions for which there exists at least one active inhibition rule in <math>s</math>.
2. A triggering rule <math>(f_1, \ldots, f_n \;\mathbf{triggers}\; a)</math> is active in <math>s</math> if <math>s \models f_1, \ldots, f_n</math> and all inhibition rules of action <math>a</math> are passive in <math>s</math>; otherwise the triggering rule is passive in <math>s</math>. The set <math>A_T(s)</math> is the set of actions for which there exists at least one active triggering rule in <math>s</math>. The set <math>\overline{A}_T(s)</math> is the set of actions for which there exists at least one triggering rule and all triggering rules are passive in <math>s</math>.
3. An allowance rule <math>(f_1, \ldots, f_n \;\mathbf{allows}\; a)</math> is active in <math>s</math> if <math>s \models f_1, \ldots, f_n</math> and all inhibition rules of action <math>a</math> are passive in <math>s</math>; otherwise the allowance rule is passive in <math>s</math>. The set <math>A_A(s)</math> is the set of actions for which there exists at least one active allowance rule in <math>s</math>. The set <math>\overline{A}_A(s)</math> is the set of actions for which there exists at least one allowance rule and all allowance rules are passive in <math>s</math>.
4. A promoting rule <math>(f_1, \ldots, f_n \;\mathbf{promotes}\; a)</math> is active in <math>s</math> if <math>a \in A_H</math> and <math>s \models f_1, \ldots, f_n</math> and all inhibition rules and demoting rules of action <math>a</math> are passive in <math>s</math>; otherwise the promoting rule is passive in <math>s</math>. The set <math>A_P(s)</math> is the set of actions for which there exists at least one active promoting rule in <math>s</math>. The set <math>\overline{A}_P(s)</math> is the set of actions for which there exists at least one promoting rule and all promoting rules are passive in <math>s</math>.
5. A demoting rule <math>(f_1, \ldots, f_n \;\mathbf{demotes}\; a)</math> is active in <math>s</math> if <math>a \in A_H</math> and <math>s \models f_1, \ldots, f_n</math> and all inhibition rules and promoting rules of action <math>a</math> are passive in <math>s</math>; otherwise the demoting rule is passive in <math>s</math>. The set <math>A_D(s)</math> is the set of actions for which there exists at least one active demoting rule in <math>s</math>. The set <math>\overline{A}_D(s)</math> is the set of actions for which there exists at least one demoting rule and all demoting rules are passive in <math>s</math>.
6. A dynamic causal law <math>(a \;\mathbf{causes}\; f_1, \ldots, f_n \;\mathbf{if}\; g_1, \ldots, g_n)</math> is applicable in <math>s</math> if <math>s \models g_1, \ldots, g_n</math>.
7. A static causal law <math>(f_1, \ldots, f_n \;\mathbf{if}\; g_1, \ldots, g_n)</math> is applicable in <math>s</math> if <math>s \models g_1, \ldots, g_n</math>.
8. A dynamic causal law <math>(a \;\mathbf{influences\ attitude}\; f \;\mathbf{if}\; f_1, \ldots, f_n)</math> is applicable in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>, and <math>f \in F_A</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n)</math>, and <math>\exists f_j \in F_N \,(1 \le j \le n)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n)</math>.
9. A dynamic causal law <math>(a \;\mathbf{influences\ subjective\ norm}\; f \;\mathbf{if}\; f_1, \ldots, f_n)</math> is applicable in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>, and <math>f \in F_N</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n)</math>, and <math>\exists f_j \in F_N \,(1 \le j \le n)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n)</math>.
10. A dynamic causal law <math>(a \;\mathbf{influences\ control}\; f \;\mathbf{if}\; f_1, \ldots, f_n)</math> is applicable in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>, and <math>f \in F_C</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n)</math>, and <math>\exists f_j \in F_N \,(1 \le j \le n)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n)</math>.
11. A static causal law <math>(f_1, \ldots, f_n \;\mathbf{influences\ attitude}\; f)</math> is applicable in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>, and <math>f \in F_A</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n)</math>, and <math>\exists f_j \in F_N \,(1 \le j \le n)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n)</math>.
12. A static causal law <math>(f_1, \ldots, f_n \;\mathbf{influences\ subjective\ norm}\; f)</math> is applicable in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>, and <math>f \in F_N</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n)</math>, and <math>\exists f_j \in F_N \,(1 \le j \le n)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n)</math>.
13. A static causal law <math>(f_1, \ldots, f_n \;\mathbf{influences\ control}\; f)</math> is applicable in <math>s</math> if <math>s \models f_1, \ldots, f_n</math>, and <math>f \in F_C</math>, and <math>\exists f_i \in F_A \,(1 \le i \le n)</math>, and <math>\exists f_j \in F_N \,(1 \le j \le n)</math>, and <math>\exists f_k \in F_C \,(1 \le k \le n)</math>.

Definition 5 (Trajectory). Let <math>D_h(A, F)</math> be a domain description. A trajectory <math>\langle s_0, A_1, s_1, A_2, \ldots, A_n, s_n \rangle</math> of <math>D_h(A, F)</math> is a sequence of sets of actions <math>A_i \subseteq A</math> and states <math>s_i</math> of <math>D_h(A, F)</math> satisfying the following conditions for <math>0 \le i < n</math>:

1. <math>(s_i, A, s_{i+1}) \in S \times 2^{A} \setminus \{\} \times S</math>
2. <math>A_T(s_i) \subseteq A_{i+1}</math>
3. <math>A_P(s_i) \subseteq A_{i+1}</math>
4. <math>A_D(s_i) \subseteq A_{i+1}</math>
5. <math>\overline{A}_T(s_i) \cap A_{i+1} = \emptyset</math>
6. <math>\overline{A}_A(s_i) \cap A_{i+1} = \emptyset</math>
7. <math>A_I(s_i) \cap A_{i+1} = \emptyset</math>
8. <math>\overline{A}_P(s_i) \cap A_{i+1} = \emptyset</math>
9. <math>\overline{A}_D(s_i) \cap A_{i+1} = \emptyset</math>
10. <math>|A_i \cap B| \le 1</math> for all <math>(\mathbf{noconcurrency}\; B) \in D_h(A, F)</math>.

Definition 6. The action observation language of <math>\mathcal{C}_{TPB}</math> consists of expressions of the following form:

<math>(f \;\mathbf{at}\; t_i)</math>
<math>(a \;\mathbf{occurs\_at}\; t_i)</math> (16)

where <math>f \in F</math>, <math>a</math> is an action and <math>t_i</math> is a point in time.

Definition 7 (Action Theory). Let <math>D_h</math> be a domain description and <math>O</math> be a set of observations. The pair <math>(D_h, O)</math> is called an action theory.

Definition 8 (Trajectory Model). Let <math>(D_h, O)</math> be an action theory. A trajectory <math>\langle s_0, A_1, s_1, A_2, \ldots, A_n, s_n \rangle</math> of <math>D_h</math> is a trajectory model of <math>(D_h, O)</math> if it satisfies all observations of <math>O</math> in the following way:

1. if <math>(f \;\mathbf{at}\; t) \in O</math>, then <math>f \in s_t</math>
2. if <math>(a \;\mathbf{occurs\_at}\; t) \in O</math>, then <math>a \in A_{t+1}</math>.

Observe that in a trajectory <math>\langle s_0, A_1, s_1, A_2, \ldots, A_n, s_n \rangle</math> with <math>A_i \subseteq A</math> <math>(0 \le i \le n)</math>, <math>A</math> is a set of actions that can be performed either by a human agent or by a software planner agent. Actions performed by a human agent can be movements between areas in the environment, while actions by a software planner agent are adaptations of the environment that indirectly influence a human agent’s mental-state fluents, i.e., attitude, subjective norm and control.

Definition 9 (Action Query Language). The action query language of <math>\mathcal{C}_{TPB}</math> regards assertions about executing sequences of actions with expressions that constitute trajectories. A query is of the following form:

<math>(f_1, \ldots, f_n \;\mathbf{after}\; A_i \;\mathbf{occurs\_at}\; t_i, \ldots, A_m \;\mathbf{occurs\_at}\; t_m)</math>

where <math>f_1, \ldots, f_n</math> are fluent literals in <math>F_E \cup F_H</math>, <math>A_i, \ldots, A_m</math> are subsets of <math>A_E \cup A_H</math>, and <math>t_i, \ldots, t_m</math> are points in time.

====3.2. Action Descriptions in Answer Set Semantics====
This section presents translations into answer set programs of the expressions of <math>\mathcal{C}_{TPB}</math>. These expressions incorporate the results of the knowledge elicitation process and are specified to capture explanations of human behavior and behavior-change. The translations of expressions 1, 9, 10, 11, 14, and 15 follow the definitions in [8]. In the following translations, the value of a mental state variable, i.e., attitude, subjective norm and control, is represented by <math>h \in \{Positive, Medium, Negative\}</math> at a time point <math>T</math>. Actions are coupled with a planner agent (agent) or a human agent (human).

<math>(a \;\mathbf{influences\ attitude}\; h \;\mathbf{if}\; f_1, \ldots, f_n)</math> (2)
<pre>
attitude(h, T+1) :- holds(f1, T), ..., holds(fn, T),
                    fluent(f1), ..., fluent(fn),
                    occurs(a, T), action(a, agent), time(T), T < n, #int(T),
                    motivate(attitude, h, T).
</pre>
<math>(a \;\mathbf{influences\ subjective\ norm}\; h \;\mathbf{if}\; f_1, \ldots, f_n)</math> (3)
<pre>
subjective_norm(h, T+1) :- holds(f1, T), ..., holds(fn, T),
                           fluent(f1), ..., fluent(fn),
                           occurs(a, T), action(a, agent), time(T), T < n, #int(T),
                           motivate(subjective_norm, h, T).
</pre>

<math>(a \;\mathbf{influences\ control}\; h \;\mathbf{if}\; f_1, \ldots, f_n)</math> (4)
<pre>
control(h, T+1) :- holds(f1, T), ..., holds(fn, T),
                   fluent(f1), ..., fluent(fn),
                   occurs(a, T), action(a, agent), time(T), T < n, #int(T),
                   motivate(control, h, T).
</pre>

<math>(f_1, \ldots, f_n \;\mathbf{influences\ attitude}\; h)</math> (5)
<pre>
attitude(h, T+1) :- holds(f1, T), ..., holds(fn, T),
                    fluent(f1), ..., fluent(fn),
                    time(T), T < n, #int(T),
                    motivate(attitude, h, T).
</pre>

<math>(f_1, \ldots, f_n \;\mathbf{influences\ subjective\ norm}\; h)</math> (6)
<pre>
subjective_norm(h, T+1) :- holds(f1, T), ..., holds(fn, T),
                           fluent(f1), ..., fluent(fn),
                           time(T), T < n, #int(T),
                           motivate(subjective_norm, h, T).
</pre>

<math>(f_1, \ldots, f_n \;\mathbf{influences\ control}\; h)</math> (7)
<pre>
control(h, T+1) :- holds(f1, T), ..., holds(fn, T),
                   fluent(f1), ..., fluent(fn),
                   time(T), T < n, #int(T),
                   motivate(control, h, T).
</pre>

<math>(f_1, \ldots, f_n \;\mathbf{promotes}\; a)</math> (12)
<pre>
holds(occurs(a), T+1) :- not holds(ab(occurs(a)), T+1),
                         holds(f1, T), ..., holds(fn, T),
                         fluent(f1), ..., fluent(fn),
                         action(a, human), time(T), T < n, #int(T).
</pre>

<math>(f_1, \ldots, f_n \;\mathbf{demotes}\; a)</math> (13)
<pre>
holds(ab(occurs(a)), T+1) :- holds(f1, T), ..., holds(fn, T),
                             fluent(f1), ..., fluent(fn),
                             action(a, human), time(T), T < n, #int(T).
</pre>

The above expressions can be instantiated for a specific domain to capture beliefs about an environment and their influence on human behavior. Accompanying each expression, a domain-independent set of rules is applied to filter out solution candidates in a generated trajectory. This is expressed by the predicate motivate, which adds a set of integrity constraints based on the motivation decision-graph (presented in Subsection 2.2), defined below.

Definition 10.
A motivation decision-graph <math>MDG</math> is a transition system given by a tuple of the form <math>MDG = (M, Act, T, O, MP, L)</math>, where <math>M</math> is a non-empty set of states, denoting mental states of a human agent, <math>Act</math> is a set of actions, <math>T \subseteq M \times M</math> is a non-empty set of transition relations denoting legal transitions between mental states, <math>O</math> is a set of initial states, <math>MP</math> is a set of atomic propositions, and <math>L</math> is a function that defines which propositions in <math>MP</math> are valid in each state in <math>M</math>.

In the following definition, we introduce a logic program that characterizes the motivation decision-graph of Figure 1. This logic program will help to define the semantics of <math>\mathcal{C}_{TPB}</math>.

Definition 11. Let <math>P_{MDG}</math> be the following logic program:

<pre>
value(1..3). % 1: Negative, 2: Medium, 3: Positive.
mind(attitude). mind(norm). mind(control).

init_on(attitude,1). init_on(norm,1). init_on(control,1).
goal_on(attitude,3). goal_on(norm,3). goal_on(control,3).
plan_length(6).
#show motivate/3.

% Generate: exactly one motivate action per time step.
{ motivate(MS,V,T) : mind(MS), value(V) } = 1 :- plan_length(M), T = 1..M.

% State progression: a motivated variable takes its new value, others persist.
motivate(MS,T) :- motivate(MS,_,T).
on(MS,V,0) :- init_on(MS,V).
on(MS,V,T) :- motivate(MS,V,T).
on(MS,V,T+1) :- on(MS,V,T), not motivate(MS,T+1), not plan_length(T).

% Values below the current value of a variable are blocked at the next step.
blocked(MS,V-1,T+1) :- on(MS,V,T), not plan_length(T).
blocked(MS,V-1,T) :- blocked(MS,V,T), value(V).

% C-TPB: Integrity constraints
:- motivate(MS,V,T), blocked(MS,V,T).
:- motivate(MS,T), on(MS,V,T-1), blocked(MS,V,T).
:- goal_on(MS,V), not on(MS,V,M), plan_length(M).
:- { on(MS,V,T) } != 1, mind(MS), plan_length(M), T = 1..M.

% Restrict Attitude
:- motivate(attitude, 1, T+1), on(attitude, 1, T).
:- motivate(attitude, 2, T+1), on(attitude, 2, T).
:- motivate(attitude, 3, T+1), on(attitude, 3, T).
:- motivate(attitude, 3, T+1), on(attitude, 1, T).
:- motivate(attitude, 1, T+1), on(attitude, 3, T).
:- motivate(attitude, 2, T+1), on(attitude, 3, T).

% Restrict Norm
:- motivate(norm, 1, T+1), on(norm, 1, T).
:- motivate(norm, 2, T+1), on(norm, 2, T).
:- motivate(norm, 3, T+1), on(norm, 3, T).
:- motivate(norm, 3, T+1), on(norm, 1, T).
:- motivate(norm, 1, T+1), on(norm, 3, T).
:- motivate(norm, 2, T+1), on(norm, 3, T).

% Restrict Control
:- motivate(control, 1, T+1), on(control, 1, T).
:- motivate(control, 2, T+1), on(control, 2, T).
:- motivate(control, 3, T+1), on(control, 3, T).
:- motivate(control, 3, T+1), on(control, 1, T).
:- motivate(control, 1, T+1), on(control, 3, T).
:- motivate(control, 2, T+1), on(control, 3, T).

% Push attitude whenever possible to a medium or positive state.
:- motivate(norm, 2, T+1), on(attitude, 1, T).
:- motivate(norm, 3, T+1), on(attitude, 1, T).
:- motivate(norm, 2, T+1), on(attitude, 2, T).
:- motivate(control, 2, T+1), on(attitude, 1, T).
:- motivate(control, 3, T+1), on(attitude, 1, T).
:- motivate(control, 2, T+1), on(attitude, 2, T).

% Push subjective norm if attitude is at least medium and control is positive.
% This constraint is mostly covered above, extended with:
:- motivate(control, 3, T+1), on(norm, 2, T).

% Push control if attitude or subjective norm is at least medium.
% This constraint is mostly covered above, extended with:
:- motivate(norm, 3, T+1), on(control, 1, T).
</pre>

When run in Clingo 4.5.4, the above logic program generates trajectories that follow the motivation decision-graph in Figure 1. For instance, one obtains the following trajectory:

<pre>
motivate(attitude,2,1), motivate(attitude,3,2), motivate(control,2,3),
motivate(norm,2,4), motivate(norm,3,5), motivate(control,3,6)
</pre>

The first parameter in motivate/3 corresponds to a mental state fluent, e.g., Attitude. The second parameter in motivate/3 corresponds to the value of a mental state fluent, where value(1) corresponds to Negative, value(2) to Medium, and value(3) to Positive. The third parameter in motivate/3 corresponds to a point in time.
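The constraints of the program can also be read procedurally, as a test on which single push may follow a given mental state. The following Python sketch is an illustration only: it is not the authors’ implementation and not a substitute for running the program in Clingo. The names `allowed` and `plans` are hypothetical, and the sketch assumes that the "Restrict" blocks together with the `blocked` rules amount to "a variable may only move one step upwards", which is one reading of the listing.

```python
from itertools import product

# Same encoding as value(1..3) in the logic program.
NEG, MED, POS = 1, 2, 3
MINDS = ("attitude", "norm", "control")


def allowed(state, ms, v):
    """Return True if motivate(ms, v) may follow `state` (dict: variable -> value)."""
    # "Restrict" constraints plus blocked/3: only a single step upwards.
    if v != state[ms] + 1:
        return False
    a, n, c = state["attitude"], state["norm"], state["control"]
    if ms in ("norm", "control"):
        # Attitude first: no push while attitude is negative, and no push
        # to medium while attitude is only medium.
        if a == NEG or (v == MED and a == MED):
            return False
    # No push of control to positive while subjective norm is medium.
    if ms == "control" and v == POS and n == MED:
        return False
    # No push of subjective norm to positive while control is negative.
    if ms == "norm" and v == POS and c == NEG:
        return False
    return True


def plans(start, goal, length):
    """Enumerate all push sequences of exactly `length` steps from start to goal."""
    found = []

    def extend(state, steps):
        if len(steps) == length:
            if state == goal:
                found.append(tuple(steps))
            return
        for ms, v in product(MINDS, (NEG, MED, POS)):
            if allowed(state, ms, v):
                extend({**state, ms: v}, steps + [(ms, v)])

    extend(start, [])
    return found


start = dict.fromkeys(MINDS, NEG)   # state NNN, as in init_on/2
goal = dict.fromkeys(MINDS, POS)    # state PPP, as in goal_on/2
result = plans(start, goal, 6)      # plan_length(6)

for plan in result:
    print(plan)
```

Under this reading of the constraints, the trajectory listed above is among the plans the sketch finds, and every plan begins by raising attitude twice, which mirrors the first trend in Section 2.2.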
The motivation decision-graph, prototyped in the above logic program (shared and openly accessible online at https://git.io/Ju9vh), works as a domain-independent heuristic for any domain-specific logic program following the <math>\mathcal{C}_{TPB}</math> action language. In order to define the semantics of <math>\mathcal{C}_{TPB}</math>, we characterize trajectory models in terms of answer sets. This is formalized by the following theorem:

Theorem 1. Let <math>(D_h, O_{initial})</math> be an action theory such that <math>O_{initial}</math> are the fluent observations of the dynamic environment in the current state, and the fluents of the currently estimated mental state of the human (Attitude, Subjective norm, and Control). Let <math>Q</math> be a query according to Definition 9 and let <math>A_Q = \{(a \;\mathbf{occurs\_at}\; t_i) \mid a \in A_i, 1 \le i \le m\}</math>. Let <math>T</math> denote the translation of <math>\mathcal{C}_{TPB}</math> into a logic program according to the mapping introduced in Section 3.2. Then, the following statements hold true:

1. If there is a trajectory model <math>\langle s_0, A_1, s_1, A_2, \ldots, A_m, s_m \rangle</math>, where <math>A_i \subseteq A</math> <math>(0 \le i \le m)</math>, of <math>\mathcal{C}_{TPB}(D_h, O_{initial} \cup A_Q)</math>, then there is an answer set <math>\mathcal{A}</math> of the logic program <math>T(\mathcal{C}_{TPB}(D_h, O_{initial} \cup A_Q)) \cup P_{MDG}</math> such that for all fluents <math>f \in F_E \cup F_H</math> at the time points <math>0 \le k \le m</math>
(a) <math>holds(f, k) \in \mathcal{A}</math>, if <math>s_k \models f</math>,
(b) <math>holds(neg(f), k) \in \mathcal{A}</math>, if <math>s_k \models \neg f</math>,
(c) <math>holds(occurs(a), k) \in \mathcal{A}</math>, if <math>a \in A_{k+1}</math>,
(d) <math>holds(neg(occurs(a)), k) \in \mathcal{A}</math>, if <math>a \notin A_{k+1}</math>.
2. If there is an answer set <math>\mathcal{A}</math> of a program <math>T(\mathcal{C}_{TPB}(D_h, O_{initial} \cup A_Q)) \cup P_{MDG}</math> and at time point <math>0 \le k \le m</math>
(a) <math>s_k = \{f \mid holds(f, k) \in \mathcal{A}\} \cup \{\neg f \mid holds(neg(f), k) \in \mathcal{A}\}</math>,
(b) <math>A_{k+1} = \{a \mid holds(occurs(a), k) \in \mathcal{A}\}</math>,
then there is a trajectory model <math>\langle s_0, A_1, s_1, A_2, \ldots, A_m, s_m \rangle</math> of <math>\mathcal{C}_{TPB}(D_h, O_{initial} \cup A_Q)</math>.

Proof 1. (Sketch) Let us start by observing that <math>\mathcal{C}_{TAID}</math> [8] is a sub-language of <math>\mathcal{C}_{TPB}</math>. Hence, given a <math>\mathcal{C}_{TPB}</math> program <math>P_{TPB}</math>, there is a <math>\mathcal{C}_{TAID}</math> program <math>P_{TAID}</math> such that <math>P_{TAID} \subseteq P_{TPB}</math>.
Let <math>T_{TAID}</math> be the translation of <math>P_{TAID}</math> into a logic program according to [8], and let <math>T_{TPB}</math> denote the translation of <math>P_{TPB}</math> into a logic program according to the mapping introduced in Section 3.2. One can see that <math>T_{TAID}</math> is a subset of <math>T_{TPB}</math>. One can observe that, since <math>P_{MDG}</math> only infers paths in a graph, if <math>\mathcal{A} \cup \mathcal{A}'</math> is an answer set of <math>T_{TPB}</math> such that <math>\mathcal{A} \subseteq \mathcal{L}_{T_{TAID}}</math>, then <math>\mathcal{A}</math> is an answer set of <math>P_{TAID}</math>. Since axioms 2-7 are similar to axiom 8 in Definition 2 and axioms 12-13 are similar to axiom 9 in Definition 2, the proof follows by Theorem 1 in [8].

====3.3. Case study: Promote social behavior in children with autism====
This subsection exemplifies a use-case of <math>\mathcal{C}_{TPB}</math>. Children with autism commonly experience difficulties in social situations. Lights, sounds, people, and lack of guidance can result in stress and avoidance behavior. Let us consider an interactive virtual reality (VR) aid with which children with autism can practice social situations [13]. A rational agent is embedded in the application and is able to adapt the virtual environment in order to encourage the child to explore the scenario. The agent models the human using <math>\mathcal{C}_{TPB}</math>. A domain description <math>D_h(A, F)</math> in <math>\mathcal{C}_{TPB}</math> can include:

<math>A_E</math> = {increase_sound(L), decrease_sound(L), increase_light(L), decrease_light(L), increase_guides(L), decrease_guides(L), increase_people(L), decrease_people(L)}.
<math>A_H</math> = {move(Position)}.
<math>F_E</math> = {sound(L), light(L), people(L), guides(L)}.
<math>F_A</math> = {attitude(V)}.
<math>F_N</math> = {norm(V)}.
<math>F_C</math> = {control(V)}.
L ∈ {Low, Medium, High}.
V ∈ {Negative, Medium, Positive}.

A set of causal laws is declared, such as:

increase_guides(High) influences_control(Medium) if control(Low).

Based on an initial environment state (e.g., people(High), light(High), sound(High), guides(Low)) and an initial mental state (e.g., attitude(Medium), norm(Negative), control(Negative)), adaptation plans are generated for promoting human actions.
For instance, the following plan can be generated: increase_guides(High); decrease_light(Medium); decrease_sound(Medium).

4. Discussion and Related Work

Our previous work addressed context reasoning (through Activity Theory [14, 15]); the present work deals with the interaction process (through human-aware planning). The issues introduced here, of modelling the mental states of a human in an action language, have not so far been explored by the formal action reasoning community. The proposed action reasoning language C_TPB provides a human-aware alphabet for describing motivational aspects of an environment and the mental states of humans, as well as actions by which variables of the mental state or the environment can be changed, directly or indirectly. There is a diverse body of research related to the ideas presented in the current work [16, 17, 18]. For instance, plan recognition as planning, originally introduced by Ramírez and Geffner [19], uses planning algorithms to enable an agent to recognize the goals and plans of other agents. A related line of research introduces empathetic planning [16]. In that work, empathy is defined as the ability to understand and share the thoughts and feelings of another. Following this definition, an assistive empathetic agent is formalized that is able to reason about the preferences of an empathizee [16]. The current work can advance the state of the art of empathic agents by using a formalization of a psychological theory, the theory of planned behavior (TPB), in an attempt to model human beliefs and motivation. Usually, when we talk about theory of mind, i.e., agents modelling other agents [20], we say that the agent has beliefs about the human's beliefs, but what does the human believe? In the C_TPB action language, the agent can reason about particular beliefs about the human's beliefs, i.e., the human's attitude, subjective norm and perceived behavioral control regarding a specific behavior.
In this way, the agent's theory of the mind of the human is particular and concise, making it possible to deliberate about the causes of an individual's behavior and how to promote human actions.

5. Conclusion and Future Work

We have introduced the action language C_TPB, based on a psychological theory, the theory of planned behavior, which explains a human's intention to engage in a human activity, and presented how the language can be applied to represent and reason about actions for altering mental states to promote behavior. By utilizing the action language, an agent can reason about specific human beliefs, i.e., the human's attitude, subjective norm and perceived behavioral control in a situation. In this way, the agent acquires a particular theory of the mind of the human with which to deliberate about a human agent's intentions. On a low level, the language captures a human agent's beliefs about the environment and how these beliefs correspond to mental states. On a high level, the language captures a suitable direction of motivation based on a human's current mental state. In this way, a priority of motivation can be utilized for picking the most suitable actions to alter the environment in order to change the human's motivational beliefs and influence human behavior. Future work concerns incorporating expressions dealing with probability distributions in human behavior into the language. Furthermore, we aim to explore ways to incorporate customized decision-graphs into the semantics of C_TPB, thus going beyond hardwired transition rules. For instance, we aim to develop an emotion decision-graph which extends the semantics of C_TPB with models to reason about human emotions and emotional change.

References

[1] T. Chakraborti, Foundations of Human-Aware Planning-A Tale of Three Models, Ph.D. thesis, Arizona State University, 2018.
[2] M. Cirillo, L. Karlsson, A.
Saffiotti, Human-aware task planning: An application to mobile robots, ACM Transactions on Intelligent Systems and Technology (TIST) 1 (2010) 1–26.
[3] F. I. Dretske, Explaining behavior: Reasons in a world of causes, MIT Press, 1991.
[4] D. Fass, R. Lieber, Rationale for human modelling in human in the loop systems design, in: 2009 3rd Annual IEEE Systems Conference, IEEE, 2009, pp. 27–30.
[5] F. Dignum, Interactions as social practices: towards a formalization, arXiv preprint arXiv:1809.08751 (2018).
[6] I. Ajzen, et al., The theory of planned behavior, Organizational Behavior and Human Decision Processes 50 (1991) 179–211.
[7] M. Gelfond, V. Lifschitz, Action languages (1998).
[8] S. Dworschak, S. Grell, V. Nikiforova, T. Schaub, J. Selbig, Modeling Biological Networks by Action Languages via Answer Set Programming, Constraints 13 (2008) 21–65.
[9] L. Giordano, A. Martelli, C. Schwind, Ramification and causality in a modal action logic, Journal of Logic and Computation 10 (2000) 625–662.
[10] V. Lifschitz, Answer set programming, Springer International Publishing, 2019.
[11] M. Gebser, R. Kaminski, B. Kaufmann, M. Ostrowski, T. Schaub, S. Thiele, A user's guide to gringo, clasp, clingo, and iclingo (2008).
[12] A. Brännström, J. C. Nieves, T. Kampik, E. Domellöf, L. Gu, M. Liljeström, Human-aware planning in virtual reality for facilitating social behavior in autism (2021). Manuscript submitted for publication.
[13] A. Brännström, T. Kampik, J. C. Nieves, Towards human-aware epistemic planning for promoting behavior-change, in: Workshop on Epistemic Planning (EpiP)@ ICAPS, Online, October 26-30, 2020, 2020.
[14] E. Guerrero, J. C. Nieves, H. Lindgren, An activity-centric argumentation framework for assistive technology aimed at improving health, Argument & Computation 7 (2016) 5–33.
[15] J. Oetsch, J.-C. Nieves, A knowledge representation perspective on activity theory, arXiv preprint arXiv:1811.05815 (2018).
[16] M. Shvo, S. A. McIlraith, Towards empathetic planning, arXiv preprint arXiv:1906.06436 (2019).
[17] J. Blount, M. Gelfond, Reasoning about the intentions of agents, in: Logic Programs, Norms and Action, Springer, 2012, pp. 147–171.
[18] A. Gabaldon, Activity recognition with intended actions, in: Twenty-First International Joint Conference on Artificial Intelligence, 2009.
[19] M. Ramírez, H. Geffner, Plan recognition as planning, in: Proceedings of the 21st International Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers Inc., 2009, pp. 1778–1783.
[20] S. V. Albrecht, P. Stone, Autonomous agents modelling other agents: A comprehensive survey and open problems, Artificial Intelligence 258 (2018) 66–95.