LTL๐‘“ Goal-oriented Service Composition Giuseppe De Giacomo1,2 , Marco Favorito3 and Luciana Silo2,4 1 University of Oxford, UK 2 Sapienza University of Rome, Italy 3 Banca dโ€™Italia, Italy 4 Camera dei Deputati, Italy Abstract Service compositions ร  la Roman model consist of realizing a virtual service by orchestrating suitably a set of already available services, where all services are described procedurally as (possibly nondeterministic) transition systems. In this paper, we study a goal-oriented variant of the service composition ร  la Roman Model, where the goal specifies the allowed traces declaratively via Linear Temporal Logic on finite traces (ltl๐‘“ ). Specifically, we want to synthesize a controller to orchestrate the available services to produce together a trace satisfying a specification in ltl๐‘“ . To do so, we combine techniques from reactive synthesis, FOND Planning, and the Roman Model for service composition. This framework has several interesting applications, including Smart Manufacturing and Digital Twins. Keywords Service Composition, Linear Temporal Logic on finite traces, LTL๐‘“ Synthesis, FOND Planning 1. Introduction Service composition, a well-established topic in the field of Web services, refers to the ability to combine Web services into a business process. More properly, it involves managing and sequencing interactions between Web services, orchestrating them into a larger transaction. This approach enhances the flexibility and adaptability of business processes by enabling them to be constructed from reusable services, allowing organizations to quickly adjust their processes in response to changing business requirements or technological advancements. For example, a transaction involving the addition of a customer to a bank account service could concurrently initiate the creation of multiple accounts while updating the customer information within the customer service. All of these requests are managed in the context of a larger business process flow that either succeeds or fails as a whole [1]. The problem of service composition has been considered in the literature for over two decades. Particularly interesting in this context is the so-called Roman Model [2, 3, 4] where services are conversational, i.e., have an internal state and are procedurally described as finite transition systems (TS), where at each state the service offers a certain set of actions, and each action changes the state of the service in some way. The designer is interested in generating a new service, called target, which is described as the other service; however, it is virtual in the sense that no code is associated with its actions. Therefore, to execute the target, one has to delegate each of its actions to some of the available services by suitably orchestrating them, taking into consideration the state of the target and the available services are in. Service composition amounts to synthesizing a controller that can suitably orchestrate the executions of the available services so as to guarantee that the target actions are always delegated to some service that can actually execute them in its current state. The original paper on the Roman Model [2] has been the inspiration for a line of work in AI on behaviour composition where nondeterminism, in the sense of partial controllability as in Fully-Observable Non-Deterministic (FOND) strong planning [5, 6]) has played a prominent role [7]. Recently a renewed interest in service composition ร  la Roman PMAI@ECAI24: International ECAI Workshop on Process Management in the AI era, October 19, 2024, Santiago De Compostela, Spain $ degiacomo@diag.uniroma1.it (G. D. Giacomo); marco.favorito@bancaditalia.it (M. Favorito); silo@diag.uniroma1.it (L. Silo)  0000-0001-9680-7658 (G. D. Giacomo); 0000-0001-9566-3576 (M. Favorito); 0000-0001-7250-8979 (L. Silo) ยฉ 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings Model is stemming out of applications in smart manufacturing, where, through digital twins technology, manufacturing devices can export their behaviour as transition systems and hence being orchestrated very much in the same way as service did back in the early 2000โ€™s, see e.g., [8, 9, 10]. Interestingly, these new applications are also promoting to move from a procedural specification of the target to a declarative one, as advocated by the declarative business processes literature, through the so-called declare paradigm [11, 12, 13]. In other words, the target would ideally be specified in declare, and so, in Linear Temporal Logic on finite and process traces (ltl๐‘“ ) [14], with the assumption that specifications are about the possible sequence of actions (vs sequences of fluent values of the domain as in planning for ltl๐‘“ goals [15, 16]), and the simplification that only one action can be selected at each point in time [11, 17]. In fact, ltl๐‘“ as a specification of the target can be utilized to write two different kinds of target specification, namely process-oriented target specification or goal-oriented target specification. In the first case, very much like in the declare paradigm, one uses the ltl๐‘“ specification to specify the process itself consisting of all the traces satisfying the ltl๐‘“ formula, which in turn corresponds to implicitly specifying the transition system consisting of the deterministic finite automaton (DFA) equivalent to the ltl๐‘“ formula [14]. In this case, after a preprocessing of the ltl๐‘“ specification to obtain the target transition system (i.e., the DFA corresponding to the formula) as in [14], the composition can be performed as with the techniques used for the standard Roman Model [7]. In this paper, we study the case where ltl๐‘“ is used as a goal-oriented specification. This is a novel variant of the composition problem, where we are given a ltl๐‘“ goal, and we want to synthesize an orchestrator that, on the one hand, reactively chooses actions to form a sequence that satisfies the goal and, on the other hand, delegates each action to an available service in such a way that at any point the delegated action can be executed by the delegated service and at the end of the sequence satisfying the ltl๐‘“ formula, all services are in their final states. Specifically, we consider the available services as nondeterministic, i.e., partially controllable (similarly to FOND strong planning) as in [3, 7]. Note that this problem is different from other goal-oriented service composition frameworks. In [18] Hierarchical Task Network (HTN) planning is used for service composition. HTN planning is based on the notion of composite tasks that can be refined to atomic tasks using predefined methods. While based on high-level specification of services, their approach does not support nondeterministic services. The work [19] allows the modeling of nondeterministic behaviors but not of stateful services nor high-level temporal goal specification. Authors in [20] describe services as atomic actions where only I/O behaviour is modelled, and the ontology is constituted by propositions and actions; hence services are not stateful as ours. De Giacomo et al. [21] study a stochastic version in a goal-oriented setting in which the optimization of the cost utilization is subordinated to the maximal probability of satisfaction of the goal by means of lexicographic optimization. Here instead we assume nondeterministic services. To provide a solution technique in this case, from the ltl๐‘“ specification, we compute in linear time a symbolic representation of the corresponding Nondeterministic Finite Automaton (NFA), we do so by essentially adopting the Alternating Automaton (AFA) associated to the formula as the symbolic representation of the NFA. Then we adapt the interleaving procedure introduced in [22] to encode the symbolic NFA corresponding to ltl๐‘“ temporally extended goals into special planning actions domains. However, while in [22] considers these symbolic NFAs in deterministic planning domains, we use the technique in a nondeterministic (adversarial) planning domain that encodes all the possible service actions at each point of the computation. Notably, this is possible in our case because the target specification is an ltl๐‘“ specification of the desired sequence of target actions and not of fluent evaluation as in standard planning. Note that, the possibility of exploiting symbolic NFAs contrasts with the need of adopting DFAs required for the process-oriented target specification, which are exponentially larger than corresponding NFAs in the worst case. We implemented our approach and evaluated different heuristics and Torres and Baierโ€™s encodings to show the feasibility of our solution technique. Although this paper has a foundational nature, our paper gives the foundations and solution tech- niques of goal-oriented compositions, which are indeed envisioned in the current literature on smart man- ufacturing where the notion of goal-oriented target specification is increasingly championed [23, 8, 9]. The appendix with the proofs and experimental results can be found at this link: https://bit.ly/3x5lXre. 2. Preliminaries Automata theory. A deterministic finite automaton (dfa) is a tuple ๐’œ = โŸจ๐’ซ, ๐‘„, ๐‘ž0 , ๐น, ๐›ฟโŸฉ where: (i) ๐’ซ is the alphabet, (ii) ๐‘„ is a finite set of states, (iii) ๐‘ž0 is the initial state, (iv) ๐น โІ ๐‘„ is the set of accepting states and (v) ๐›ฟ : ๐‘„ ร— ๐’ซ โ†’ ๐‘„ is a total transition function. A nondeterministic finite automaton (nfa) is defined similarly to dfa except that ๐›ฟ is defined as a relation, i.e. ๐›ฟ โІ ๐‘„ ร— ๐’ซ ร— ๐‘„. An alternating finite automaton (afa) [24, 25] is a generalization of dfa and nfa, where ๐›ฟ is defined as ๐›ฟ : ๐‘„ ร— ฮฃ โ†’ ๐ต + (๐‘„), where ๐ต + (๐‘„) is a set of positive boolean formulas whose atoms are states of ๐‘„. By โ„’(๐’œ), we mean the set of all traces over ฮฃ accepted by an automaton ๐’œ. An afa ๐’œ๐ด = โŸจ๐’ซ, ๐‘„๐ด , ๐‘ž0 , ๐น๐ด , ๐›ฟ๐ด โŸฉ can be โ‹€๏ธ€ transformed into an equivalent nfa ๐’œ๐‘ = โŸจ๐’ซ, 2๐‘„๐ด , {๐‘ž0 }, 2๐น๐ด , ๐›ฟ๐‘ โŸฉ, where ๐›ฟ๐‘ = {(๐‘žยฏ, ๐‘Ž, ยฏ๐‘ž โ€ฒ ) | ยฏ๐‘ž โ€ฒ |= ๐‘žโˆˆ๐‘žยฏ ๐›ฟ๐ด (๐‘ž, ๐‘Ž)}. Note that ๐’œ๐‘ can be constructed on-the-fly, see [22]. LTL๐‘“ is a variant of Linear Temporal Logic (ltl) interpreted over finite traces [14]. Given a set ๐’ซ of atomic propositions, ltl๐‘“ formulas ๐œ™ are defined by ๐œ™ ::= ๐‘Ž | ยฌ๐œ™ | ๐œ™ โˆง ๐œ™ | โˆ˜๐œ™ | ๐œ™ ๐’ฐ ๐œ™, where ๐‘Ž denotes an atomic proposition in ๐’ซ, โˆ˜ is the next operator, and ๐’ฐ is the until operator. We use abbreviations for other Boolean connectives, as well as the following: eventually as โ—‡๐œ™ โ‰ก ๐‘ก๐‘Ÿ๐‘ข๐‘’ ๐’ฐ ๐œ™; always as โ–ก๐œ™ โ‰ก ยฌโ—‡ยฌ๐œ™; weak next as โˆ™๐œ™ โ‰ก ยฌโˆ˜ยฌ๐œ™ (note that, on finite traces, ยฌโˆ˜๐œ™ is not equivalent to โˆ˜ยฌ๐œ™); and weak until as ๐œ™1 ๐’ฒ ๐œ™2 โ‰ก (๐œ™1 ๐’ฐ ๐œ™2 โˆจ2๐œ™1 ), i.e. ๐œ™1 holds until ๐œ™2 or forever. ltl๐‘“ formulas are interpreted on finite (possibly empty) traces ๐‘Ž = ๐‘Ž0 . . . ๐‘Ž๐‘›โˆ’1 where ๐‘Ž๐‘– at instant ๐‘– is a propositional interpretation over the alphabet 2๐’ซ , and ๐‘› is the length of the trace. An ltl๐‘“ formula can be transformed into an equivalent afa in linear time in the size of the formula, in a nfa in at most EXPTIME and into an equivalent and in a dfa in at most 2EXPTIME [14]. ltl๐‘“ is used in declarative process specification in BPM, through the so called DECLARE paradigm [11]. In this case โ‹๏ธ€ it is assumed that โ‹€๏ธ€ only one proposition (corresponding to an action) is true at every time point: ๐œ‰๐’ซ = 2( ๐‘Žโˆˆ๐’ซ ๐‘Ž) โˆง 2( ๐‘Ž,๐‘โˆˆ๐’ซ,๐‘Žฬธ=๐‘ ๐‘Ž โ†’ ยฌ๐‘). We call this the declare assumption, and we do adopt it in this paper. FOND Planning. A Fully-Observable Non-Deterministic (FOND) domain model can be formalized as a tuple ๐’Ÿ = โŸจโ„ฑ, ๐ด, pre, eff โŸฉ where โ„ฑ is a set of positive literals, ๐ด is a set of action labels, pre and eff are two functions that define the preconditions and effects of each action ๐‘Ž โˆˆ ๐ด. A planning state ๐‘  is a subset of โ„ฑ, and a positive literal ๐‘“ holds true in ๐‘  if ๐‘“ โˆˆ ๐‘ ; otherwise, ๐‘“ is false in ๐‘ . Both functions pre and eff take an action label ๐‘Ž โˆˆ ๐ด as an input and return a propositional formula over โ„ฑ and a set {eff 1 , . . . , eff ๐‘› } of effects, respectively. Each effect eff ๐‘– โˆˆ eff (๐‘Ž) is a set of conditional effects each of the form ๐‘ โ— ๐‘’, where ๐‘ is a propositional formula over โ„ฑ and ๐‘’ โІ โ„ฑ โˆช {ยฌ๐‘“ | ๐‘“ โˆˆ โ„ฑ} is a set of literals from โ„ฑ. Sometimes we write ๐‘’ as a shorthand for the unconditional effect โˆ… โ— ๐‘’. An action ๐‘Ž can be applied in a state ๐‘  if pre(๐‘Ž) holds true in ๐‘  (i.e., ๐‘  |= pre(๐‘Ž)). A conditional effect ๐‘ โ— ๐‘’ is triggered in a state ๐‘  if ๐‘ is true in ๐‘ . Applying ๐‘Ž in ๐‘  yields a successor state ๐‘ โ€ฒ determined by an outcome nondeterministically drawn from eff (๐‘Ž). Let eff ๐‘– โˆˆ eff (๐‘Ž) be the chosen nondeterministic effect, the new state ๐‘ โ€ฒ is such that โˆ€๐‘“ โˆˆ โ„ฑ, ๐‘“ holds true in ๐‘ โ€ฒ if and only if either (i) ๐‘“ was true in ๐‘  and no conditional effect ๐‘ โ— ๐‘’ โˆˆ eff ๐‘– triggered in ๐‘  deletes it (ยฌ๐‘“ โˆˆ ๐‘’) or (ii) there is a conditional effect ๐‘ โ— ๐‘’ โˆˆ eff ๐‘– triggered in ๐‘  that adds it (๐‘“ โˆˆ ๐‘’). In case of conflicting effects, similarly to other works [26], we assume delete-before-adding semantics. We use ๐›ฟ(๐‘ , ๐‘Ž) to denote the set of possible successor states {๐‘ โ€ฒ1 , . . . , ๐‘ โ€ฒ๐‘› } obtained by executing ๐‘Ž in ๐‘ . Note that if ๐‘  ฬธ|= pre(๐‘Ž) then ๐›ฟ(๐‘ , ๐‘Ž) = โˆ…. A FOND planning problem is a tuple ฮ“ = โŸจ๐’Ÿ, ๐‘ 0 , ๐บโŸฉ, where ๐’Ÿ is a domain model, ๐‘ 0 โІ โ„ฑ is the initial state, and ๐บ is a formula over โ„ฑ, also called the reachability goal. We now define what it means to solve a planning problem on ๐’Ÿ. A FOND planning problem is a tuple ฮ“ = โŸจ๐’Ÿ, ๐‘ 0 , ๐บโŸฉ, where ๐’Ÿ is a domain model, ๐‘ 0 โІ โ„ฑ is the initial state, and ๐บ is a formula over โ„ฑ specifying the goal states. A trace of ฮ“ is a finite or infinite sequence ๐‘ 0 , ๐‘Ž0 , ๐‘ 1 , ๐‘Ž1 , . . . where ๐‘ 0 is the initial state, and ๐‘ ๐‘– |= pre(๐‘Ž๐‘– ) and ๐‘ ๐‘–+1 = ๐›ฟ(๐‘ ๐‘– , ๐‘Ž๐‘– ) for each ๐‘ ๐‘– , ๐‘Ž๐‘– in the trace. A strategy (or plan) is a partial function ๐œ‹ : (2โ„ฑ )+ โ†’ ๐ด such that for every ๐‘ข โˆˆ (2โ„ฑ )+ , if ๐œ‹(๐‘ข) is defined then last(๐‘ข) |= pre(๐œ‹(๐‘ข)), i.e., it selects applicable actions. If ๐œ‹(๐‘ข) is undefined, we write ๐œ‹(๐‘ข) = โŠฅ. A trace ๐œ is generated by ๐œ‹, or simply an ๐œ‹-trace, if (i) if ๐‘ 0 , ๐‘Ž0 , . . . , ๐‘ ๐‘– , ๐‘Ž๐‘– is a prefix of ๐œ then ๐œ‹(๐‘ 0 ๐‘ 1 . . . ๐‘ ๐‘– ) = ๐‘Ž๐‘– , and (ii) if ๐œ is finite, say ๐œ = ๐‘ 0 , ๐‘Ž0 , . . . , ๐‘Ž๐‘›โˆ’1 , ๐‘ ๐‘› , then ๐œ‹(๐‘ 0 ๐‘ 1 . . . ๐‘ ๐‘› ) = โŠฅ. A strategy ๐œ‹ is a (strong) solution to ฮ“ if every ๐œ‹-trace is a finite trace ๐œ such that ๐‘ ๐‘› |= ๐บ. 3. Goal-oriented ltl๐‘“ Service Composition Our composition framework follows the Roman model in the case the available services are non- deterministic. Unlike the Roman model, we have a high-level specification of a goal to accomplish expressed as an ltl๐‘“ formula. We want to accomplish such a goal despite the available services having nondeterministic behaviour. We detail our framework below. Following the Roman Model [2, 3, 7], each (available) service is defined as a tuple ๐’ฎ = โŸจฮฃ, ๐ด, ๐œŽ0 , ๐น, ๐›ฟโŸฉ where: (i) ฮฃ is the finite set of service states, (ii) ๐ด is the finite set of servicesโ€™ actions, (iii) ๐œŽ0 โˆˆ ฮฃ is the initial state, (iv) ๐น โІ ฮฃ is the set of final states (i.e., states in which the computation may stop but does not necessarily have to), and (v) ๐›ฟ โІ ฮฃ ร— ๐ด ร— ฮฃ is the service transition relation. For convenience, we define ๐›ฟ(๐œŽ, ๐‘Ž) = {๐œŽ โ€ฒ | (๐œŽ, ๐‘Ž, ๐œŽ โ€ฒ ) โˆˆ ๐›ฟ}, and we assume that for each state ๐œŽ โˆˆ ฮฃ and each action ๐‘Ž โˆˆ ๐ด, there exist ๐œŽ โ€ฒ โˆˆ ฮฃ such that (๐œŽ, ๐‘Ž, ๐œŽ โ€ฒ ) โˆˆ ๐›ฟ (possibly ๐œŽ โ€ฒ is an error state ๐œŽ๐‘ข that will never reach a final state). Actions in ๐ด denote interactions between service and clients. The behavior of each available service is described in terms of a finite transition system that uses only actions from ๐ด. Our target specification consists of a goal specification ๐œ™ expressed in ltl๐‘“ over the set of propositions ๐ด. Given a community of ๐‘› services ๐’ž = {๐’ฎ1 , . . . , ๐’ฎ๐‘› }, where each set of actions ๐ด๐‘– โІ ๐ด, a trace of ๐’ž is a finite or infinite alternating sequence of the form ๐‘ก = (๐œŽ10 . . . ๐œŽ๐‘›0 ), (๐‘Ž0 , ๐‘œ0 ), (๐œŽ11 . . . ๐œŽ๐‘›1 ), (๐‘Ž1 , ๐‘œ1 ) . . . , where ๐œŽ๐‘–0 is the initial state of every service ๐’ฎ๐‘– and, for every 0 โ‰ค ๐‘˜, we have (i) ๐œŽ๐‘–๐‘˜ โˆˆ ฮฃ๐‘– for all ๐‘– โˆˆ {1, . . . , ๐‘›}, (ii) ๐‘œ๐‘˜ โˆˆ {1, . . . , ๐‘›}, (iii) ๐‘Ž๐‘˜ โˆˆ ๐ด, and (iv) for all ๐‘–, ๐œŽ๐‘–,๐‘˜+1 = ๐›ฟ๐‘– (๐œŽ๐‘–๐‘˜ , ๐‘Ž๐‘˜ ) if ๐‘œ๐‘˜ = ๐‘–, otherwise ๐œŽ๐‘–,๐‘˜+1 = ๐œŽ๐‘–๐‘˜ . Given a trace ๐‘ก, we call states(๐‘ก) the sequence of states of ๐‘ก, i.e. states(๐‘ก) = (๐œŽ10 . . . ๐œŽ๐‘›0 ), (๐œŽ11 . . . ๐œŽ๐‘›1 ), ยท ยท ยท . The choices of a trace ๐‘ก, denoted with choices(๐‘ก), is the sequence of actions in ๐‘ก, i.e. choices(๐‘ก) = (๐‘Ž0 , ๐‘œ0 ), (๐‘Ž1 , ๐‘œ1 ), . . . . Note that, due to nondeterminism, there might be many traces of ๐’ž associated with the same sequence of choices. Moreover, we define the action run of a trace ๐‘ก, denoted with actions(๐‘ก), the projection of choices(๐‘ก) only to the components in ๐ด. Note that both choices(๐‘ก) and actions(๐‘ก) are empty if ๐‘ก = (๐œŽ10 . . . ๐œŽ๐‘›0 ). A finite trace ๐‘ก is successful, denoted with successful(๐‘ก), if (1) actions(๐‘ก) |= ๐œ™, and (2) all service states ๐œŽ๐‘– โˆˆ last(states(๐‘ก)) are such that ๐œŽ๐‘– โˆˆ ๐น๐‘– . An orchestrator is a partial function ๐›พ : (ฮฃ1 ร— ยท ยท ยท ร— ฮฃ๐‘› )+ โ†’ (๐ด ร— {1 . . . ๐‘›}) โˆช {โŠฅ} that, if defined given a sequence of states ๐œŽ ยฏ = (๐œŽ10 . . . ๐œŽ๐‘›0 ) . . . (๐œŽ1๐‘š . . . ๐œŽ๐‘›๐‘š ), returns the action to perform ๐‘Ž โˆˆ ๐ด, and the service (actually the service index) that will perform it; otherwise we write ๐›พ(๐œŽ ยฏ ) = โŠฅ. Next, we define when an orchestrator is a composition that satisfies ๐œ™. A trace ๐‘ก is an execution of an orchestrator ๐›พ with ๐’ž if for all ๐‘˜ โ‰ฅ 0, we have (๐‘Ž๐‘˜ , ๐‘œ๐‘˜ ) = ๐›พ((๐œŽ10 . . . ๐œŽ๐‘›0 ) . . . (๐œŽ1๐‘˜ . . . ๐œŽ๐‘›๐‘˜ )) and, if ๐‘ก is finite, say of length ๐‘š, then ๐›พ((๐œŽ10 . . . ๐œŽ๐‘›0 ) . . . (๐œŽ1,๐‘šโˆ’1 . . . ๐œŽ๐‘›,๐‘šโˆ’1 )) = โŠฅ. Note that due to the nondeterminism of the services, we can have many executions for the same orchestrator, despite the orchestrator being a deterministic function. We say that some finite execution ๐‘ก of ๐›พ is successful, if successful(๐‘ก) and ๐›พ(states(๐‘ก)) = โŠฅ. Finally, we say that an orchestrator ๐›พ realizes the ltl๐‘“ goal specification ๐œ™ with ๐’ž if all its executions are finite traces ๐‘ก that are successful. Note that the orchestrator, at every step, chooses the action and the service to which the action is delegated. In doing so, it guarantees that the sequence of actions satisfies the ltl๐‘“ goal specification and that at each step the action is delegated to a service that can actually carry out the action, despite the nondeterminism of the services. Moreover, when the orchestrator stops, all services are left in their final states. The composition problem is: Problem 1 (Composition for ltl๐‘“ Goal Specifications). Given the pair (๐’ž, ๐œ™), where ๐œ™ is an ltl๐‘“ goal specification over the set of propositions ๐ด, and ๐’ž is a community of ๐‘› services ๐’ž = {๐’ฎ1 , . . . , ๐’ฎ๐‘› }, compute, if it exists, an orchestrator ๐›พ that realizes ๐œ™. Example 1. We present an example of using our framework, inspired by the โ€œgarden bots systemโ€ example [27]. The goal is to clean the garden by picking fallen leaves and removing dirt, water the plants, and pluck the ripe fruits and flowers. The action clean must be performed at least once, followed by water and pluck in any order. In declare ltl๐‘“ , the goal can be expressed as ๐œ™ = clean โˆง โˆ˜(clean ๐’ฐ ((water โˆง โˆ˜pluck ) โˆจ (pluck โˆง โˆ˜water ))). We assume there are three available garden bots, each with different capabilities and rewards. In Figure 1 the three services specifications and the automaton ๐’œ๐œ™ of the ltl๐‘“ goal ๐œ™ are shown. Such an automaton ๐’œ๐œ™ is a dfa in this simple case, instead of a (proper) nfa. We are interested in a composition of the โ„ฌ1 โ„ฌ2 water โ„ฌ3 ๐’œ๐œ™ ๐‘Ž0 clean ๐‘0 ๐‘0 ๐‘”0 ๐‘”2 pl bots to satisfy the goal ๐œ™. Bot 1 will be used uc k clean r to perform clean. Although both bot 2 and 3 y te pluck empty plu pt clean ๐‘”4 empty ck wa em โŠค plu ck can be used for pluck , a strong solution cannot clean ๐‘”1 ๐‘”3 โŠค er choose bot 2 because the action pluck can lead ๐‘Ž1 ๐‘1 ๐‘2 ๐‘1 at pluck w water to the failure state ๐‘2 ; therefore, pluck will be requested to bot 3. Bot 2 will be used for water . Figure 1: From left to right: the three available bots The order in which water and pluck are exe- of the garden bot systems, and the automa- cuted is irrelevant since both alternatives lead ton ๐’œ๐œ™ of the ltl๐‘“ goal. to the accepting state. Both bot 1 and bot 2 might need to be emptied to return to the initial accepting states ๐‘Ž0 and ๐‘0 , respectively, and the solution must handle this. 4. Solution Technique To synthesize the orchestrator, we rely on a game-theoretic technique: i.e.: (i) we build a game arena where the controller (roughly speaking the orchestrator) and the environment (the service community) play as adversaries; (ii) we synthesize a strategy for the controller to win the game whatever the environment does; (iii) from this strategy we will build the actual orchestrator. Specifically, we proceed as follows: (1) first, from the ltl๐‘“ goal specification we compute the equivalent nfa; (2) in this nfa, we can give the control of the transition to the controller, constructing from the nfa a dfa ๐’œact over an extended alphabet; (3) compute a product of such dfa ๐’œact with the services, obtaining a new dfa ๐’œ๐œ™,๐’ž ; (4) the latter dfa can be seen as an arena over which we play is the so-called dfa game [28]; (5) if a solution of such dfa game is found, from that solution, we can derive an orchestrator that realizes ๐œ™. We now detail each step. Step 1. The nfa ๐’œ๐œ™ = (๐ด, ๐‘„, ๐‘ž0 , ๐น, ๐›ฟ) of an ltl๐‘“ formula, which can be exponentially larger than the size of the formula, can be computed by exploiting a well-known correspondence between ltl๐‘“ formulas and finite-word automata [14]. Note that we can build ๐’œ๐œ™ in such a way that its alphabet is ๐ด and not 2๐ด since, by the DECLARE assumption, only one action is executed at each time instant. Step 2. From the nfa of the formula ๐œ™, ๐’œ๐œ™ , which is on the alphabet ๐ด, we define a controllable dfa ๐’œact = (๐ด ร— ๐‘„, ๐‘„, ๐‘ž0 , ๐น, ๐›ฟact ) on the alphabet ๐ด ร— ๐‘„, with the same states, initial state and final state of ๐’œ๐œ™ but with ๐›ฟact defined as follows: ๐›ฟact (๐‘ž, (๐‘Ž, ๐‘ž โ€ฒ )) = ๐‘ž โ€ฒ iff (๐‘ž, ๐‘Ž, ๐‘ž โ€ฒ ) โˆˆ ๐›ฟ. Note that the โ€œangelicโ€ nondeterminism of ๐’œ๐œ™ is cancelled by moving the choice of the next nfa state and the next system service state in the alphabet ๐’œ ร— ๐‘„ of the dfa ๐’œact . Intuitively, with the dfa ๐’œact , we are giving to the controller not only the choice of actions but also the choice of transitions of the original nfa ๐’œ๐œ™ , so that those chosen transitions lead to the satisfaction of the formula. In other words, for every sequence of actions ๐‘Ž0 , . . . , ๐‘Ž๐‘šโˆ’1 accepted by the nfa ๐’œ๐œ™ , i.e. satisfying the formula ๐œ™, there exists a corresponding alternating sequence ๐‘ž0 , ๐‘Ž0 , . . . , ๐‘ž๐‘š accepted by the dfa ๐’œact , and viceversa. Step 3. Then, given ๐’œact and ๐’ž, we build the composition dfa ๐’œ๐œ™,๐’ž (๏ธ€ โ‹ƒ๏ธ€= (๐ด )๏ธ€ , ๐‘„ , ๐‘ž0 , ๐น , ๐›ฟ ) asโ€ฒ fol- โ€ฒ โ€ฒ โ€ฒ โ€ฒ โ€ฒ lows: ๐ด = {(๐‘Ž, ๐‘ž, ๐‘–, ๐œŽ๐‘— ) | (๐‘Ž, ๐‘ž, ๐‘–, ๐œŽ๐‘— ) โˆˆ ๐ด ร— ๐‘„ ร— {1, . . . , ๐‘›} ร— โ€ฒ ๐‘– ฮฃ๐‘– and ๐œŽ๐‘— โˆˆ ฮฃ๐‘– }; ๐‘„ = ๐‘„ ร— ฮฃ1 ร— ยท ยท ยท ฮฃ๐‘› ; ๐‘ž0 = (๐‘ž0 , ๐œŽ10 . . . ๐œŽ๐‘›0 ); ๐น = ๐น ร— ๐น1 ร— ยท ยท ยท ร— ๐น๐‘› ; ๐›ฟ ((๐‘ž, ๐œŽ1 . . . ๐œŽ๐‘– . . . ๐œŽ๐‘› ), (๐‘Ž, ๐‘ž โ€ฒ , ๐‘–, โ€ฒ โ€ฒ โ€ฒ ๐œŽ๐‘–โ€ฒ )) = (๐‘ž โ€ฒ , ๐œŽ1 . . . ๐œŽ๐‘–โ€ฒ . . . ๐œŽ๐‘› ) iff ๐›ฟ๐‘– (๐œŽ๐‘– , ๐‘Ž) = ๐œŽ๐‘–โ€ฒ , and ๐›ฟact (๐‘ž, (๐‘Ž, ๐‘ž โ€ฒ )) = ๐‘ž โ€ฒ . Intuitively, the dfa ๐’œ๐œ™,๐’ž is a synchronous cartesian product between the nfa ๐’œ๐œ™ and the service ๐’ฎ๐‘– chosen by the cur- rent symbol (๐‘Ž, ๐‘ž, ๐‘–, ๐œŽ) โˆˆ ๐ดโ€ฒ . It can be shown that there is a relationship between the accept- ing runs of the dfa ๐’œ๐œ™,๐’ž and the set of successful executions of some orchestrator ๐›พ with com- munity ๐’ž for the specification ๐œ™. Given a word (๐‘Ž0 , ๐‘ž1 , ๐‘œ0 , ๐œŽ๐‘œ0 ) . . . (๐‘Ž๐‘šโˆ’1 , ๐‘ž๐‘š , ๐‘œ๐‘šโˆ’1 , ๐œŽ๐‘œ๐‘šโˆ’1 ) โˆˆ ๐ดโ€ฒ* , which induces the run ๐‘Ÿ = (๐‘ž0 , ๐œŽ10 . . . ๐œŽ๐‘›0 ), . . . , (๐‘ž๐‘š , ๐œŽ1๐‘š . . . ๐œŽ๐‘›๐‘š ) over ๐’œ๐œ™,๐’ž , we define the history โ„Ž = ๐œ๐œ™,๐’ž (๐‘ค) = (๐œŽ10 . . . ๐œŽ๐‘›0 ), (๐‘Ž0 , ๐‘œ0 ), . . . , (๐œŽ1๐‘š . . . ๐œŽ๐‘›๐‘š ). We consider the dfa ๐’œ๐œ™,๐’ž as a dfa game. Step 4. dfa games are games between two players, here called respectively the environment and the controller, that are specified by a dfa. We have a set ๐’ณ of uncontrollable symbols, which are under the control of the environment, and a set ๐’ด of controllable symbols, which are under the control of the controller. A round of the game consists of both the controller and the environment choosing the symbols they control. A (complete) play is a word in ๐’ณ ร— ๐’ด describing how the controller and environment set their propositions at each round till the game stops. A play is winning for the controller if such a play leads from the initial to a final state. A strategy for the controller is a function ๐‘“ : ๐’ณ * โ†’ ๐’ด that, given a history of choices from the environment, decides which symbols ๐’ด to pick next. In this context, a history has the form ๐‘ค = (๐‘‹0 , ๐‘Œ0 ) ยท ยท ยท (๐‘‹๐‘šโˆ’1 , ๐‘Œ๐‘šโˆ’1 ). Let us denote by ๐‘ค๐’ณ |๐‘˜ the sequence projected only on ๐’ณ and truncated at the ๐‘˜-th element (included), i.e., ๐‘ค๐’ณ |๐‘˜ = ๐‘‹0 ยท ยท ยท ๐‘‹๐‘˜ . A winning strategy is a strategy ๐‘“ : ๐’ณ * โ†’ ๐’ด such that for all sequences ๐‘ค = (๐‘‹ 0 , ๐‘Œ 0 ) ยท ยท ยท (๐‘‹๐‘šโˆ’1 , ๐‘Œ๐‘šโˆ’1 ) with ๐‘Œ๐‘– = ๐‘“ (๐‘ค๐’ณ |๐‘˜ ), we have that ๐‘ค leads to a final state of our dfa game. The realizability problem consists of checking whether there exists a winning strategy. The synthesis problem amounts to actually computing such a strategy. A DFA game can be solved by backward least-fixpoint computation, by computing the winning region Win(๐’ข), that is, the states where the controller has a winning strategy to reach a final state. First, we start with Win 0 (๐’ข) = ๐น , and Win ๐‘– (๐’ข) is the set of states for which the controller can force the game to move in a state of Win ๐‘–โˆ’1 (๐’ข). Let Win(๐’ข) โІ ๐‘„ be the smallest set that satisfies the winning condition. It can be shown that a DFA game ๐’ข admits a winning strategy iff ๐‘ 0 โˆˆ Win(๐’ข) [28]. The resulting strategy is a transducer ๐‘‡ = (๐’ณ ร— ๐’ด, ๐‘„โ€ฒ , ๐‘ž0โ€ฒ , ๐›ฟ๐‘‡ , ๐œƒ๐‘‡ ), defined as follows: ๐’ณ ร— ๐’ด is the input alphabet, ๐‘„โ€ฒ is the set of states, ๐‘ž0โ€ฒ is the initial state, ๐›ฟ๐‘‡ : ๐‘„โ€ฒ ร— ๐’ณ โ†’ ๐‘„โ€ฒ is the transition function such that ๐›ฟ๐‘‡ (๐‘ž, ๐‘‹) = ๐›ฟ โ€ฒ (๐‘ž, (๐‘‹, ๐œƒ๐‘‡ (๐‘ž)), and ๐œƒ๐‘‡ : ๐‘„ โ†’ ๐’ด is the output function defined as ๐œƒ๐‘‡ (๐‘ž) = ๐‘Œ such that if ๐‘ž โˆˆ Win ๐‘–+1 (๐’ข) โˆ– Win ๐‘– (๐’ข) then โˆ€๐‘‹.๐›ฟ โ€ฒ (๐‘ž, (๐‘‹, ๐‘Œ )) โˆˆ Win ๐‘– (๐’ข). Step 5. Given a strategy in the form of a transducer ๐‘‡ , we can obtain an orchestrator that real- izes the specification. Let the extended transition function ๐›ฟ๐‘‡* of ๐‘‡ is ๐›ฟ๐‘‡* (๐‘ž, ๐œ–) = ๐‘ž and ๐›ฟ๐‘‡* (๐‘ž, ๐‘ค๐‘Ž) = ๐›ฟ๐‘‡ (๐›ฟ๐‘‡* (๐‘ž, ๐‘ค), ๐‘Ž). Then, for every sequence ๐‘ค of length ๐‘š โ‰ฅ 0 ๐‘ค = (๐‘‹0 , ๐‘Œ0 ) . . . (๐‘‹๐‘š , ๐‘Œ๐‘š ), where for each index ๐‘˜, ๐‘Œ๐‘˜ and ๐‘‹๐‘˜ are of the form (๐‘Ž๐‘˜ , ๐‘ž๐‘˜+1 , ๐‘œ๐‘˜ ) and ๐œŽ๐‘œ๐‘˜ ,๐‘˜ respectively, we define the orchestrator ๐›พ๐‘‡ ((๐œŽ10 . . . ๐œŽ๐‘›0 ), (๐œŽ11 . . . ๐œŽ๐‘œ1 ,1 . . . ๐œŽ๐‘›1 ), . . . (๐œŽ1๐‘š . . . ๐œŽ๐‘œ๐‘˜ ,๐‘š . . . ๐œŽ๐‘›๐‘š )) = (๐‘Ž๐‘š , ๐‘œ๐‘š ), where (๐‘Ž๐‘š , ๐‘ž๐‘š+1 , ๐‘œ๐‘š ) = ๐œƒ๐‘‡ (๐›ฟ๐‘‡* (๐‘ž0 , ๐‘ค)), and whenever the trace is successful, the next choice is โŠฅ. Hence, we can reduce the problem of service composition โ‹ƒ๏ธ€ for ltl๐‘“ goal specifications to solving the dfa game over ๐’œ๐œ™,๐’ž with uncontrollable symbols ๐’ณ = ๐‘– ฮฃ๐‘– and controllable symbols ๐’ด = ๐ด ร— ๐‘„ ร— {1, . . . , ๐‘›}. Proposition 2. ๐‘Ž0 . . . ๐‘Ž๐‘šโˆ’1 โˆˆ โ„’(๐’œ๐œ™ ) iff (๐‘Ž0 , ๐‘ž1 ) . . . (๐‘Ž๐‘šโˆ’1 , ๐‘ž๐‘š ) โˆˆ โ„’(๐’œact ), for some ๐‘ž1 . . . ๐‘ž๐‘š . Proposition 3. Let โ„Ž be a history with ๐’ž and ๐œ™ be a specification. Then, โ„Ž is successful iff there exist a word ๐‘ค โˆˆ ๐ดโ€ฒ* such that โ„Ž = ๐œ๐œ™,๐’ž (๐‘ค) and ๐‘ค โˆˆ โ„’(๐’œ๐œ™,๐’ž ). Proposition 3 shows that there is a tight relationship between accepting runs of ๐’œ๐œ™,๐’ž and successful histories with ๐’ž for specification ๐œ™. Theorem 4. Realizability for โŸจ๐œ™, ๐’žโŸฉ can be solved by checking whether ๐‘ž0โ€ฒ โˆˆ Win(๐’œ๐œ™,๐’ž ). Proof sketch. Soundness can be proved by induction on the maximum number of steps ๐‘– for which the controller wins the dfa game from ๐‘ž0โ€ฒ , building ๐›พ in a backward fashion such that it chooses (๐‘Ž๐‘˜ , ๐‘œ๐‘˜ ) โˆˆ ๐ดโ€ฒ that allows forcing the win in the dfa game (which exists by assumption). Completeness can be shown by contradiction, assuming that there exists an orchestrator ๐›พ that realizes ๐œ™ with community ๐’ž, but that ๐‘ž0โ€ฒ ฬธโˆˆ Win(๐’œ๐œ™,๐’ž ), implying that we can build an arbitrarily long unsuccessful history, by definition of winning region, contradicting that ๐›พ realizes ๐œ™. Theorem 5. Let the composition problem be โŸจ๐œ™, ๐’žโŸฉ, and let the transducer ๐‘‡ be a winning strategy over the game arena ๐’œ๐œ™,๐’ž . Let ๐›พ๐‘‡ be the orchestrator extracted from ๐‘‡ , as defined above. Then, ๐›พ๐‘‡ realizes ๐œ™ with community ๐’ž. Considering the cost of each step above, we get the following upper bound for the worst-case computa- tional cost (we conjecture this cost is the exact complexity characterization): Theorem 6. Problem 1 can be solved in at most exponential time in the size of the formula, in at most exponential time in the number of services, and in polynomial time in the size of the services. 5. Linear Encoding in FOND Planning In this section, we refine the solution from the previous section, taking advantage of the possibility of using FOND adversarial planning to handle the NFA corresponding to the goal symbolically in linear time. Specifically, we show how we can compute a PDDL specification that is linear in the size of the formula ๐œ™, by exploiting a technique for planning for ltl๐‘“ goals in [22]. Their construction is based on representing the nfa of the goal by a symbolic representation directly stemming from the corresponding afa, and then exploring (possibly partially) the nfa on-the-fly while planning. In contrast to [22], we do this in a nondeterministic (adversarial) setting, which normally would require using dfas instead of nfas, see [15]. This is possible in our case because the ltl๐‘“ goal specifies the desired sequence of target actions, whose actual choice is under the full control of the controller that we are synthesizing. Encoding in Adversarial FOND Planning. The game-theoretic solution technique presented in the previous section gives us the theoretical foundations for reasoning about the problem and is useful for theoretical analysis regarding the correctness and worst-case computational cost. However, the size of the game arena can be exponential in the size of the formula and exponential in the number of services, meaning that computing the entire arena beforehand may be infeasible. However, this is not required since we can build the arena on-the-fly while searching for a solution. We can do this by leveraging on FOND adversarial planning (aka strong planning in FOND), where the agent controls the actions and the environment controls the fluents [6]. In particular, we focus on planning for temporally extended goals [29, 30], work on declarative and procedural constraints [31, 32, 33], temporal logics with finite-trace semantics, such as ltl๐‘“ [32, 33, 14, 22], and in FOND planning [28, 15, 16]. More recently, [34] and [35] proposed, for classical and FOND planning, respectively, a polynomial-time compilation in PDDL for goals in Pure-Past Linear Temporal Logic (ppltl) [36]. In our setting, the orchestrator controls the actions to satisfy the ltl๐‘“ goal, while the evolution of the selected services is the uncontrollable part of the process. A forward search-based approach able to handle environment nondeterminism is, therefore, particularly interesting for our case since it could possibly avoid the exponential construction of the entire arena since the procedure stops as soon as a winning strategy is found. While the services are already given in input, the nfa ๐’œ๐œ™ must be computed from the input goal specification ๐œ™. To avoid constructing the entire nfa in advance, we will rely on an on-the-fly construction based on Torres and Baierโ€™s work [22]; more details later in this section. Another crucial feature of (FOND) planning is that the domain specification is assumed to be factorized, i.e., a state in the arena is a propositional assignment of a set of fluents โ„ฑ. This feature fits well in our setting since a state of the game arena (i.e. of the composition dfa ๐’œ๐œ™,๐’ž ) is made of several state components: one state component for each service, each keeping track of the current state for each service, and one component from the nfa state ๐’œ๐œ™ , which keeps track of the partial satisfaction of the goal formula ๐œ™. Regarding the latter, we consider the nfa states ๐‘„๐‘ as assignments of the states of the afa ๐‘„๐ด viewed as atomic propositions (i.e. ๐‘„๐‘ โІ 2๐‘„๐ด ) [14, 22]. As for the services ๐’ฎ๐‘– , they also could be represented in compact (i.e. logarithmic) form with fluents โ„ฑ๐‘– , i.e. ฮฃ๐‘– = 2โ„ฑ๐‘– . Indeed, one could always build a binary encoding of the state space ฮฃ๐‘– using log |ฮฃ๐‘– | bits. Adapting Torres and Baierโ€™s Construction Most of the techniques in FOND planning for temporal goals specify a PDDL encoding that takes a domain and a problem in PDDL with a temporal goal as input and generates new PDDL domain and problem files with a simple reachability goal. Such files can then be given in input to a (non-temporal) FOND planning solver, and the correctness of the encoding guarantees that, from a solution for the new version of the problem, we can compute a solution for the original planning problem. In our case, we adopt the encoding proposed in [22] for solving temporally-extended planning in deterministic domains. The interesting feature of their technique is that it makes it possible to encode in PDDL the construction of the nfa on-the-fly, by exploiting the relationship between afa ๐ด๐ด of the ltl๐‘“ formula ๐œ™ and its equivalent nfa ๐ด๐‘ . Not only is this a worst-case optimal construction in the size of the formula, but it could possibly avoid the entire construction of the nfa. The other alternatives are either superseded, not easily applicable, or worst-case non-optimal: [33]โ€™s translation is worst-case exponential but builds the entire nfa explicitly and only works for deterministic domains; and the encoding proposed in [16] requires the construction of the entire dfa, which is not optimal since its size is doubly exponential in the size of the formula. the encoding in [35] is designed for ppltl goals, and despite ppltl and ltl๐‘“ are equally expressive, translating one into the other (and vice versa) is generally prohibitive since the best-known algorithms require 3EXPTIME [36]. Although Torres and Baierโ€™s encoding was originally used for deterministic planning domains, we observe that their construction works even in the case of nondeterministic domains, with the restriction that the propositions ๐’ซ in the ltl๐‘“ goal specification are such that all are controllable by the agents and, therefore, not controllable by the environment. In fact, this is the case in our framework: we can rely on the nfa-based construction in the definition of the game arena since the nondeterminism from the nfa and the nondeterminism from the services do not directly interfere with each other. Note that if some of the propositions in the goal formula were not under the control of the agent (as is typically the case in temporally extended goals, which talk about sequences of fluent evaluations in the domain), we would end up mixing the angelic nondeterminism of the nfa with the devilish nondeterminism of the environment, and to avoid it we need determinize the nfa getting a dfa where the angelic nondeterminism has been removed. As a consequence, the complexity of planning in this case becomes 2EXPTIME-complete [15]. There are two minor issues to cope with when using Torres and Baierโ€™s encoding in our setting: (i) first, the formulas can only be evaluated over state fluents and not over actions; (ii) second, the compilation does not handle nondeterminism. In the next part, we describe how to solve these issues. First, we define the service community domain ๐’Ÿ๐’ž = โŸจโ„ฑ โ€ฒ , ๐ดโ€ฒ , pre โ€ฒ , eff โ€ฒ โŸฉ and problem ฮ“๐’ž = โŸจ๐’Ÿ๐ถ , ๐‘ โ€ฒ0 , ๐บโ€ฒ โŸฉ, where: (i) โ„ฑ โ€ฒ = {start} โˆช ๐ด โˆช ฮฃ1 โˆช ยท ยท ยท ฮฃ๐‘› ; (ii) ๐ดโ€ฒ = ๐ด ร— {1 . . . ๐‘›}; (iii) pre โ€ฒ (โŸจ๐‘Ž, ๐‘–โŸฉ) = true (since ๐›ฟ๐‘– are total functions); (iv) eff โ€ฒ (โŸจ๐‘Ž, ๐‘–โŸฉ) = {eff โ€ฒ๐‘Ž,๐‘– (๐œŽ๐‘–1 , ๐œŽ๐‘–2 ) | ๐œŽ๐‘–2 โˆˆ ๐›ฟ๐‘– (๐œŽ๐‘–1 , ๐‘Ž)} where eff โ€ฒ๐‘Ž,๐‘– (๐œŽ๐‘–1 , ๐œŽ๐‘–2 ) = {{๐œŽ๐‘–1 } โ— {ยฌ๐œŽ๐‘–1 , ๐œŽ๐‘–2 , ๐‘Ž} โˆช ๐ด ยฏ (๐‘Ž)}; (v) ๐‘ โ€ฒ = {start} โˆช {๐œŽ๐‘–0 | ๐œŽ๐‘–0 is the initial state of ๐’ฎ๐‘– }; (vi) ๐บโ€ฒ = 0 {(๐œŽ1 , . . . , ๐œŽ๐‘› ) | ๐œŽ๐‘– โˆˆ ๐น๐‘– for all ๐‘– = 1, . . . ๐‘›}. Intuitively, ๐’Ÿ๐’ž simulates executions of ๐’ž. The state induced by the fluents โ„ฑ โ€ฒ is represented as tuples of the form (๐œŽ1 , . . . , ๐œŽ๐‘› , ๐‘Ž), where ๐œŽ๐‘– โˆˆ ฮฃ๐‘– and ๐‘Ž โˆˆ ๐ด. The action space ๐ดโ€ฒ is a set of pairs of the form (๐‘Ž, ๐‘–), where ๐‘Ž is the chosen action and ๐‘– is the index of the chosen service that should execute it. The preconditions are not needed (though conditional effects are) since we assumed the servicesโ€™ transition functions are complete. The effect of action (๐‘Ž, ๐‘–) are all the possible state pairs for which there is a transition via ๐‘Ž, but only the effects from the same current state ๐œŽ๐‘–1 can be triggered (see the condition), and all the successor states ๐œŽ๐‘–2 โˆˆ ๐›ฟ๐‘– (๐œŽ๐‘–1 , ๐‘Ž) are considered (see the set comprehension for terms eff โ€ฒ๐‘Ž,๐‘– (๐œŽ๐‘–1 , ๐œŽ๐‘–2 )). Crucially, the action (๐‘Ž, ๐‘–) also has the effect of adding the fluent ๐‘Ž in the next state and removing all other action fluents; such negative effects are denoted with ๐ด ยฏ (๐‘Ž) = {ยฌ๐‘Žโ€ฒ | ๐‘Žโ€ฒ โˆˆ ๐ด, ๐‘Žโ€ฒ ฬธ= ๐‘Ž}, hence forcing the declare assumption at the semantic level. The initial state contains the auxiliary fluent start, which will be used to shift by one step the evaluation of the formula; intuitively, this is because the state-based evaluation starts from the initial state, where no action is taken yet. The auxiliary fluent start and the presence of the last action taken in the state allow us to solve the first issue. Finally, the set of goal states ๐บโ€ฒ corresponds to the set of servicesโ€™ configurations where all the states are final. The next step is to apply the translation rules specified in [22]. Given a problem instance โŸจ๐œ™, ๐’žโŸฉ, and given the domain ๐’Ÿ๐’ž and problem ฮ“๐’ž as defined above, we get a new domain ๐’Ÿ๐œ™,๐’ž and problem ฮ“๐œ™,๐’ž by applying the translation rules with the formula ๐œ™โ€ฒ = start โˆง โˆ˜๐œ™. Intuitively, the evaluation of ๐œ™โ€ฒ will read the initial state and, from there on, will evaluate the chosen actions, which, by construction, are added to the subsequent states. To support nondeterminism, and so to fix the second challenge, the only change we apply is that, in โ€œWorld Modeโ€ (cfr. [22]), we consider all possible nondeterministic effects coming from ๐’Ÿ๐’ž , i.e. eff (๐‘Žโ€ฒ ) = {๐ธ โˆช {copy, ยฌworld} | ๐ธ โˆˆ eff (๐‘Žโ€ฒ )}. Theorem 7. Let โŸจ๐œ™, ๐’žโŸฉ be an instance of ltl๐‘“ -goal oriented service composition. ๐œ™ is realizable with ๐’ž iff there exists a strong solution for the FOND adversarial problem ฮ“๐œ™,๐’ž . Proof sketch. By construction of ฮ“๐œ™,๐’ž , and by correctness of Torres and Baierโ€™s construction, there is a one-to-one correspondence between traces of ฮ“๐œ™,๐’ž and the game arena ๐’œ๐œ™,๐’ž (modulo deletion of the first evaluation of start). In other words, the game arena induced by ๐’Ÿ๐œ™,๐’ž and ฮ“๐œ™,๐’ž is essentially the same of the one induced by ๐’œ๐œ™,๐’ž . Therefore, there exist a strong solution for ฮ“๐œ™,๐’ž iff there exist a strong solution for ๐’œ๐œ™,๐’ž . The claim follows by Theorem 4. Theorem 8. ltl๐‘“ goal-oriented service composition can be solved in at most exponential time in the size of the formula, in at most exponential time in the number of services, and in polynomial time in the size of services (exponential if the service representation is in logarithmic form). We observe that in the construction of ๐’Ÿ๐œ™,๐’ž we introduce action fluents, hence increasing the size of the state space. This could be avoided by modifying Torres and Baierโ€™s encoding with PDDL3 action-trajectory constraints [37], and hence by evaluating the temporal goal over the action-trajectory only. A detailed analysis of this option is left as future work. 6. Implementation and Applications We implemented a software prototype to solve the composition problem, and we tested it on industrial case studies taken from the literature. The code can be found at https://bit.ly/3XjJbEF. The tool. Our tool takes in input a list of services (in explicit representation) and a ltl๐‘“ goal specification and computes a PDDL specification (i.e. domain and problem files) of the corresponding FOND planning task, using the technique formalized in the previous section. First, we construct ๐’Ÿ๐’ž and ฮ“๐’ž in PDDL form. The PDDL domain file represents the nondeterministic behaviour of the services. One of the challenges we encountered was that some planning systems do not support the when expression with complex effect types, such as oneof; this prevented us from specifying the transitions as a list of when expressions, one for each possible starting state, each followed by a oneof expression that includes all the possible successors. To workaround this issue, given an action โŸจ๐‘Ž, ๐‘–โŸฉ of ๐’Ÿ๐’ž , we defined a PDDL operator โŸจ๐‘Ž, ๐‘–, ๐œŽ๐‘–๐‘— โŸฉ, one for each possible starting state ๐œŽ๐‘–๐‘— โˆˆ ฮฃ๐‘– of service ๐‘–; In this way, we can use the oneof effect without nesting it into a when expression. Then, to include the on-the-fly evaluation of the ltl๐‘“ goal specification ๐œ™, we rely on the [22]โ€™s translator (in the following, denoted with TB), implemented in SWI-Prolog, to include the on-the-fly evaluation of the ltl๐‘“ goal in the planning domain. The encoding of the goal formula in PDDL follows the syntax supported by the TB translator. The TB translator supports four modes: Simple, OSA, PG, and OSA+PG, where Simple is the โ€œnaiveโ€ translation (cfr. [22], Section 4) and OSA, OSA+PG are two optimizations called โ€œOrder for Synchronization Actionโ€ and โ€œPositive Goalsโ€ (cfr. ibid., Section 4.3). OSA+PG is the combination of OSA and PG. The final (:goal) section includes, in conjunction: (i) the goal specified by the TB encoding, and (ii) the formula that specifies the accepting configuration for all services. Regardin (ii), the PDDL formula for the accepting service configuration has the form (and (or (curstate_s1 ๐œŽ11 )(curstate_s1 ๐œŽ12 ) . . . ) . . . (or (curstate_sn ๐œŽ๐‘›1 ) (curstate_sn ๐œŽ๐‘›2 ) . . . )), for all services ๐’ฎ๐‘– with ๐‘– = 1 . . . ๐‘›, and ๐œŽ๐‘–๐‘— โˆˆ ๐น๐‘– . Intuitively, the formula captures the condition that for each service, it holds that in the current planning state each service is in either one of its final states. Note that we could have encoded the final acceptance condition by means of the formula โ—‡(๐œ‘ โˆง โˆ™true), where ๐œ‘ is a propositional formula, which is the formula that is accepting whenever, in the current state of the trace, ๐œ‘ is true. However, this would have burdened the TB translator with a larger ltl๐‘“ formula, ending up in enlarging the overhead of the encoding (i.e. more sync actions, more nfa states, etc.). Case Studies. To test our tool, we considered case studies inspired by the literature on service composition applied to the Smart Manufacturing and Digital Twins industry. Electric Motor (EM). We consider a simplified version of the electric motor assembly, proposed in the context of Digital Twins composition for Smart Manufacturing [38]. The main compo- nents of an electric motor are the stator, the rotor, and, in the case of alternate current motors with direct current power (e.g., in the case of electric cars), the inverter. These three compo- nents are built or retrieved in any order, but the final assembly step must have all the previous components available. Moreover, after the assembly step, it is required that at least one test between an electric test and a full static test must be performed. This goal is captured by the following ltl๐‘“ constraints: โ—‡(assembleMotor ) โˆง (ยฌassembleMotor ๐’ฐ buildStator ) โˆง (ยฌassembleMotor ๐’ฐ buildRotor ) โˆง (ยฌassembleMotor ๐’ฐ buildInverter ) โˆง โ—‡(staticTest โˆจ electricTest) โˆง (ยฌelectricTest ๐’ฐ assembleMotor ) โˆง (ยฌstaticTest ๐’ฐ assembleMotor ). The ๐’ฐ - clauses prevent the assembly step before all the components are available, while the reachability goal is specified by โ—‡(assembleMotor ) and โ—‡(staticTest โˆจ electricTest). We consider two types of services: Electric motor scenario Chip Production scenario (deterministic) Chip Production scenario (nondeterministic) Chip Production scenario (unsolvable) 1000 1000 1000 1000 Heuristic Heuristic Heuristic Heuristic hmax hmax hmax hmax hff hff hff hff 800 800 800 800 Encoding Encoding Encoding Encoding Simple Simple Simple Simple OSA OSA OSA OSA Planning Time (PT) Planning Time (PT) Planning Time (PT) Planning Time (PT) PG PG PG PG 600 600 600 600 OSA+PG OSA+PG OSA+PG OSA+PG 400 400 400 400 200 200 200 200 0 0 0 0 e0 e1 e2 e3 e4 e5 e6 c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 cn1 cn2 cn3 cn4 cn5 cn6 cn7 cn8 cn9 cn10 cn11 cn12 cu1 cu2 cu3 cu4 cu5 cu6 cu7 cu8 cu9 cu10 cu11 cu12 #breakable services length of formula length of formula length of formula (a) PT metric for the EM (b) PT metric for the CP sce- (c) PT metric for the CP sce- (d) PT metric for the CP sce- scenario. nario (det). nario (nondet). nario (unsolvable). infallible and breakable (Figure 2). The former has only one accepting state and supports one operation op; the latter, when executing the operation, can nondeterministically go into a โ€œbrokenโ€ state, from which a repair action is required to make it available again. In our experiments, we will have exactly one service for each process action, and scale on the number of breakable services. Chip Production (CP). Here we consider a smart factory scenario in which the goal is to pro- op op op duce chips [8]. In our simplified setting, the op op goal specification consists of a sequence of ๐‘ž0 ๐‘ž0 ๐‘ž1 ๐‘ž0 ๐‘ž1 operations to be performed: cleaning the sil- repair icon wafers, thin film deposition, resist coat- Figure 2: The infallible, breakable and irreparable ing, etc. We consider three variants of this sce- services templates, respectively. nario, one for each service type. In particular, in the first variant, all services are of type infallible; in the second variant, they are of type break- able; and in the third variant, the services are all of type irreparable, i.e. they are like the break- able services except that they cannot be repaired. The ltl๐‘“ goal specification is a sequential goal with the following actions: cleaning, filmDeposition, resistCoating, exposure, development, etching, impuritiesImplantation, activation, resistStripping, assembly, testing, and packaging. Hence, the formula has the following form: โ—‡(cleaning โˆง ยฌfilmDeposition โˆง ยท ยท ยท โˆง ยฌfilmDeposition โˆง . . . ). Note that at each step we negate the presence of all the other. Moreover, for all variants, we will have exactly one service for each process action. We use the number of actions as a scaling parameter. Evaluation. We evaluated the MyND Planner [39], combined with the โ„Žff and โ„Žmax heuristics, over the PDDL files produced by our tool and the 4 available encoding of the TB translator. The metrics we considered are: pre-processing time, i.e., translation and SAS computation (TT), planning time (PT), the number of nodes expanded during the search (EN), and the policy size (PS). As benchmarks, we considered: ๐‘’๐‘– , with ๐‘– = 0, . . . , 6, are instances of the electric motor scenario with ๐‘– builder services breakable and 6 โˆ’ ๐‘– infallible; ๐‘๐‘– are the instances on the chip production scenario, with ๐‘– = 1, . . . , 12 being the length of the sequence of operations, with all services infallible; ๐‘๐‘›๐‘– as ๐‘๐‘– but with all services breakable; and ๐‘๐‘ข๐‘– as ๐‘๐‘– but with all services irreparable. We set a timeout of 1000 seconds (โ‰ˆ 15 minutes). Due to lack of space we discuss only the PT metric. Other results can be found in the appendix. Platform. The experiments have been run on an Ubuntu 22.04 machine, endowed with 12th Gen Intel(R) Core(TM) i7-1260P, with 16 CPU threads (12 cores) and 64GB of RAM. The JVM version is 14 for compatibility with MyND. The maximum RAM for the JVM was 16GB. Results. The results of our evaluation on all benchmarks, both using โ„Žmax and โ„Žff , are shown in the plots for the PT metric for each scenario: Figure 3a, 3b, 3c, and 3d. For the EM scenario (Figure 3a), we observe that the PT of the OSA encoding is generally lower than the others, and in particular โ„Žmax is slightly better in PT than โ„Žff . The Simple encoding has comparable but slightly higher PT. The PG encoding has the PT higher than the PT in the Simple and OSA case, for all instances, sometimes higher by a factor of 4-5. The OSA+PG encoding was considerably worse than the other since the evaluation on many instances reached the timeout. In the deterministic CP scenario (Figure 3b), we have that the OSA encoding, with respect to the PT metric, is better than the others, with no strict dominance between โ„Žmax and โ„Žff . The other encodings, from better to worse, were OSA+PG, PG and Simple for the โ„Žff heuristic, and Simple, PG and OSA+PG for the โ„Žmax heuristic. In the nondeterministic CP scenario (Figure 3c), the performances were quite worse than the deterministic case for all encodings and heuristics; the executions from cn8 on timed out. We noticed a certain advantage of using โ„Žff with the PG encoding, and โ„Žmax with the OSA encoding. In the unsolvable CP scenario (Figure 3d), the OSA was considerably better than the other encodings, with comparable performances between โ„Žmax and โ„Žff . 7. Discussion and Conclusion In this paper, we have studied an advanced form of task-oriented compositions of nondeterministic services. Our goal is to synthesize an orchestrator that, using the available services, produces a trace that satisfies an ltl๐‘“ specification. To underline the importance of the ever-increasing use of service composition in smart manufacturing, we evaluate two valid case studies taken from the literature: one concerning the production of an electric motor and the other concerning the production of chips. The tool prototype we implemented shows the feasibility of our approach. It would be interesting to test other encodings for temporal goals, such as [16] and [34, 35]. This work is highly motivated by a renovated interest in service composition techniques with impactful applications in smart manufacturing [40, 41, 42]. In particular, the use of service composition has been advocated in an industrial scenario [10], where the composition of a target service (manufacturing goal) is managed by means of a community of Digital Twins (manufacturing actors) modelled as stochastic services. References [1] J. McGovern, S. Tyagi, M. Stevens, S. Mathew, Java web services architecture, 2003. [2] D. Berardi, D. Calvanese, G. De Giacomo, M. Lenzerini, M. Mecella, Automatic composition of e-services that export their behavior, in: ICSOC, 2003. [3] D. Berardi, D. Calvanese, G. De Giacomo, M. Mecella, Composition of services with nondetermin- istic observable behavior, in: ICSOC, 2005. [4] G. De Giacomo, M. Mecella, F. Patrizi, Automated service composition based on behaviors: The Roman model, in: Web services foundations, Springer, 2014. [5] A. Cimatti, M. Pistore, M. Roveri, P. Traverso, Weak, strong, and strong cyclic planning via symbolic model checking, Artif. Intell. (2003). [6] H. Geffner, B. Bonet, A Concise Introduction to Models and Methods for Automated Planning, M&C Publishers, 2013. [7] G. De Giacomo, F. Patrizi, S. Sardiรฑa, Automatic behavior composition synthesis, Artif. Intell. (2013). [8] F. Monti, L. Silo, F. Leotta, M. Mecella, On the suitability of AI for service-based adaptive supply chains in smart manufacturing, in: ICWS, 2023. [9] F. Monti, L. Silo, F. Leotta, M. Mecella, Services in smart manufacturing: Comparing automated reasoning techniques for composition and orchestration, in: SOC, 2023. [10] G. De Giacomo, M. Favorito, F. Leotta, M. Mecella, L. Silo, Digital twin composition in smart manufacturing via markov decision processes, Comput. Ind. (2023). [11] M. Pesic, H. Schonenberg, W. M. Van der Aalst, Declare: Full support for loosely-structured processes, in: EDOC, 2007. [12] C. Di Ciccio, M. Montali, Declarative Process Specifications: Reasoning, Discovery, Monitoring, Springer, 2022. [13] M. Dumas, F. Fournier, L. Limonad, A. Marrella, M. Montali, J. Rehse, R. Accorsi, D. Calvanese, G. De Giacomo, D. Fahland, A. Gal, M. L. Rosa, H. Vรถlzer, I. Weber, Ai-augmented business process management systems: A research manifesto, ACM Trans. Manag. Inf. Syst. (2023). [14] G. De Giacomo, M. Y. Vardi, Linear temporal logic and linear dynamic logic on finite traces, in: IJCAI, 2013. [15] G. De Giacomo, S. Rubin, Automata-theoretic foundations of FOND planning for ltl๐‘“ and ldl๐‘“ goals, in: IJCAI, 2018. [16] A. Camacho, M. Bienvenu, S. A. McIlraith, Towards a unified view of AI planning and reactive synthesis, in: ICAPS, 2019. [17] V. Fionda, G. Greco, LTL on finite and process traces: Complexity results and a practical reasoner, J. Artif. Intell. Res. 63 (2018) 557โ€“623. [18] E. Sirin, B. Parsia, D. Wu, J. A. Hendler, D. S. Nau, HTN planning for web service composition using SHOP2, J. Web Semant. (2004). [19] J. Alves, J. Marchi, R. Fileto, M. A. R. Dantas, Resilient composition of web services through nondeterministic planning, in: ISCC, 2016. [20] M. Pistore, A. Marconi, P. Bertoli, P. Traverso, Automated composition of web services by planning at the knowledge level, in: IJCAI, 2005. [21] G. De Giacomo, M. Favorito, L. Silo, Composition of stochastic services for ltlf goal specifications, in: FoIKS, 2024. [22] J. Torres, J. A. Baier, Polynomial-time reformulations of LTL temporally extended goals into final-state goals, in: IJCAI, 2015. [23] A. Marrella, M. Mecella, S. Sardiรฑa, Supporting adaptiveness of cyber-physical processes through action-based formalisms, AI Commun. (2018). [24] A. K. Chandra, D. Kozen, L. J. Stockmeyer, Alternation, J. ACM (1981). [25] M. Y. Vardi, An automata-theoretic approach to linear temporal logic, 1995. [26] G. Rรถger, F. Pommerening, M. Helmert, Optimal Planning in the Presence of Conditional Effects: Extending LM-Cut with Context Splitting, in: ECAI, 2014. [27] N. Yadav, S. Sardina, Decision theoretic behavior composition, in: AAMAS, 2011. [28] G. De Giacomo, M. Y. Vardi, Synthesis for LTL and LDL on finite traces, in: IJCAI, 2015. [29] F. Bacchus, F. Kabanza, Planning for temporally extended goals, Ann. Math. Artif. Int. (1998). [30] F. Bacchus, F. Kabanza, Using temporal logics to express search control knowledge for planning, Artif. Intell. (2000). [31] J. A. Baier, C. Fritz, M. Bienvenu, S. A. McIlraith, Beyond classical planning: Procedural control knowledge and preferences in state-of-the-art planners, in: AAAI, 2008. [32] J. A. Baier, S. A. McIlraith, Planning with first-order temporally extended goals using heuristic search, in: AAAI, 2006. [33] J. A. Baier, S. A. McIlraith, Planning with temporally extended goals using heuristic search, in: ICAPS, 2006. [34] L. Bonassi, G. De Giacomo, M. Favorito, F. Fuggitti, A. E. Gerevini, E. Scala, Planning for temporally extended goals in pure-past linear temporal logic, in: ICAPS, 2023. [35] L. Bonassi, G. De Giacomo, M. Favorito, F. Fuggitti, A. E. Gerevini, E. Scala, FOND planning for pure-past linear temporal logic goals, in: ECAI, 2023. [36] G. De Giacomo, A. Di Stasio, F. Fuggitti, S. Rubin, Pure-past linear temporal and dynamic logic on finite traces, in: IJCAI, 2020. [37] L. Bonassi, A. E. Gerevini, E. Scala, Planning with qualitative action-trajectory constraints in PDDL, in: IJCAI, 2022. [38] G. De Giacomo, M. Favorito, F. Leotta, M. Mecella, F. Monti, L. Silo, AIDA: A tool for resiliency in smart manufacturing, in: CAiSE Forum, 2023. [39] R. Mattmรผller, Informed progression search for fully observable nondeterministic planning, Ph.D. thesis, 2013. [40] T. Catarci, D. Firmani, F. Leotta, F. Mandreoli, M. Mecella, F. Sapio, A conceptual architecture and model for smart manufacturing relying on service-based digital twins, in: ICWS, 2019. [41] G. De Giacomo, P. Felli, B. Logan, F. Patrizi, S. Sardiรฑa, Situation calculus for controller synthesis in manufacturing systems with first-order state representation, Artif. Intell. (2022). [42] G. De Giacomo, M. Y. Vardi, P. Felli, N. Alechina, B. Logan, Synthesis of orchestrations of transducers for manufacturing, in: AAAI, 2018.