=Paper=
{{Paper
|id=Vol-1846/paper6
|storemode=property
|title=Complexity Aspects of Web Services Composition
|pdfUrl=https://ceur-ws.org/Vol-1846/paper6.pdf
|volume=Vol-1846
|authors=Karima Ennaoui,Lhouari Nourine,Farouk Toumani
|dblpUrl=https://dblp.org/rec/conf/apn/EnnaouiNT17
}}
==Complexity Aspects of Web Services Composition==
Complexity Aspects of Web Services Composition Karima Ennaoui, Lhouari Nourine, and Farouk Toumani LIMOS, CNRS, Universite Clermont Auvergne 1 Rue de la Chebarde, 63178 AUBIERE CEDEX Abstract. The web service composition problem can be stated as fol- lows: given a finite state machine M , representing a service business protocol, and a set of finite state machines R, representing the business protocols of existing services, the question is to check whether there is a simulation relation between M and the shuffle product closure of R. In fact the shuffle product is a subclass of the communication free petri net and basic parallel processes, for which the same problem of simulation is known to be 2-Exptime-hard. This paper studies the impact of several parameters on the complexity of this problem. We show that the problem is Exptime-complete if we bound either: (i) the number of instances of services in R that can be used in a composition, or(ii) the number of instances of services in R that can be used in parallel, or (iii) the number of the so-called hybrid states in the finite state machines of R by 2. 1 Introduction Web Services [2] is a new computing paradigm that tends to become a technology of choice to facilitate interoperation among autonomous and distributed appli- cations. The UDDI consortium defines Web services as self-contained, modular business applications that have open, Internet-oriented, standards-based inter- faces. Several models have been proposed in the literature to describe different facets of services. In particular, the importance of specifying external behaviour of services, also called service business protocols, has been highlighted in several research works [6,3,4]. Through literature, different models have been used to represent web service business protocols. The Finite States Machines (FSM) for- malism is widely adopted in this context to model statefull applications exposed as web services where states represent the different phases that a service may go through while transitions represent “abstract” activities that a service can perform [6,3,4]. We consider in this paper the problem of Web Service Composition (WSC). This problem arises from the situation where none of the existing services can provide a requested functionality. In this case, the idea is to find out, algorith- mically, if the target functionality could be composed out of the existing services (components repository). This automatic approach of composition simplifies the development of software by reusing existing components and offers capabilities to customize complex systems built on the fly [9]. We focus more particularly on 86 PNSE’17 – Petri Nets and Software Engineering a specific instance of WSC, namely the (business) protocol synthesis problem, which can be stated as follows: given a set of business protocols of available ser- vices and given a business protocol of a target service, is it possible to synthesize automatically a mediator that implements the target service using the existing ones? [16] shows that when business protocols are described by means of FSMs, the WSC problem can then be formalized as the problem of deciding whether there exists a simulation relation between the target protocol and the shuffle (or asynchronous product) of the available ones. This result is however based on the implicit assumption that at most one instance of each available service can be used in a composition. This setting has been extended in [9] to the case where the number of instances that can be used in a composition is unbounded. WSC is formalized in this latter case as a simulation problem between an FSM and an infinite state machine, called Product Closure State Machine (PCSM), that is able to compute the shuffle closure of an FSM. Shuffle product of FSMs (and PCSM) is a subclass of Basic Parallel Processes (BPP), the class of communication free petri nets: every transition has at most one input place. Simulation of FSM by BPP was proven Expspace-hard by Lasota [15] and 2-Exptime-hard in [8]. Complexity analysis of WSC was first considered by Musholl et al.[16], under the aforementioned implicit assumption, where it is shown Exptime-Complete. In case of unbounded instances, the WSC problem has been proved decidable with an Ackermanian function as upper bound in [9]. The proof of [9] is based on Dickson lemma, and hence cannot be exploited to derive tighter upper bounds. An Expspace-hard lower bound is given by Lasota[15]. The source of complexity derived from the analysis of the algorithm given in [9] is related to the presence of the so-called hybrid states (final states with outgoing transitions and correspond to unbounded places in Petri net terminology) in the components and loops in the target: if the target FSM is loop free, the WSC problem becomes NP-complete and when the components are hybrid state free the problem is proven Exptime. In this paper, we consider additional parameters related to bounded/unbounded web services composition. We consider as inputs an FSM M (the target protocol) and a set of FSMs R (the protocols of the available services) and we investigate the complexity of testing simulation between M and the shuffle closure of R, represented as a PCSM [9]. More precisely, we study the complexity of the following problems: 1. W SC(M, R): The problem of composing M using an unbounded number of instances of R. 2. BC(M, R, k): The problem of composing M using at most k instances of each FSM in R. 3. P BC(M, R, k): The problem of composing M using simultaneously at most k F SM instances in R (in parallel). 4. U CHS(M, R, k) : The problem of composing M using an unbounded number of instances of R, with the number of hybrid states in R is bounded by k ∈ {0, 1, 2}. Ennaoui et.al.: Complexity Aspects of Web Services Composition 87 Table in figure 1 displays known and new complexity results regarding the WSC problem. M Acyclic FSM general FSM BC(M, R, 1) NP-complete[9] Exptime-complete [16] BC(M, R, k) NP-complete[9] Exptime-complete P BC(M, R, 1) Polynomial Polynomial P BC(M, R, k) NP-complete Exptime-complete W SC(M, R) NP-complete [9] Decidable [9] U CHS(M, R, 0) NP-complete[9] Exptime-complete U CHS(M, R, 1) NP-complete[9] Exptime-complete U CHS(M, R, 2) NP-complete[9] Exptime-complete Fig. 1. Complexity results of WSC sub-problems Paper organisation Section 2 recalls some basic definitions needed in this paper. In section 3, we investigate the problem of bounded web services composition and proves that it is Exptime-Complete. Next, we define web service composition with fixed number of parallel instances, and show that it is Exptime-Complete in general and is NP-complete when M is loop free and polynomial for k = 1. In section 5, we consider the web service composition when the number of hybrid states is bounded. We show that this problem is Exptime-Complete for k = 0, k = 1 and k = 2. We conclude in section 6. 2 Preliminaries Finite State Machine We consider in this paper service business protocols for- mally described as FSMs. We recall below the definition of such machines. Definition 1. (Finite State Machine (FSM)) 0 A State Machine (SM) M is a tuple M = (ΣM , QM , FM , qM , δM ), where: ΣM is a finite alphabet, QM is a set of states, δM ⊆ QM × ΣM × QM is a set of 0 labelled transitions, FM ⊆ QM is a set of final states, and qM ∈ QM is the initial state. If QM is finite then M is called a Finite State Machine (FSM). Moreover, a state q ∈ QM is called: intermediate, if q ∈ / FM and ∃p1 , p2 ∈ QM , s.t (p1 , a, q) ∈ δM and (q, b, p2 ) ∈ δM , we denote by I(M ) the set of intermediate states of M ; hybrid, if q ∈ FM , q 6= q0 and there exist at least one transition (q, b, p) ∈ δM , with p ∈ QM and b ∈ Σ, the set of hybrid states is denoted H(M ) and terminal, if q ∈ FM and is not hybrid. We define the norm of a state q as the finite length of the shortest path from q to a final state. The norm of an FSM M , noted norm(M), is the maximal norm of its states. 88 PNSE’17 – Petri Nets and Software Engineering k-Iterated Product Machine (k-IPM) and Product State Machine (PCSM) We start by defining the shuffle (asynchronous product) and union operations on FSMs: Definition 2. (Asynchronous product and Union of two FSMs) 0 Let M = (ΣM , QM , FM , qM , δM ) and M 0 = (ΣM 0 , QM 0 , FM 0 , qM 0 0 , δM 0 ) be two FSMs. We have : – The shuffle or asynchronous product of M and M 0 , denoted M × M 0 , 0 0 is an FSM (ΣM ∪ ΣM 0 , QM × QM 0 , FM × FM 0 , (qM , qM 0 ), λ) where the tran- sition function λ is defined as follows: λ = {((q, q ), a, (q1 , q1 0 )) : ((q, a, q1 ) ∈ 0 δM and q 0 = q10 ) or ((q 0 , a, q1 0 ) ∈ δM 0 and q = q1 )}. – The union of M and M 0 , denoted M ∪ M 0 , is the FSM (ΣM ∪ ΣM 0 ∪ {}, 0 0 QM ∪ QM 0 ∪ {q0 }, FM ∪ FM 0 , q0 , δM ∪ δM 0 ∪ {(q0 , , qM ), (q0 , , qM 0 )}). For a set of available FSMs R = {M1 , ...Mi }, we consider a compact structure that abstracts all possible executions that can be produced using the components of R. First, we begin by the simple case where each Mj can be used only once: Definition 3. (Union of asynchronous products of FSMs set) Let R = {M1 ....Mm } be a FSMs repository. We define (R) the S union of asynchronous product of all the subsets of R as the FSM: (R) = {Mi ,...,Mi }∈2R (Mi1 × 1 j ... × Mij ) where j ∈ [0, i]. Second, we consider the case where the number of copies of each Mj ∈ R is bounded by an integer k: Definition 4. (k-iterated product of FSMs set R) The k-iterated product of R is defined by R⊗k = R⊗k−1 × (R) with R⊗1 = (R). Finally, we consider the general case where the number of instances of each Mj ∈ R is unbounded. This corresponds to the product closure of R [9]: Definition 5. (Product closureS+∞of FSMs set) The product closure of R, noted R⊗ , is defined as: R⊗ = i=0 R⊗i . The Product Closure Machine (PCSM) of R, defined in [9] and proven equivalent to R⊗ , is the SM with unbounded number of tokens stacked at the beginning in the initial states in R. Then, the instantaneous description of a PCSM gives the number of tokens (instances) at each state of its underlaying FSMs. This description is called a configuration of R⊗ . We omit from this de- scription the initial states (source:infinite number of tokens) and terminal states (sink:terminated instances). Example 1. Figure 2 illustrates the execution of the sequence ”abca” by the PCSM of the FSM M in figure 2-(a). M contains one intermediate state q1 and two hybrid states q2 and q5 . Therefore, figure 2-(b) depicts a part of the PCSM M ⊗ with triplets as configurations where integers witness respectively Ennaoui et.al.: Complexity Aspects of Web Services Composition 89 the number of tokens in q1 , q2 and q5 . For each configuration c in figure 2- (b), we associate an instant t (or several instants) during the execution when c describes the PCSM. At the beginning (t = 0), M ⊗ ’s instantaneous description is (0, 0, 0), interpreting an empty stack in every state of M , except the initial state with an infinite number of tokens (figure 2-(c)). To execute the transition (q0 , a, q1 ), a token is moved from q0 to q1 in figure 2-(d), corresponding to the configuration (1, 0, 0) in instant t = 1. In t = 2, the executed transition (q0 , b, q4 ) corresponds to moving a token from the initial state to a terminal one q4 (figure 2-(e)). Since the instantaneous description does not consider neither initial states nor terminal ones, then the configuration stays the same as the previous instant. Notice that this move corresponds to both creating and terminating an instance of the FSM. Then, the transition (q0 , c, q5 ) is executed by moving a token from q0 to the hybrid state q5 . This creates a new instance implying, in this case, an increase in the number of simultaneously used instances in the execution. This is depicted in figure 2-(f). Finally, a token is moved from the state q1 to q2 in figure 2-(g), in order to execute the transition (q1 , a, q2 ). It changes M ⊗ ’s instantaneous description in t = 4 into (0, 1, 1) which is a final configuration (i.e (0, 1, 1) ∈ FC ) since all tokens in the PCSM are in final states (either hybrid or terminal). Formally, we define the PCSM R as the SM (ΣR , CR⊗ , FC , c0 , ΦR⊗ ), where: S 1. ΣR = Mj ∈R ΣMj ; 2. CR⊗ is the set of states (also called configurations of R⊗ ). CR⊗ ⊂ Nn , with: n = nI (R) + nS H (R) with: nI (R) is the number of intermediate states in R) (nI (R) = | Mj ∈R I(Mj )|) and nH (R) is the number of hybrid states in R) (nH (R) = | Mj ∈R H(Mj )|). For each configuration c, c[m] (the mth S component of c) is called a witness of a unique state qm ∈ QMj for some j. Note that: – qm is an intermediate state, if 1 ≤ m ≤ nI (R); – qm is an hybrid state, if nI (R) + 1 ≤ m ≤ n. In an abuse of notation, we use c[m] and c[qm ] interchangeably. 3. FC is the set of final states. FC = {c ∈ CR⊗ |c[m] = 0, for each: 1 ≤ m ≤ nI (R)}; 4. c0 = {0}n is the initial state of R⊗ ; 5. ΦR⊗ ⊆ CR⊗ × ΣR × CR⊗ is the set of transitions. we have (c1 , a, c2 ) ∈ ΦR⊗ iff : – there exists (q0 , a, q) ∈ QMj , such that: q0 is the initial state of Mj and c2 [q] = c1 [q] + 1 and c2 [p0 ] = c1 [p0 ] for each p0 6= q. – there exists (p, a, q) ∈ QMj , such that: c2 [p] = c1 [p] − 1, c2 [q] = c1 [q] + 1 and c2 [p0 ] = c1 [p0 ] for each p0 6= p, q. – there exists (p, a, q) ∈ QMj , such that: q is a final state or the initial state, c2 [p] = c1 [p] − 1 and c2 [p0 ] = c1 [p0 ] for each p0 6= p. Simulation preorder We recall below the definition of the simulation preorder between two SMs. 90 PNSE’17 – Petri Nets and Software Engineering q1 q2 q5 M c0 = (0, 0, 0) t = 0 q0 a c a (q0 , a, q1 ) b b c1 = (1, 0, 0) t = 1, t = 2 q1 a q5 q4 c (q0 , c, q5 ) c a c2 = (1, 0, 1) t = 3 q3 q2 a (q1 , a, q2 ) a c0 = (0, 1, 1) t = 4 (a) An FSM M (b) A part of the PCSM M ⊗ ∞ ∞ ∞ q0 q0 q0 a a a c c c b b b a a q1 a q5 q1 q5 q1 q5 1 q4 q4 1 q4 c a c a c a q2 q2 q3 a q2 1 q3 a q3 a (c) PCSM M ⊗ at the instant t = 0 (d) PCSM M ⊗ at the instant t = 1 (e) PCSM M ⊗ at the instant t = 2 ∞ ∞ q0 q0 a c a c b b q1 a q5 q1 a q5 1 q4 q4 c a c a 1 1 q3 a q2 1 q3 a q2 1 1 (f ) PCSM M ⊗ at the instant t = 3 (g) PCSM M ⊗ at the instant t = 4 Fig. 2. An example of execution of a sequence using a PCSM. Definition 6. (Simulation) 0 0 Let M = (ΣM , QM , FM , qM , δM ) and N = (ΣN , QN , FN , qN , δN ) be two SMs. A state p ∈ QM is simulated by a state q ∈ QN , denoted p (M,N ) q (p q when M and N are understood from context), iff the following two conditions hold: 1. ∀a ∈ ΣM and ∀p0 ∈ QM such that (p, a, p0 ) ∈ δM , there exists (q, a, q 0 ) ∈ δN such that p0 q 0 , and 2. if p ∈ FM , then q ∈ FN . Ennaoui et.al.: Complexity Aspects of Web Services Composition 91 M is simulated by N , denoted M N , iff the initial state of N simulates the initial sate of M. Example 2. Figure 3-(c) is an example of a simulation tree, verifying if the initial state s0 of the FSM A (figure 3-(a)) is simulated by the initial configuration c0 = (0, 0, 0) of the PCSM of M (figure 3-(b)). A branch is terminated with success when a terminal state of A is reached and paired with a final configuration (all intermidiate witnesses are null), or when a configuration of M ⊗ that covers one of its predecessors is reached and paired with the same state of A. In this case, the simulation tree proves that A M ⊗ . A s0 , c0 = (0, 0, 0) s0 a a b s1 , c1 = (1, 0, 0) s1 s2 c c b a c s3 , c0 = (0, 0, 0) s3 , c2 = (1, 0, 1) s2 , c1 = (1, 0, 0) Success Fail s3 a s1 , c5 = (0, 1, 0) a s1 , c3 = (2, 0, 0) (a)- An FSM A c c c M s3 , c6 = (0, 1, 1) s3 , c1 = (1, 0, 0) s3 , c4 = (2, 0, 1) Success b Fail Fail q0 a c s2 , c5 = (0, 1, 0) b q1 a q5 a q4 c a s1 , c7 = (1, 1, 0) c1 /c7 Success q3 q2 a (b)- An FSM M (c)- Simulation tree between A and M ⊗ Fig. 3. An example of a simulation tree. Interestingly, the simulation verification defined above can be seen as a two players game in a directed graph (Va , Vd , δ, v0 ), such that V = Va ∪Vd is the set of vertices with Va ⊆ QM ×QN and Vd ⊆ QM ×QN ×ΣM , δ ⊆ (Va ×Vd )∪(Vd ×Va ) is the edges set verifying: -for (q, p) ∈ Va and (q, a, q 0 ) ∈ δM , we have ((q, p), (q 0 , p, a)) ∈ δ; and -for (q, p, a) ∈ Vd and (p, a, p0 ) ∈ δN , we have ((q, p, a), (q, p0 )) ∈ δ. The game is played by an attacker and a defender. It starts by putting a 0 0 token in v0 = (qM , qN ) ∈ Va , then the players move it along the edges of the graph. If the token is on a vertex v ∈ Va then the attacker moves it, otherwise it is the defender’s turn. A strategy of a player x ∈ {a, d} is a function S : V ∗ .Vx 7→ V , where V ∗ .Vx denotes all sequences of vertices in V that end with a vertex in Vx and 92 PNSE’17 – Petri Nets and Software Engineering S(v0 , ..., vk ) = vk+1 implies that (vk , vk+1 ) ∈ δ. In each different play, a player x adapts a strategy that decides his moves. The defender wins every infinite play. Otherwise, the first player who can not move loses. M is simulated by N iff the defender has a wining strategy regardless of his opponent’s strategy. Observe that, by definition, each transition of a PCSM can at most increase or decrease a configuration component by 1. In addition, if a configuration is final then all intermediate states witnesses are equal to 0. Therefore, given a set of FSMs R and c ∈ CR⊗ , we have Σq∈SM ∈R I(Mi ) c[q] ≤ norm(c). Moreover, i since final states can only be simulated by final ones, then for M an FSM and p ∈ QM , if p c then norm(c) ≤ norm(p). Hence, we are able to derive the following property. Property 1. (Intermediate witnesses bound) [9] For c ∈ CR⊗ and p ∈ QM , M if p c then Σq∈SM ∈R I(Mi ) c[q] ≤ norm(p). We denote CR ⊗ = {c ∈ i CR⊗ |Σq∈ M ∈R I(Mi ) c[q] ≤ norm(M )}. S i In [9], the WSC problem in the unbounded case is reduced to simulation test between an FSM and a PCSM and this later problem is proved to be decidable. The proof of the termination of the algorithm given in [9] is based on the following property: Property 2. (configuration cover) [9] Let c and c0 be two configurations of R⊗ , such that: c[m] = c0 [m], m ∈ [1, nI (R)] and c[m] ≤ c0 [m], m ∈ [nI (R)+1, n]. if q c, where q is a state of a SM M, then q c0 . We say that c0 covers c, denoted c / c0 . We introduce below the algorithm of [9], focusing the presentation on the structure of its execution tree. Definition 7. (Simulation Tree) We call a simulation tree Tsim (M, R⊗ ) a tree (V, v0 , E) where: 0 – v0 = (qM , c0 ) is the root of the tree; M – V ⊂ QM × CR ⊗ is the set of nodes; – If (q, c) ∈ V and q is final in M then so is c in R⊗ ; – E ⊂ V × V is the set of the tree’s edges. ∀e = ((p, c), (q, d)) ∈ E : ∃a ∈ ΣM s.t (p, a, q) ∈ δM and (c, a, d) ∈ ΦR⊗ ; – v = (p, c) ∈ V is a leaf in Tsim (M, R⊗ ) iff p is terminal in A or there exists an ancestor (p, c0 ) ∈ V of v in Tsim (M, R⊗ ) such that c/c0 . In the next section, we shall bound the size of this tree in the case of bounded WSC problem (i.e., when the instances of services allowed to be used in the simulation is bounded by a parameter k). Ennaoui et.al.: Complexity Aspects of Web Services Composition 93 3 Bounded Composition We call a bounded WSC problem, a service composition problem where the number of copies of each web service in the repository R used to compose the target M is bounded a priori by an integer k. This problem is formally stated as follows. Problem 1. Bounded Composition BC(M, R, k) Input : R a set ofSFSMs; M a target FSM; k an integer. k Question : M i=0 R⊗i ? The particular case BC(M, R, 1) has been investigated by Muscholl and Walukiewicz [16] where it is shown to be Exptime-Complete. We shall prove in this section that BC(M, R, k) is also Exptime-Complete. We point out that the straightforward reduction of BC(M, R, k) to BC(M, R, 1), obtained by du- plicating k times each service of R, is not polynomial in the input size, since k may be large, and hence cannot be used to achieve our goal. The parameter k drops the infinite aspect and reduces the search space. In this case, a loop in M can only be simulated by loops S in R. For example, k one can observe that, in figure 4, St is not simulated by i=0 {R1 , R3 }⊗i for every k ∈ N. This is because when we repeat the loop in St (k + 1) times, Sk there is no corresponding execution in i=0 {R1 , R3 }⊗i . However, we have St Sk ⊗i i=0 {R1 , R3 } , for any k ≥ 1. St R1 R2 R3 a a a b a a b c b c Fig. 4. A yes instance of BC(M, R, k) with k = 1. In the following, Sk we give an upper bound of the number of states that might appear in i=0 R⊗i , with k ∈ N. Lemma Sk 1. Let R be a set of FSM and k is an integer. The number of states in ⊗i i=0 R is bounded by O(2nlogk ) where n = nI (R) + nH (R). 94 PNSE’17 – Petri Nets and Software Engineering Sk S+∞ Proof. Notice that R⊗ = ( i=0 R⊗i ) ∪ ( i=k+1 R⊗i ). Sk In fact, the states in i=0 R⊗i correspond to the PCSM’s configurations subset {c ∈ CR⊗k | 0 ≤ c[i] ≤ k, i ∈ [1, n]}. Hence, the number of states of Sk ⊗i i=0 R is bounded by (k + 1) × . . . × (k + 1) = 2nlog(k+1) . t u This lemma reduces the search space to an exponential size and leads to the following theorem. Theorem 1. BC(M, R, k) is Exptime-Complete Proof. Exptime. To show that BC(M, R, k) is Exptime, we bound the size of the simulation tree. A node of the simulation tree corresponds to (q, c) where q is a state of M and c a configuration of R⊗k . According to Lemma 1, the number of PCSM’s configurations is bounded by k n . So the number of nodes in the simulation tree is at most |QM | × k n = 2nlog(k)+log(|QM |) and therefore the complexity is in Exptime. Exptime-Hardness. It can be deducted directly from the Exptime-Hardness of the particular case BC(M, R, 1) [16]. t u Instead of the total number of instances used in the simulation, what happens if we bound only the number of instances used simultaneously? we raise this question in the next section and prove that the problem stays Exptime-complete. 4 Bounded parallel instances Now we consider a new parameter in service web composition that bounds the number of communications in parallel between the target and the services, i.e. the number of live services executions is bounded, but the number of instances is not. It appears that the web services composition with unbounded instances and bounded parallel instances is Exptime-Complete. To do so, we limit the configurations of the PCSM R⊗ to configurations where the number of waiting instances is bounded Pn by k. Indeed, when we need to use a new instance in ΦR⊗ , we check if i=1 c[i] ≥ k. If so, we decrease c[j] for some j ∈ [nI (R) + 1, nH (R)], i.e. we finish an instance that is waiting in an hybrid state. Let us denote by R⊗k,p the obtained PCSM. Problem 2. Bounded Parallel Instances Composition (P BC(M, R, k)) Input : R a set of FSMs; M a target FSM. k an integer, bounding the number of parallel instances of R’s components used simultaneously in the simulation. Question : M R⊗k,p ? Note that P BC(M, R, k) can use an unbounded number of instances but only k instances in parallel. Theorem 2. P BC(M, R, k) is Exptime-complete. Ennaoui et.al.: Complexity Aspects of Web Services Composition 95 Proof. First we show that P BC(M, R, k) is Exptime. Clearly the entry of any configuration is bounded by k (hybrid states are included) and therefore we can check simulation in Exptime, since the depth of the simulation tree is bounded by k n (see Lemma 1). To show the Exptime-hardness, it suffices to note that the unbounded composition without hybrid states U CHS(M, R, 0) is a particular case of P BC(M, R, k), since we prove later in theorem 4 that U CHS(M, R, 0) is Exptime-hard. In fact, the number of tokens in intermediate states of R is bounded by norm(M ) (property 1). Hence, when R is hybrid state free, the number of instances that can be used in the simulation is bounded by norm(M ). In other words, it corresponds to P BC(M, R, norm(M )). t u For k a constant, we obtain the following. Corollary 1. P BC(M, R, k) is polynomial when k is a constant. Proof. First of all, let us consider Pn for every configuration c of R⊗k,p , a new component c[n + 1] = k − ( i=1 c[i]), with n = nI (R) + nH (R). For every configuration c in R⊗k,p , the non-empty witnesses {c[i] > 0, ≤ i ≤ n + 1} correspond to a partition of k elements (instances) into a sequence of j non empty subsets, for j = |{c[i] > 0, 1 ≤ i ≤ n}| ≤ k. Note that j is in fact inferior to min(k, n), but since k is a constant then it is more interesting to keep it as a lower bound of j. For every j ≤ k, the number of labeled partitions of k elements into a se- quence of j non empty subsets is j! × {kj }, where {kj } is a Stirling number of the second kind [1]. Hence, the number of configurations in R⊗k,p that have j non-empty witnesses is bounded by Cnj ×j!×{kj }. Notice that Cnj = e n...×(n−j+1) j! is in the order of O(nj ). PkWe conclude that the number of configurations in R⊗k,p is bounded by j k k C j=1 n × j! × { j } ∈ O(n ). Finally, by applying the simulation algorithm in [10], P BC(M, R, k) can be decided in O(mv .me ), where mv = |QM | + |CR⊗ | and me ≤ |QM |2 + |CR⊗ |2 are respectively the number of edges and transitions in M and R⊗k,p . t u In the following, we show that P BC(M, R, k) is NP-Complete for loop-free target FSM. Let µ a sequence of letters (a word) over Σ and M the FSM that recognizes exactly µ. We call µ⊗ the language recognized by M ⊗ . We consider the following NP-complete Problem [13]. Problem 3. SHUFFLE PRODUCT Input : µ and µ0 two words over an alphabet Σ; Question : µ ∈ µ0⊗ ? Theorem 3. P BC(M, R, k) is NP-complete whenever M is loop-free. Proof. Clearly P BC(M, R, k) is in NP since the simulation relation is polyno- mial in the size of M . To show the NP-hardness, we reduce SHUFFLE PROD- UCT to it. Let µ and µ0 be an instance of SHUFFLE PRODUCT. We associate 96 PNSE’17 – Petri Nets and Software Engineering an FSM M which recognizes exactly µ and R = {N } where N is the FSM that recognizes exactly µ0 . Since M is a chain, then the size of a branch of the simu- lation tree can not surpass |µ|. Thus,the simulation verification will only explore |µ| R⊗k,p ’s executions where the size is bounded by |µ| ≤ k.|µ0 | with k = d |µ 0 | e and therefore the number of instances is bounded by k. Hence, µ ∈ µ0⊗ iff M R⊗k,p iff M R⊗ . We give an example in figure 5. t u M a b a c b c µ = {abacbc} a b c |µ| N µ0 = {abc} k = d |µ 0| e = 2 M N ⊗2,p Fig. 5. An example of SHUFFLE PRODUCT problem. Another factor of complexity of the WSC problem is the number of hybrid states in the available services. We investigate next the effect of this parameter on the complexity of the WSC problem. 5 Bounded number of hybrid states The presence of hybrid states is a source of complexity in a WSC problem. As mentioned before, the size of intermediate states witnesses in configurations of R⊗ used to simulate M is bounded by norm(M ). We are however unable to provide a similar bound for the number of hybrid states witnesses. Figure 6 is an example of simulation between an FSM M and a PCSM R⊗ . The FSMs in R contain two hybrid states (state 1 and 2) and no intermediate state. Hence, a configuration of R⊗ is a pair of integers witnessing the number of tokens in state 1 and state 2. The example illustrates the different roles that an hybrid state of R can play to simulate a state of M . Indeed an hybrid state of R, can be used as: (i) a terminal state, e.g., when testing whether q5 (1, 1), we can consider the second hybrid state of R as a terminal state and terminate the test, or (ii)an intermediate state, e.g., when testing whether q2 (1, 1), the second hybrid state of R here plays the role of intermediate state, or both a terminal and an intermediate state, e.g., when testing whether q1 (1, 0), a transition of ΦR⊗ labeled by (b, (−1, 0)) only appears in one branch in the simulation tree Tsim (M, R⊗ ). Hence, the first hybrid state of R⊗ is considered intermediate in one branch and terminal in the other, or a hybrid state, e.g., when it is used to simulate an hybrid state of H(M ). We consider in the following the problem defined below. Ennaoui et.al.: Complexity Aspects of Web Services Composition 97 M R Tsim(M, R⊗) q0 q0 , (0, 0) a, (1, 0) a a c q1 , (1, 0) q1 c, (0, 1) c 1 2 q2 , (1, 1) d, (0, −1) b, (−1, 0) d q2 b b d q4 , (1, 0) q3 , (0, 1) c, (0, 1) d, (0, −1) q4 q3 q5 , (1, 1) q5 , (0, 0) c q5 d (a) (b) Fig. 6. Example of the simulation tree Problem 4. Unbounded Composition With limited number of Hybrid States U CHS(M, R, k) Input : k an integer; R a set of FSMs, containing at most k hybrid states; M a target FSM. Question : M R⊗ ? It is worth noting that U CHS(M, R, k + 1) is harder then U CHS(M, R, k). In the sequel, we progressively investigate the complexity of U CHS(M, R, k) problem for k = 0, then for k = 1 and finally for k = 2. 5.1 Case of composition without hybrid states (i.e. k = 0) In this section, we are interested in the problem U CHS(M, R, 0). We first give a polynomial transformation, denoted K, which is used to reduce BC(M, R, 1) to U CHS(N, R0 , 0). This transformation provides a mean to bound the number of instances used to prove simulation. Definition 8. Transformation K. For an FSM M = (ΣM , QM , FM , q0M , δM ) and a set of FSMs R = {M1 , ..., Mm }, we define K(M, R)=(N, R0 ={N1 , .., Nm }) where: 1. Each Ni is built based on Mi , by adding a letter ti to its alphabet, a final state fi and a transition set {(q0Mi , ti , fi )} ∪ {(q, ti , fi )|q ∈ FMi }. All final states of Mi become intermediate in Ni . 2. N is defined as: – ΣN = ΣM ∪ {ti |1 ≤ i ≤ m}; – QN = QM ∪ {ri |1 ≤ i ≤ m}; – FN = {rm }; – δN = δM ∪ {(q, t1 , r1 )|q ∈ FM } ∪ {(ri , ti+1 , ri+1 )|1 ≤ i < m}. 98 PNSE’17 – Petri Nets and Software Engineering Figure 7 illustrates an example of this transformation. We prove later in proposition 2 that K defines a polynomial reduction of BC(M, R, 1) to U CHS(N, R0 , 0). In fact, the intuition behind this reduction is based on two points: – By adding the sequence of letters t1 , ...., tm at the end of every execution accepted by N and adding ti at the end of every execution accepted by Ni ∈ R0 , we ensure that even in an unbounded instances simulation, we can not use more than one instance of every Ni in order to simulate N . – The construction of R0 verifies that every hybrid state in Mi ∈ R becomes intermediate in Ni , while keeping its dual role: either terminate the execution by adding the letter ti to the execution of Ni and reaching the terminal state fi , or keep the execution in the same way as Mi . M R N R0 s0 p0 q0 s0 p0 q0 t1 a a b a a t2 b d s1 p1 d q1 s1 p1 q1 t2 b c b c c b b c a a s2 s3 p2 p3 s2 s3 p2 p3 q2 t1 t1 q2 b b t1 t2 s4 s4 r1 f1 f2 t1 t2 d d t1 r2 s5 s5 (a) (b) Fig. 7. An example of transformation K The following propositions show that the transformation K preserves the simulation preorder. Proposition 1. Let M be an FSM, R = {M1 , ..., Mm } be a set of FSMs and K(M, R) = (N, R0 = {N1 , .., Nm }). For p and q two states of respectively M and R⊗1 , we have: p (M,(R)⊗1 ) q iff p (N,(R0 )⊗1 ) q. Proof. By construction of K(M, R), if p (M,R⊗1 ) q and p is terminal in M then p (N,(R0 )⊗1 ) q. We suppose next that: If (p, a, p0 ) ∈ δM , (q, a, q 0 ) ∈ δR⊗1 and p0 (M,R⊗1 ) q 0 , then p0 (N,(R0 )⊗1 ) 0 q. and prove that p (N,(R0 )⊗1 ) q. For each (p, a, p0 ) ∈ δN , we have: Ennaoui et.al.: Complexity Aspects of Web Services Composition 99 – if a ∈ ΣM , then there exists (q, a, q 0 ) ∈ δR⊗1 ⊆ δ(R0 )⊗1 such that p0 (N,(R0 )⊗1 ) q 0 . – else a = t1 , p0 = r1 and q is a product of final states of R. therefore, there exists (q, t1 , q 0 ) ∈ δ(R0 )⊗1 such that q 0 = (f1 , q 0 i1 , ..., q 0 il ) where q 0 ij is final in R such that p0 (N,(R0 )⊗1 ) q 0 . We conclude that if p (M,R⊗1 ) q then p (N,(R0 )⊗1 ) q. Reciprocally, we have (p, a, p0 ) ∈ δN (respectively δ(R0 )⊗1 ) and a ∈ / {ti |1 ≤ i ≤ m} iff (p, a, p0 ) ∈ δM (respectively δR⊗1 ). In addition, the definition of K ensures that if p is final in M and p (N,(R0 )⊗1 ) q then q is final in R⊗1 . Hence if p (N,(R0 )⊗1 ) q then p (M,R⊗1 ) q. t u In particular, we take p as the initial state of M and q the initial state of R⊗1 . This implies that: Proposition 2. Let M be an FSM, R = {M1 , ..., Mm } be a set of FSMs and K(M, R) = (N, R0 = {N1 , .., Nm }). We have: M R⊗1 iff N (R0 )⊗ . Proof. We have N (R0 )⊗1 iff N (R0 )⊗ . Indeed, each path that starts from the initial state to a final one in N contains exactly one transition labelled by ti , for each i ∈ [1, m] and a similar path in each Ni contains exactly one transition labelled by ti . t u Hence, K is a polynomial reduction of BC(M, R, 1) problem to the UCHS prob- lem. This enables to derive the following result. Theorem 4. U CHS(M, R, 0) problem is Exptime-complete. Proof. According to proposition 2, the K transformation reduces BC(M, R, 1) to U CHS(M, R, 0) in polynomial time. Thus U CHS(M, R, 0) is Exptime-hard. Since it is also proven Exptime in [9], then U CHS(M, R, 0) is Exptime-complete. t u 5.2 Case of composition with one hybrid state We consider the problem U CHS(M, R, 1) where M is an FSM and R a set of FSMs containing at most one hybrid state (nH (R) ≤ 1). We denote k0 = |QM |.2nI (R).log(norm(M )) . Two nodes (q, c) and (q 0 , c0 ) in a simulation tree are called comparable if q = q 0 and either c/c0 or c0 /c. The nodes (q, c) and (q 0 , c0 ) are said incomparable otherwise. Property 3. Let R be a set of FSMs containing at most one hybrid state. Two configurations of R⊗ are comparable by the cover relation, iff they have exactly the same intermediate witnesses. Proof. Acoording to property 2, for c, c0 two configurations in R⊗ we have c / c0 iff : 1. c and c0 have the same intermediate witnesses; and 100 PNSE’17 – Petri Nets and Software Engineering 2. for every hybrid witness c[h], we have: c[h] ≤ c0 [h]. In the current case, we consider that R has at most one hybrid witness. Hence, for any pair of configurations of R⊗ , condition 2 is verified. We conclude that for every two configurations c, c0 in R⊗ , c / c0 iff c and c0 have the same intermediate witnesses. t u Property 4. Let S be a set of nodes of Tsim (M, R⊗ ) that are pairwise incompa- rable, then |S| ≤ k0 . Proof. In configurations considered in Tsim (M, R⊗ ), intermediate witnesses are bounded by norm(M ) (property 1). Therefore and according to property 3, the number of incomparable configurations considered in Tsim (M, R⊗ ) is at most 2nI (R).log(norm(M )) . Since S ⊂ QM × CR⊗ , then |S| ≤ k0 . t u Proposition 3. If nH (R) ≤ 1, then foreach (q, c) ∈ Tsim (M, R⊗ ), c[h = (nI (R) + 1)] ≤ k0 2 . Proof. let P be a path in Tsim (M, R⊗ ), Int be an interval in N and S = (vn = (qn , cn ))n∈Int be a sequence of nodes in P such that: -vi is the ith node met in P that is comparable to one of its predecessors v = (qi , c); and -For each i, j ∈ Int, vi and vj are incomparable. If Int = ∅, then all nodes of P are not comparable. The size of P is then bounded by k0 , therefore, c[nI (R) + 1] ≤ k0 for each (q,c) in P. We suppose next that Int 6= ∅ and take Int = [1, k], k ∈ N. We prove recursively that for each l ∈ [1, k], cl [h] ≤ l.k0 . For l = 1, we have c1 [nI (h] ≤ k0 . For 1 < l < k, we suppose that cl [h] ≤ l.k0 . Each node v = (q, c) between vl and vl+1 in P is either: 1. comparable to a node vi with i ∈ [1, l]. In this case, c[h] < ci [h] ≤ l.k0 (otherwise v should be a leaf). 2. incomparable to all its predecessors. The number of such nodes is bounded by k0 . And since transitions displacements is in {−1, 0, 1}h , then we have c[h] < l.k0 + k0 . Therefore cl+1 [h] ≤ (l + 1).K0 . Once we reach vk , each one of its possible successors v = (q, c) is comparable to a node vi with c[h] < ci [h], except for the last one that is the leaf of P. Finally, since k < k0 (because S is a sequence of incomparable nodes), we conclude that each node of P is in QA × ([1, norm(A)]I × [1, k0 2 ]). t u Ennaoui et.al.: Complexity Aspects of Web Services Composition 101 Since deciding simulation only requires to visit a node once, we argue next that this problem is in APspace (i.e a problem that can be solved by an alter- nating Turing machine in polynomial space): the size of a position of the simu- lation tree is polynomial in the input size (Proposition 3). Hence a polynomial space alternating turing machine can solve this simulation problem: universal states correspond to the target’s and existential states correspond to the shuffle product’s configurations. Note that APspace corresponds to Exponential time complexity. Given the above, we conclude that: Lemma 2. U CHS(M, R, 1) is in Exptime. To prove the Exptime-hardness of the problem, we recall that U CHS(M, R, 0) is Exptime-hard (theorem 4) and that U CHS(M, R, 1) is harder than U CHS(M, R, 0). Theorem 5. U CHS(M, R, 1) is Exptime-complete. 5.3 Case of composition with two hybrid states In this section, we consider the problem of unbounded composition of web ser- vices with at most 2 hybrid states in R, i.e. U CHS(M, R, 2). Our approach is based on relating this simulation problem to the reachability issue. For x ∈ Ni1 and y ∈ Ni2 , we denote the concatenation of two vectors x and y, (x.y) ∈ Ni1 +i2 such that: x[j] if j ∈ [1, i1 ] (x.y)[j] = y[j] if j ∈ [i1 + 1, i1 + i2 ] We define the 2-dimension Vector Addition System with States (VASS) [11] VM,R as follows: Pi=n (R) – States: S ⊆ QM × {c ∈ NnI (R) | i=1 I c[i] ≤ norm(M )}. 2 – Transitions: W ⊆ S × S × {−1, 0, 1} such that: ((q, c), (p, d), x) ∈ W iff there exists a ∈ ΣM , y ∈ N2 such that (p, a, q) ∈ Pi=nI (R) QM and ((c, y), a, (d, y + x)) ∈ ΦR⊗ and i=1 c[i] ≤ norm(M ) and Pi=nI (R) i=1 d[i] ≤ norm(M ). – Initial configuration: the system starts with the state (q0M , {0}nI (R) ) and the vector (0, 0). Figure 8 depicts an example of a VASS associated to an FSM M and a set of FSMs R. 102 PNSE’17 – Petri Nets and Software Engineering M R VM,R q0 , 0 (1, 0) q1 , 0 q0 (0, 0) (0, −1) (−1, 0) a a a (0, 1) q2 , 0 (1, 0) (1, 0) q1 , 0 q3 , 0 q4 , 0 q1 2 1 (0, 0) c b (−1, 0) (0, −1) (0, −1) c b a (0, 0) q3 q2 q3 , 1 a q4 , 1 (1, 0) 3 (0, 0) (0, −1) a q4 , 2 q4 (a) (b) Fig. 8. An example of a VASS associated to an FSM M and an FSMs set. The reachability issue in 2-dimension VASSs has been investigated by Hopcroft and Pansiot [11] in the general case where displacements are in N2 . [11] gives an algorithm to prove the semi-linearity of the reachability set of such systems. The algorithm builds a tree Treach labeled by 3-tuples (p, c, Ac ) where p is the current state, c ∈ N2 is a vector reached in the system and Ac ⊂ N2 . (p, c, Ac ) denotes that every vector in the linear set {c + α1 a1 + ... + αn an |i ∈ [1, n], ai ∈ Ac and αi ∈ N} can be reached in state p from the initial configura- tion. We consider in the following a simulation tree Tsim (M, R⊗ )=(V, v0 , E), a recheability tree Treach (VM,R )=(V 0 , v00 , E 0 ) and a function π defined as follows: π: V0 → V ((p, c), x, Ax ) 7→ (p, (c.x)) The following proposition enables to establish a connection between paths in a simulation tree and a corresponding recheability tree. Proposition 4. Let µ = v0 ...vt be a path in Tsim (M, R⊗ ). Then there exists a path µ0 = v00 ...vt0 in Treach (VM,R ) such that vi = π(vi0 ), i ∈ [0, t]. Proof. We proof by induction on the length i of the path µ = v0 ...vt . For i = 0 we have v0 = (q0M , c0 ) = π((q0M , {0}nI (R) ), x0 , Ac0 ) = π(v00 ). Now suppose that the property is true for i < t and v0 ...vi+1 is a path in Tsim (M, R⊗ ). Then by hypothesis there exists a path v00 ...vi0 in Treach (VM,R ), such that vj = π(vj0 ), j ∈ [0, i]. Suppose that vi0 is a leaf in Treach (VM,R ). Then according to the algorithm of Hopcroft and Pansiot [11], we have either: Ennaoui et.al.: Complexity Aspects of Web Services Composition 103 – There exist j ∈ [0, i − 1] such that vj0 = ((p, c), y, Ax ) and vi0 = ((p, c), x, Ax ) with y ≤ x (see Algorithm ??, line 1). This implies that vj /vi , which con- tradicts that vi is not a leaf in Tsim (M, R⊗ ), i.e. (c, y)/(c, x). – There is no transition from vi0 in the system (see Algorithm ??, line 1). But for vi = (p, (c, x)) and vi+1 = (q, (d, y)) we have vi vi+1 ∈ E which means that (p, a, q) ∈ δM and ((c, x), a, (d, y)) ∈ ΦR⊗ . This implies that ((p, c), (q, d), y − x) ∈ W . Contradiction. 0 Therefore, we have : vi+1 = ((q, d), y, Ay ) is a successor of vi0 in Treach (VM,R ), with vi+1 = (q, (d, y)). We conclude that µ0 is a path in Treach (VM,R ). t u The following corollary is a consequence of Proposition 4. Corollary 2. Tsim (M, R⊗ ) is a sub-tree of Treach (VM,R ). Clearly the time complexity for computing Tsim (M, R⊗ ) is dominated by the complexity of computing Treach (VM,R ). Moreover we know from [12] that the size of Treach (VM,R ) is in 2-Exptime. Hence, we derive the following complexity result. Theorem 6. U CHS(M, R, 2) is in 3-Exptime. α Proof. According to [12], the size of Treach (VM,R ) is of order O(22 ) where α = max(|S|, |W |) ≤ c × (|QM | × N orm(M )nI (R) )2 with c is a constant. Then 2c1 +c2 ∗β according to Corollary 2, the size of Tsim (M, R⊗ ) is bounded by 22 where c1 and c2 are constants and β = log(|QM |) + nI (R) × log(N orm(M )). t u Our proof for Theorem 6 can be seen more as an embedding of the search space explored by a simulation test to the one explored when the reachability issue is considered. This is an approach that can not so far be generalized because the best upper bound provided for vector addition systems reachability is non- primitive recursive; in fact even the existence of a primitive upper-bound is still open [14]. 6 Conclusion In this paper we have considered two parameters that are source of complex- ity of the web services composition problem. We have shown that among the considered problems, several instances remain Exptime-complete when a pa- rameter is bounded. It remains an open question to identify the complexity of U CHS(M, R, k) for any k ∈ N; [5] proves in the context of Z-Reachability that the problem is k-Exptime. This complexity is quite far from the known lower bound (2-Exptime). It is also interesting to improve the polynomial complex- ity given for k=2 in [7] (polynomial of the 17th degree) and/or give a simpler algorithm that can eventually be extended to the general case. 104 PNSE’17 – Petri Nets and Software Engineering References 1. Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York, ninth dover printing, tenth gpo printing edition, 1964. 2. Gustavo Alonso, Fabio Casati, Harumi Kuno, and Vijay Machiraju. Web Services: Concepts, Architectures and Applications. Springer, 2010. 3. B. Benatallah, F. Casati, and F. Toumani. Web Service Conversation Modeling: A Cornerstone for E-Business Automation. IEEE Internet Computing, 08(1):46–54, 2004. 4. D. Berardi, D.Calvanese, G. De Giacomo, R. Hull, and M. Mecella. Automatic composition of transition-based semantic web services with messaging. In VLDB, pages 613–624, 2005. 5. Tomáš Brázdil, Petr Jančar, and Antonín Kučera. Reachability games on extended vector addition systems with states. In Automata, Languages and Programming, pages 478–489. Springer, 2010. 6. T. Bultan, X. Fu, R. Hull, and J. Su. Conversation specification: a new approach to design and analysis of e-service composition. In WWW’03. ACM, 2003. 7. Jakub Chaloupka. Z-reachability problem for games on 2-dimensional vector addi- tion systems with states is in p. In Reachability Problems, pages 104–119. Springer, 2010. 8. Jean-Baptiste Courtois and Sylvain Schmitz. Alternating vector addition systems with states. In Mathematical Foundations of Computer Science 2014, pages 220– 231. Springer, 2014. 9. Ramy Ragab Hassen, Lhouari Nourine, and Farouk Toumani. Protocol-based web service composition. In ICSOC, pages 38–53, 2008. 10. Monika Rauch Henzinger, Thomas A Henzinger, and Peter W Kopke. Computing simulations on finite and infinite graphs. In Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on, pages 453–462. IEEE, 1995. 11. John Hopcroft and Jean-Jacques Pansiot. On the reachability problem for 5- dimensional vector addition systems. TCS, 8(2):135–159, 1979. 12. Rodney R Howell, Louis E Rosier, Dung T Huynh, and Hsu-Chun Yen. Some com- plexity bounds for problems concerning finite and 2-dimensional vector addition systems with states. TCS, 46:107–140, 1986. 13. Joana Jedrzejowicz and Andrzej Szepietowski. Shuffle languages are in p. Theo- retical Computer Science, 250(1-2):31–53, 2001. 14. S Rao Kosaraju. Decidability of reachability in vector addition systems. In Pro- ceedings of the fourteenth annual ACM symposium on Theory of computing, pages 267–281. ACM, 1982. 15. Slawomir Lasota. Expspace lower bounds for the simulation preorder between a communication-free petri net and a finite-state system. Inf. Process. Lett., 109(15):850–855, 2009. 16. Anca Muscholl and Igor Walukiewicz. A lower bound on web services composition. Logical Methods in Computer Science, 4(2), 2008.