1 Introduction

Scalable Web Service Composition with Partial Matches

Adina Sirbu

Jo¨rg Hoffmann?

0 0 Digital Enterprise Research Institute (DERI) University of Innsbruck , Austria

We will investigate scalable algorithms for automated Semantic Web Service Composition (WSC). Our notion of WSC is very general: it allows the generation of new constants by Web service outputs; the composition semantics includes powerful background ontologies; and we use the most general notion of matching, partial matches, where several services can cooperate each covering only a part of a requirement. Herein, we define a first formalism. We show that automatic composition is very hard: even testing solutions is Πp2-complete. We identify a special case that covers many relevant WSC scenarios, and where solution testing is only coNP-complete. While coNP is still hard, in the area of planning under uncertainty, scalable tools have been developed that deal with the same complexity. In our next step, we will adapt the techniques underlying one of these tools to develop a scalable tool for WSC. In the long term, we will investigate richer formalisms, and accordingly adapt our algorithms.

1 Introduction

In any task that involves automatic processing of Web services, one needs support for background ontologies. In the case of WSC, almost all existing solutions compile the problem into AI Planning formalisms. The motivation is that planning tools have become many times more scalable in recent years, through the use of heuristic functions and other search techniques, e.g. [ 8 ]. The problem here is that those tools cannot handle background ontologies. The following example illustrates the importance of those: Example 1. A service shall be composed that inputs a constant of concept A, and outputs one of concept C. (E.g., A may be a trip request and C a ticket.) The ontology defines the concepts A, B, B1, . . . , Bn, C, and states that each Bi ⊆ B, and S1n Bi ⊇ B. (E.g., B may be a geographical region and the Bi its parts.) An available service wsAB transforms A into B, and for each i an available service wsBiC transforms Bi into C. A solution first applies wsAB, and then applies the wsBiC in conjunction.

In Example 1, reasoning over the background ontology is necessary to (1) understand which services can be used, and to (2) test whether a given composition is actually a solution. Such reasoning can be modelled through the background theories explored in some planning works, e.g., [ 4, 6 ]. However, incorporating this notion into the modern scalable planning tools poses serious challenges, and has not yet even been tried. Due to the background theory, even computing a state transition – which is now a form of belief revision – is a computationally very hard task. The existing planning tools dealing with background theories, e.g., [ 4, 6 ], map the problem into generic deduction, which is well known for its lack of scalability. The existing planning tools dealing with WSC, e.g., [ 9, 2 ], ignore the ontology and assume exact matches. In Example 1, this would require B = Bi instead of B ∩ Bi 6= ∅. Obviously, this renders the example unsolvable.

We identify an interesting special case of WSC. We exploit the fact that Web services may output new constants.1 Now, if all ramifications of a Web service concern only propositions involving at least one new constant, then a belief revision is not necessary. We term this special case WSC with forward effects: the effects are “forward” in the sense that no backwards-directed belief revision is necessary. E.g., the services in Example 1 have forward effects. The same holds for many WSC scenarios from the literature, and from real case studies. A simple example is the wide-spread “virtual travel agency”, where Web services must be linked that book travel and accommodation, generating new constants corresponding to tickets and reservations.

We introduce a framework for planning with background theories.2 We show that testing whether an action sequence is a solution – solution testing – is Π2p-complete in general, but only coNP-complete with forward effects. Of course, coNP is still hard. However, planning under uncertainty has the same complexity of solution testing, and scalable tools for this case have already been developed. Adapting them for WSC with forward effects will be our next step. In particular, the Conformant-FF tool [ 7 ] is based on CNF reasoning, which can be naturally extended to our setting.

Section 2 introduces our formalism, Section 3 discusses forward effects. Sections 4 provides some details regarding our future developments, Section 5 concludes. 2

WSC with Partial Matches

The (planning) terminology of our formalism corresponds to WSC is as follows. Web services are planning ”operators”; their input/output behaviour maps to input/output parameters on which preconditions and effects are specified [ 1, 3, 5 ]. The background ontology is the background ”theory”. The precondition in the goal is equivalent to sets of ”initial literals” and ”initial constants”. The effect in the goal is the ”goal condition”.

We assume supplies of logical predicates p, q, variable names x, y and constant names a, b, c, d, e; (ground) literals are defined as usual. For variables X, LX is the set of literals using only variables from X. We write l[X] for a literal l with variable arguments X. For a tuple C of constants substituting X, we write l[C/X]. In the same way, we use the substitution notation for any construct involving variables. Positive ground literals are propositions. A clause is a disjunction of literals with universal quantification on the outside, e.g. ∀x.(¬p(x) ∨ q(x)). A theory is a conjunction of clauses. An operator o is a tuple (Xo, preo, Yo, effo), where Xo, Yo are sets of variables, preo is a conjunction of literals from LXo , and effo is a conjunction of literals from LXo∪Yo . The intended meaning is that Xo are the inputs and Yo the outputs, i.e., the new constants created by the operator. For an operator o, an action a is given by (prea, effa) ≡ (preo, effo)[Ca/Xo, Ea/Yo] where Ca and Ea are vectors 1 Namely, the outputs model the generated data. 2 [ 10 ] defines a similar WSC formalism, but considering plug-in matches and restricting the background theory, instead of the service effects, to obtain efficiency. of constants; for Ea we require that the constants are pairwise different. In this way ”operators” are Web services and ”actions” are service calls. W SC tasks are tuples (P , T , O, C0, φ0, φG). Here, P are predicates; T is the theory; O is a set of operators; C0 is a set of constants, the initial constants supply; φ0 is a conjunction of ground literals, describing the possible initial states; φG is a conjunction of literals with existential quantification on the outside, describing the goal states, e.g., ∃x, y.(p(x) ∧ q(y)).3 All predicates are from P , all constants are from C0, all constructs are finite.

Assume we are given a task (P , T , O, C0, φ0, φG). States in our formalism are pairs (Cs, Is) where Cs is a set of constants, and Is is a Cs-interpretation , i.e., an interpretation of all propositions formed from the predicates P and the constants Cs. We need to define the outcome of applying actions in states. Given a state s and an action a, a is applicable in s if Is |= prea, Ca ⊆ Cs, and Ea ∩ Cs = ∅. We allow parallel actions. These are sets of actions which are applied at the same point in time. The result of applying a parallel action A in a state s is res(s, A) :=

[ a∈A,appl(s,a) Here, min(s, C0, φ) is the set of all C0-interpretations that satisfy φ and that are minimal with respect to the partial order defined by I1 ≤ I2 :iff for all propositions p over Cs, if I2(p) = Is(p) then I1(p) = Is(p). This is a standard semantics where the ramification problem is addressed by requiring minimal changes to the predecessor state s [ 11 ]. A is inconsistent with s iff res(s, A) = ∅; this can happen in case of conflicts. Note that res(s, A) allows non-applicable actions. This realizes partial matches: a Web service can be applied as soon as it matches at least one possible situation.

We refer to the set of states possible at a given time as a belief. The initial belief is b0 := {s | Cs = C0, s |= T ∧ φ0}. A parallel action A is inconsistent with a belief b if it is inconsistent with at least one s ∈ b. In the latter case, res(b, A) is undefined; else, it is Ss∈b res(s, A). This is extended to action sequences in the obvious way. A solution is a sequence hA1, . . . , Ani s.t. for all s ∈ res(b0, hA1, . . . , Ani) : s |= φG.

When assuming fixed arity – a constant upper bound on the arity of all variable vectors (e.g., used in predicates) – transformation to a propositional representation is polynomial. Even in this case, solution testing is Π2p-complete in W SC. Further, we have proved that polynomially bounded solution existence is Σ3p-complete.

Theorem 1 (Solution testing in W SC). Assume a W S C task with fixed arity, and

a sequence hA1, . . . , Ani of parallel actions. It is Π2p-complete to decide whether hA1, . . . , Ani is a solution. 3

Forward Effects

The high complexity of W SC motivates the search for interesting special cases. As stated, here we define a special case where every change an action makes to the state involves a new constant. A W SC task (P , T , O, C0, φ0, φG) has forward effects iff: 3 The existential quantification is needed to give meaning to the creation of new constants. – For all o ∈ O, and for all l[X ] ∈ effo, we have X ∩ Yo 6= ∅. In words, the variables of every effect literal contain at least one output variable. – For all clauses cl[X ] ∈ T , where cl[X ] = ∀X.(l1[X1]∨· · ·∨ln[Xn]), we have X =

X1 = · · · = Xn. In words, in every clause all literals share the same arguments. The set of all such tasks is denoted with W SC|f wd. The second condition implies that effects involving new constants can only affect literals involving new constants. Given a state s and a parallel action A, define res|f wd(s, A) :=

[ a∈A,exec(s,a)

Hence, the action semantics becomes a lot simpler with forward effects, no longer needing the notion of minimal changes with respect to the previous state. We get:

Proposition 2 (Solution testing in W S C|f wd). Assume a W S C|f wd task with fixed

arity, and a sequence hA1, . . . , Ani of parallel actions. It is coNP-complete to decide whether hA1, . . . , Ani is a solution.

It is currently an open problem what the complexity of deciding polynomially bounded solution existence is in W S C|f wd. With Proposition 2, membership in Σp2 is easy to see. However, we suspect that the problem is actually coNP-complete. We have one half of a proof, but some tricky issues must still be resolved regarding the generation of exponentially many constants. 4

Tool and Language Developments

Our next step will be to develop a tool for W SC|f wd. We focus on achieving scalability: we expect practical SWS scenarios to involve large sets of Web services, involving huge search spaces (of partial compositions). We will try to overcome this by designing heuristic solution distance and search node filtering techniques. Specifically, we will start from the ideas underlying the Conformant-FF (CFF) [ 7 ] planning tool.

CFF represents beliefs in terms of propositional CNF formulas, and uses SAT reasoning to test solutions. We will adapt this to our purposes, simply by adapting the generation of the CNFs. Note that this is possible only in W SC|f wd, not in W SC, for complexity reasons. CFF also introduces heuristic and filtering techniques, based on an abstraction of the planning problem. Namely, for each belief CFF computes an abstract solution using approximate SAT reasoning, and then uses the abstract solution to inform the search. We will apply the same principle for W S C|f wd. This will differ from CFF in the different structure of our CNFs, necessitating different approximate reasoning, and in that we will explore typical forms of background theories (e.g., subsumption hierarchies) to obtain better distance estimates. Further, in difference to us, CFF (and indeed every existing planning tool) does not allow the generation of new constants. We will devise new heuristics for dealing with this. Note that this is critical: as already indicated, exponentially many constants may be generated in general, so one needs heuristics identifying which are important. Those heuristics need to be clever: if they remove too many constants, then the solutions may be cut out; if they remove too few constants, then the search space may explode.

In the long term, our line of research will be to incrementally enrich the language our tool accepts, accordingly adapting the algorithms. For a start, some generalisations of W SC|f wd are possible without losing Proposition 1. Most importantly, instead of requiring that every effect literal involves a new constant, one can postulate this only for literals that may actually be affected by the background theory. This may be important, e.g., , for dealing with updates on attributes of existing constants. If such a language turns out to not be enough for many practical examples, then we will look into how feasible it is to drop the forward effects restriction and deal with full W SC. Note that, due to Theorem 1, this would require QBF solving for reasoning about beliefs. We further plan to investigate into allowing non-deterministic outcomes of Web services. These would be modelled as operators with lists of alternative effects, each of which may occur. We expect that we can handle this by appropriately extending the CNF formulas underlying beliefs, as well as the associated machinery. 5

Conclusion

We have introduced a formalism for WSC, and identified a special case for which no belief revision is necessary. We will solve some open complexity problems, and develop a tool inspired from CFF. We will incrementally extend the tool to richer languages.

Ankolenkar et al. DAML-S: Web service description for the semantic web . In ISWC , 2002 .

Akkiraju ,

Srivastava ,

Anca-Andreea ,

Goodwin , and

Syeda . Semaplan: Combining planning with semantic matching to achieve web service composition . In ICWS , 2006 .

Martin et al. OWL-S: Semantic markup for web services . In SWSWPC , 2004 .

Eiter ,

Faber ,

Leone , G.

Pfeifer, and

Polleres . A logic programming approach to knowledge-state planning, II: The DLVK system . AI , 144 ( 1-2 ): 157 - 211 , 2003 .

Fensel ,

Lausen ,

Polleres , J. de Bruijn,

Stollberg ,

Roman , and

Domingue . Enabling Semantic Web Services: The Web Service Modeling Ontology . Springer-Verlag, 2006 .

Giunchiglia ,

Lee ,

Lifschitz ,

McCain ,

and H.

Turner . Nonmonotonic causal theories . AI , 153 ( 1-2 ): 49 - 104 , 2004 .

Hoffmann and

Brafman . Conformant planning via heuristic forward search: A new approach . AI , 170 ( 6-7 ): 507 - 541 , May 2006 .

Hoffmann and

Nebel . The FF planning system: Fast plan generation through heuristic search . JAIR , 14 : 253 - 302 , 2001 .

Ponnekanti and

Fox. SWORD : A developer toolkit for web services composition . In WWW , 2002 .

10.

Scicluna and

Hoffmann . Semantic web service composition: Background theories, parallelisation, and business policies . In ESWC PhD Symposium , 2007 . Submitted.

11.

Winslett . Reasoning about actions using a possible models approach . In AAAI , 1988 .