A Prototype for Discovering Compositions of Semantic Web Services? Antonio Brogi∗ , Sara Corfini∗ , José F. Aldana† , Ismael Navas† ∗ Department of Computer Science – University of Pisa, Italy Email: {brogi,corfini}@di.unipi.it † Department of Computer Languages and Computing Science – University of Málaga, Spain Email: {jfam,ismael}@lcc.uma.es Abstract— Semantic Web services provide semantically en- typically described in terms of different ontologies. Another hanced service descriptions which allow the development of important feature is the capability of discovering service automated service discovery and composition. Yet, an efficient compositions rather than single services. Indeed it is often semantic-based service discovery remains a major open challenge towards a wide acceptance of semantic Web services. In this the case that a client query cannot be fulfilled by a single paper we present a fully-automated matchmaking system for service, while it may be fulfilled by a suitable composition discovering sets of OWL-S Web services capable of satisfying a of services. Last, but not least, efficiency is obviously an given client request. important objective of service discovery mechanisms. In this perspective, we described in [6] the design of a I. I NTRODUCTION composition-oriented methodology for discovering OWL-S Web services are internet-based, self-describing and plat- SWSs. Such methodology employs the notion of Semantic form-independent application components published using Field [14] to cross different ontologies and it achieves ef- standard interface description languages (WSDL [23]) and ficiency by pre-computing off-line all the query-independent universally available via standard communication protocols tasks, viz., to determine the dependencies within/among ser- (SOAP [22]). Web services are emerging as the new funda- vices as well as the relationships among ontologies. An hyper- mental elements for developing complex software applications. graph is employed to collect all the computed dependencies. However, a prerequisite to reusing and composing Web ser- The search algorithm of [6] analyses the hypergraph in order vices is the ability to find the right service(s). to discover the sets of services (candidated to be composed) The only accepted standard for service discovery is UDDI capable of satisfying a given client request, which specifies [21], which performs a keywords-based search on registries inputs and outputs of the desired service. of WSDL services. One of the main limitations of the current In this paper we develop the methodology proposed in [6] service standards is that they do not provide an explicit seman- by introducing the algorithms needed to pre-process ontolo- tics. Indeed, two identical WSDL documents may describe two gies, to pre-compute service dependencies as well as to dis- completely different services. A promising approach seems cover service compositions. We also present a first prototype to be semantically modelling service capabilities in order to of our system which interacts with the clients by means of a enable a discovery based on what the services really provide. suitable Web interface. The application of the Semantic Web [4] model to the Web The rest of the paper is organised as follows. Sections II and service area has recently promoted the development of so- III are devoted to present the discovery system. In Section II called Semantic Web Services. A semantic Web service (SWS) we define the data structure we use to collect information about is a software component which self-describes its functionali- ontology concepts, services and their dependencies, while in ties by annotating them with (instances of) concepts formally Section III we describe the search algorithm for discovering defined by means of ontologies. One of the most promising Web service compositions. A first prototype of the discovery languages for describing SWSs is the Ontology Web Language system is presented in Section IV. Finally, related work is for Web Services (OWL-S [16]), which provides each service discussed in Section V, while some concluding remarks are with three documents: the service profile (“what the service drawn in Section VI. does”), the process model (“how the service works”) and the service grounding (“how to access the service”). II. T HE DEPENDENCY HYPERGRAPH The development of semantics-based service discovery mechanisms constitutes a major open challenge in this context, In order to store the knowledge derived from the pre- and it raises several important issues. One of them is the ability processing of ontologies and service descriptions, we need a of coping with different ontologies, as different services are data structure capable of suitably representing service and data ? Work partially supported by the SMEPP project (EU-FP6-IST 0333563) dependencies. The hypergraph seems to be a good candidate and by the project TIN2005-09098-C05-01 (Spanish Ministry of Education as dependencies can be naturally modelled by means of and Science). hyperedges. According to [7], a directed hypergraph H = (V, E) is a between pairs of concepts and next we calculate the distance pair, where V is a finite set of vertices and E is a set of between pairs of ontologies (i.e., how similar two ontologies directed hyperedges. A directed hyperedge is an ordered pair are). Mappings between concepts can be determined by means (X, Y ) of (possible empty) disjoint subsets of V , where X and of different ontology matching tools, which check whether two Y denote the tail and the head of the hyperedge, respectively. concepts are equivalent with a given probability. In this context, the vertices of the hypergraph correspond Since the results depend on the employed matching tools, to the concepts defined in the ontologies employed by the we have hence developed a framework for adapting existent analysed service descriptions, while the hyperedges represent matching tools and combine them in order to improve the relationships among such concepts. More precisely, an hyper- obtained results. In the framework instance used in this paper edge has one of the following three types: we combine the results of individual matchers which analy- • E⊂ = (D, {c}, nil) – subConceptOf relationship. Let c ses the similarity between pairs of concepts with different be a concept defined in an ontology O and let D ∈ O be strategies: (1) name similarity: compares the names of the the set of the (direct) subconcepts of c. Then, there is a provided concepts, (2) data type similarity: compares the E⊂ hyperedge from D to c. types of the provided concepts, and establishes a similarity • E≡ = ({e}, {f }, sim) – equivalentConceptOf relation- value depending on the possibility of producing each type ship. Let e, f be two concepts defined in two separate from another type by casting, (3) WordNet similarity: checks ontologies and let sim be the similarity between e and whether the two concepts are synonyms in WordNet [18], (4) f , i.e., the probability that e is (semantically) equivalent path similarity: compares the path from each concept to its to f . If sim is above a given similarity threshold, there root in the ontology, and provides an average between the path is a E≡ hyperedge from e to f labelled by sim. length similarity and the name similarity between all concepts • ES = (I, O, s) – intra-service dependency. Let s be (a in both paths, (5) property similarity: compares the similarities profile of) a service and let I be the set of inputs that s among the properties of the concepts compared, and (6) edit requires to produce the set O of outputs. Then, there is distance similarities: this family of algorithms is based on the a ES hyperedge from I to O labelled by s. comparison of strings, and calculates the difference between It is worth noting that the proposed hypergraph automati- them as the operations that should be performed on one string cally models other important aspects regarding the registry- to obtain another. As these algorithms calculate the distance published services and ontologies. The hypergraph models between two concepts, it is necessary to calculate the similarity an extended notion of subConceptOf between two concepts from the value provided. c, d defined in two given ontologies. More specifically, c is The calculus of mappings is done between pairs of ontolo- subConceptOf d (hereafter, c 7→ d) if and only if there exists a gies, hence if we have N ontologies then we have to calculate path from c to d which consists of subConceptOf relationships N 2 − N matrixes. Assuming that the mapping between two (E⊂ hyperedges) and/or equivalentConceptOf relationships ontologies is a symmetric relationship we need to calculate (E≡ hyperedges). Moreover, the hypergraph directly repre- (N 2 − N )/2 matrixes. Yet, our tool calculates the similarity sents the inter-service dependencies. Indeed, there is a inter- matrixes between the new ontology and all the previously service dependency between two services s and t if the head published ontologies when the new one is registered. Thus, of a s-labelled ES hyperedge and the tail of a t-labelled ES each insertion only requires N − 1 calculations. hyperedge share (at least) a concept. Note that there is a inter- Once the mappings between two different ontologies are service dependency between s and t also if there exist two established, the distance between the ontologies is calculated concepts cs and ct such that cs 7→ ct , cs belongs to the head by making use of some ontology distance measurements. of the s-labelled ES hyperedge and ct belongs to the tail of Given a matrix with the similarity between pairs of concepts the t-labelled ES hyperedge. of two ontologies (O1 and O2 ), we can make use of several In the following subsections we describe how subConceptOf measurements based on the definitions of distance in different relationships, equivalentConceptOf relationships and intra- contexts. First, we define the directed distance (DD) between service dependencies can be determined. them (as it is an asymmetric measure): A. The SemFiT application #Concepts(O1 ) DD(O1 , O2 ) = P Semantic Fields [14], [1] are groups of interrelated ontolo- c∈Concepts(O1 ) max(mappings(c, O2 )) gies that may be relevant to a given information request, in which the client needs to find related ontologies, from a set of D(O1 , O2 ) = min(DD(O1 , O2 ), DD(O2 , O1 )) ontologies, for a given set of concepts of a specific ontology. As searching for relevant concepts from a large number of We can calculate the distance (which is a symmetric mea- ontologies is a time-consuming task, one goal of Semantic sure) between two ontologies (D) in different ways (see Fields is to reduce the number of candidate ontologies to offer for example the second formula) depending on the distance the user a good solution within a reasonable period of time. concept used. These measures allow us to establish how In order to build Semantic Fields, namely to find relation- similar two ontologies are, and in this way to know whether it ships between ontologies, we firstly compute the similarity will be necessary to include the already calculated similarities searches in the ontology hierarchy for parents of the given concept, returning a set of vectors with concept, ontology and similarity value. Since the search of hierarchical relationships is a time consuming task, the hierarchy is precalculated from the OWL file by means of the Jena2 parser and stored in an internal database of SemFiT; • searchChildrenForConcept(string ontologyU RI, string concept, int sessionId) searches in the ontology hierarchy for children of the given concept, returning a set of vectors with concept, Fig. 1. Semantic Field Calculation. ontology and similarity value. This method makes use of the pre-calculated hierarchy as well. between their concepts to solve the problem of determining B. Computing intra-service dependencies the user Semantic Field. As described above, an intra-service dependency is a hyper- Let us suppose that we have the set of registered ontologies edge connecting two sets of nodes, respectively representing in Figure 1, in which the edge values mean the already the inputs and the outputs of a service. Yet, a service may calculated distance between pairs of ontologies. Let us assume behave in different ways and features different functionalities. that the user selects the EconomyDomain ontology, which Consider, for instance, a simple service s which inputs a and covers his intended domain, and sets up a distance threshold produces as output b or c (i.e., a choice process). s can behave of 2. In this example the Semantic Field will be composed in two different ways, namely, as a service s1 which inputs of the ontologies at a distance less than this value, which are a and yields b and as a service s2 which inputs a and yields those included in the dotted rectangle of Figure 1, i.e., event, c. Hence s exposes two different profiles, which correspond hotel and e-commerce. to the different intra-service dependencies ({a}, {b}, s1 ) and Semantic Fields have been implemented as a configurable ({a}, {c}, s2 ). tool1 (SemFiT) which can be used directly through a Web The intra-service dependencies of an OWL-S described ser- Service or Web forms for registered users. The Semantic Field vice can be determined by analysing its process model, which Tool suitably configured to be employed by the matchmaking details the complete behaviour of the service. We present next algorithm provides the following methods: the recursive function DataDep, initially invoked over the root • startSession() composite process of a service s, which determines all the returns an identifier for the user session, which will be intra-service dependencies (I, O) of s. necessary in other methods; • DataDep(atomic(A)) = {(A.inputs, A.outputs)}; • insertOntology(string ontologyU RI, int sessionId) • DataDep(ifThenElse(P1 , P2 )) inserts a new ontology (if not already present) in the = {(I1 , O1 )|(I1 , O1 ) ∈ DataDep(P1 )} ∪ user session. This process includes the calculation of the {(I2 , O2 )|(I2 , O2 ) ∈ DataDep(P2 )} mappings and distances among the inserted ontology and S • DataDep(choice(P1 , ..., Pk )) = ki=1 {(I, O)|(I, O) ∈ DataDep(Pi )} all the ontologies previously inserted. This calculation • DataDep(sequence(P1 , ..., Pk )) implies the use of the framework for ontology matching, = DataDep(split(P1 , ..., Pk )) and creating a set of similarity matrixes which contain = DataDep(split+join(P1 , ..., Pk )) the similarity between concepts of pairs of ontologies (a value between 0 and 1 indicating the probability of two S S = DataDep(any-order(P1 , ..., Pk )) = {( ki=1 Ii , ki=1 Oi )|(Ii , Oi ) ∈ DataDep(Pi )} concepts of being the same semantic concept); • DataDep(iterate(P )) = DataDep(choice(P , sequence(P, P ), ..., • getOntologies(int sessionId) returns the ontologies which belong to the user session; | {z } sequence(P, ..., P ))) Alt(P ) • searchEquivalentConcepts(string concept, string ontologyU RI, float minSimilarity, int sessionId) Before defining Alt(P ), we need to specify some assump- taking into account the user Semantic Field (the ontolo- tions. The proposed algorithm for discovering semantic Web gies in which he/she is interested in) this method analyses services works on ontology types. As a consequence, the the similarity matrixes (between pairs of ontologies in algorithm checks whether an output (i.e., a type) can be his/her Semantic Field) for finding relationships between produced by some available service, while it does not take concepts. Concepts with a similarity value greater that care of how many times such output can be produced. This minSimilarity are considered equivalent concepts; assumption allows us to expand the iterate process into a • searchParentsForConcept(string ontologyU RI, choice of sequences, where each sequence is a possible iterated string concept, int sessionId) execution of P . Alt(P ), which is defined below, computes how 1 available at http://khaos.uma.es/semanticfields/ 2 http://jena.sourceforge.net/ many times the process P has to be executed in order to yield 6. forall concept a in searchParentsForConcept(O, c, sessionID) do 7. if a ∈ V then all the outputs producible by its atomic process children. 8. if ∃(D, {a}, nil) ∈ E⊂ then D = D ∪ {c}; • Alt(atomic(A)) = 1 9. else Add ({c}, {a}, nil) ∈ E⊂ to E; • Alt(ifThenElse(P1 , P2 )) = Alt(P1 ) + Alt(P2 ) 10. forall concept b in searchChildrenForConcept(O, c, sessionID) do 11. if b ∈ V then • Alt(choice(P1 , ..., Pn )) = Alt(P1 ) + ... + Alt(Pn ) 12. if ∃(D, {c}, nil) ∈ E⊂ then D = D ∪ {b}; • Alt(sequence(P1 , ..., Pn )) = Alt(any-order(P1 , ..., Pn )) 13. else Add ({b}, {c}, nil) ∈ E⊂ to E; = Alt(split(P1 , ..., Pn )) = Alt(split+join(P1 , ..., Pn )) 14. forall e, similarity in searchEquivalentsForConcept(c, O, = max(Alt(P1 ), ..., Alt(Pn )) 15. minSimilarity, sessionID) do 16. if e ∈ V then • Alt(iterate(P )) = 0 17. Add ({c}, {e}, similarity) ∈ E≡ to E; Let us consider, for example, the HotelService, whose 18. Add ({e}, {c}, similarity) ∈ E≡ to E; 19. forall pairs (In , On ) in DataDep(Root(s)) do root is a repeat-until composite process, illustrated 20. Add(In , On , sn ) ∈ ES to E; in Figure 2. Alt(chooseOperation) returns two, indeed, chooseOperation has to be executed twice in order For each ontology O employed by s (line 2), the AddSer- to yield all the outputs producible by its children. Hence, vice inserts O (if not already present) into the Semantic Field DataDep(HotelService) returns three profiles: H1 , which identified by sessionID (line 3). Next, for each concept c ∈ O inputs {state, city} and produces {infoHotel}, (line 4), it adds c to the hypergraph (line 5) and determines H2 , which inputs {hotel, beginDate, lastDate, the (possibly empty) sets of the parents and children of c creditCard, contactInformation} and produces by invoking the searchParentsForConcept and searchChil- {accomodationFee, hotelInvoice}, and H3 , which drenForConcept methods of the SemFiT. Then, for each inputs {state, city, hotel, beginDate, lastDate, parent a of c (line 6), AddService adds (if a ∈ V ) a E⊂ creditCard, contactInformation} and produces hyperedge (c, a, nil) to E (lines 7–9), while for each child {infoHotel, accomodationFee, hotelInvoice}. concept b of c (line 10), it inserts (if b ∈ V ) a E⊂ hyperedge Note that DataDep is a recursive function and that (b, c, nil) to E (lines 11–13). Next, it determines the (possible if an iterate process has an iterate-typed ances- empty) set of the equivalent concepts of c by invoking the tor Q, DataDep first expands Q and then P . That searchEquivalentsForConcept method of SemFiT, which is why we set Alt(iterate(P )) = 0 in the pseudo- returns those concepts above a given similarity threshold, i.e., code above. It is also worth observing that the defini- the minSimilarity parameter. For each equivalent concept e of tion of DataDep(iterate(P )) is suitable when P has c (lines 14–15) AddService adds to E (if e ∈ V ) the two E≡ to be executed at least one (i.e., iterate instantiated as hyperedges (c, e, similarity) and (e, c, similarity) (lines 16– repeat-until) as well as when P may be skipped (i.e., 18). Note that all the concepts in the ontologies employed by s iterate instantiated as repeat-while). In the latter case, will be included in the hypergraph, independently of whether the definition of DataDep(iterate(P )) has to be expanded they directly occur in the specification of (some process in) s. by adding a nop branch to the choice among all possible iterate Finally, AddService inserts to the hypergraph a ES hyperedge executions of P . (In , On , sn ) for each intra-service dependency returned by DataDep (namely, for each profile n of s) (lines 19–20). C. Building the hypergraph D. Example After describing how data and service dependencies are determined, we can introduce in this subsection the AddSer- We present next an example which illustrates the behaviour vice function, which updates the hypergraph whenever a new of the AddService function. Let us consider an empty reg- service s is added to the registry. istry where we want to add two OWL-S described services: Generally speaking, AddService firstly adds to the hyper- HotelService, which allows a client to search for and/or to graph the concepts defined in the ontologies employed by s reserve hotels, and ConferenceService, which allows a client and next, it draws the hyperedges representing the subCon- to register to academic events. Their process models, which ceptOf relationships, the equivalentConceptOf relationships employ three different ontologies (hotel, e-commerce and and the intra-service dependencies between the newly added event), are shown in Figure 2. ontology concepts. Note that AddService does not affect Let us consider first the HotelService. AddService in- the efficiency of the matching process, as the hypergraph serts to the hypergraph all the concepts defined in the construction is completely query independent and can be pre- hotel and e-commerce ontologies (the light gray and computed off-line before query time. dark gray ellipses in Figure 3) together with the subCon- The behaviour of AddService can be summarised by the ceptOf and the equivalentConceptOf relationships returned by following pseudo-code, where sessionID is the session num- SemFiT. Note, for example, the subConceptOf relationships ber assigned to AddService by the startSession method of between hotel#city and hotel#physicalLocation, SemFiT. and between e-commerce#creditCard and e-com- 1. AddService(hypergraph H = (V, E), service s, int sessionID) merce#paymentMethods as well as the equivalentCon- 2. forall ontology O ∈ / getOntologies(sessionID) referred by s do ceptOf relationships between e-commerce#invoice and 3. insertOntology(O, sessionID); 4. forall concept c in O do hotel#invoice. At this point, the AddService inserts also 5. Add c to V ; the intra-service dependencies of HotelService computed by Fig. 2. Process models of the HotelService and ConferenceService. the DataDep function, i.e., the ES hyperedges H1 , H2 and {o|∃(I, O, s) ∈ ES ∧ o ∈ O}. We also remind you that 7→ H3 in Figure 3. Next, also ConferenceService is added to denotes the extended subConceptOf relationship. the registry. AddService inserts to the hypergraph the con- 1. QuerySolver(hypergraph H, query Q, set composition, cepts defined in the event ontology, the subConceptOf and 2. set neededOutput, set availableOutput) 3. if (neededOutput = ∅) then return composition equivalentConceptOf relationships computed by SemFiT, as 4. else well as the intra-service dependencies of ConferenceService. 5. out = extract(neededOutput); 6. S = {s|out ∈ Output(s) ∨ (∃c ∈ Output(s)|c 7→ out)}; The resulting hypergraph is shown in Figure 3. 7. if (S = ∅) then fail It is worth observing that some of the equivalentConceptOf 8. else 9. forall service s in S do relationships found by SemFiT appear due to the similarity 10. composition0 = composition ∪ {s}; between the syntax of the compared concepts (e-commer- 11. availableOutput0 = availableOutput ∪ Output(s); 12. neededOutput0 = {x|x ∈ neededOutput ∪ Input(s) ce#invoice – hotel#invoice, hotel#physical- 13. ∧ @a ∈ availableOutput0 : Location – event#location and hotel#city – ev- 14. a = x ∨ a 7→ x}; 15. QuerySolver(H, Q, composition0 , ent#city), while other mappings are the result of apply- 16. neededOutput0 , availableOutput0 ); ing searches in wordNet (e.g., event#startDate – ho- tel#beginDate and event#endDate – hotel#last- If neededOutput = ∅, i.e., there are no outputs to be Date). In this last case, the similarity does not derive from di- generated, QuerySolver returns the set composition of the rect synonyms in wordNet, but from the tokens that compose services found (line 3), which satisfies the functional client the words. Finally, event#dateTime and hotel#date query. Otherwise (line 4) QuerySolver withdraws an output are similar because of their syntax and their structure (both out from the set neededOutput (line 5) and computes the set have children that have same similarity). S of the services which produce (a sub concept of) out (line 6). If S is empty, that is, out cannot be generated by any registry- III. T HE DISCOVERY ALGORITHM published service, then QuerySolver fails (line 7) since the The discovery algorithm takes as input a client query query cannot be fulfilled. Otherwise (line 8) for each service s specifying the set of inputs and outputs of the desired service which generates (a sub concept of) out (line 9), QuerySolver (composition). As we will describe in Section IV, the formula- adds s to composition (line 10), updates availableOutput by tion of the query is eased by a suitable interface that displays adding the outputs of s (line 11), and updates neededOutput the available concepts with an expandable tree structure. Next, by adding the inputs of s and by removing the concepts that the search algorithm explores the hypergraph, by performing are now available (lines 12–14). Next, QuerySolver continues a depth-first visit, in order to discover the (compositions of) recursively (lines 15–16). services capable of satisfying the client request. The complexity of QuerySolver belongs to the N P space The discovery algorithm is synthesised by the following as it determines all the possible solutions by visiting the hy- recursive function QuerySolver which inputs five parameters: pergraph non-deterministically. We have tested QuerySolver the hypergraph H, the client query Q, the set composition on a small repository containing ten services only, as there are of the services selected so far (initially empty), the set few available SWSs on the Web as the SWS area is emerging neededOutput of the outputs to be generated (initially the now. Anyway, the average reply-time of QuerySolver is 0.2 query outputs), and the set availableOutput of the outputs seconds. We imagine that in the future many SWSs will exist, available (initially the query inputs). thus, we intend to improve efficiency by operating in the In the pseudo-code listed below, Input(s) denotes the inputs following directions: of the service s, that is, the set {i|∃(I, O, s) ∈ ES ∧ i ∈ • to reduce the large number of services taken into account I} as well as Output(s) denotes the outputs of s, namely by the QuerySolver by introducing a suitable service Fig. 3. The hypergraph after registering HotelService and ConferenceService. pre-selection phase (e.g., using UDDI to filter services not second invocation, QuerySolver withdraws e-commer- belonging to certain service categories), and by selecting ce#registrationReceipt from neededOutput, which services which produce a needed output with respect to is produced by both the profiles C1 and C2 of Conference- some heuristics (e.g., the number and/or on the quality Service. QuerySolver creates then the following composi- of the produced outputs) in order to consider the “most tions: {H2 , C1 } and {H2 , C2 }. While the first one fulfills promising” services only. the query, as all the needed outputs are now available, the • to introduce caching and indexing techniques in order to second one fails, as e-commerce#bankAccount cannot sensibly decrease the QuerySolver reply-time. be produced by any registry-published service. When all instances of QuerySolver terminate, the functional analyser A. Example returns to the client two successful compositions: {H2 , C1 } Let us continue the example introduced in subsection II- and {H3 , C1 }. D. Consider now a client wishing to plan its participation in an international conference by registering to the conference IV. I MPLEMENTATION and by booking his hotel accomodation, and receiving the In this section we discuss the implementation of our dis- conference and hotel confirmations. The client query may be covery system, that we named SAM for Service Aggregation composed by the following parameters: Matchmaking. • inputs – hotel#hotel, event#internationalConfe- rence, e-commerce#creditCard, e-commerce#con- tactInformation • output – e-commerce#invoice, e-commerce#regi- strationReceipt. As one may note, neither HotelService nor Conference- Service satisfies the given query by itself. Yet, the query can be fulfilled by suitably composing the two available services. QuerySolver takes as parameters the hypergraph depicted in Figure 3, the query inputs (i.e., availableOutput) and the query outputs (i.e., neededOutput), while composition is an empty set. Suppose that QuerySolver withdraws e-com- merce#invoice from neededOutput. QuerySolver cre- Fig. 4. The system architecture. ates two compositions, represented by the hyperedges {H2 } and {H3 } respectively, as the profiles {H2 } and {H3 } of Ho- Figure 4 presents the high-level architecture of SAM. telService both produce hotel#hotelInvoice, which is AddService and QuerySolver are Java applications which a subConceptOf e-commerce#invoice (i.e., the former is implement the AddService function (Subsection II-C) and linked to the latter by means of E⊂ and E≡ hyperedges). Con- the discovery algorithm (Section III). AddService makes use sider composition {H2 }. QuerySolver adds to neededOutput of the CMU OWL-S API3 for parsing OWL-S descriptions. hotel#beginDate and hotel#lastDate, since they do not belong to the query inputs. Suppose that, at its 3 http://www.daml.ri.cmu.edu/owlsapi/ AddService and QuerySolver both access the database, which with different ontologies. Still, the ontology crossing task is is the local repository where SAM stores service descriptions, computed at query answering time, hence severely affecting ontologies and the dependency hypergraph (Section II). A the efficiency of the discovery algorithm. An efficient match- JDBC-ODBC bridge allows the Java applications to access ing of semantic Web service capabilities is the goal of [13], the repository, implemented as a relational (MS Access, for which achieves efficiency by pre-processing the available on- this first prototype) database. It is worth noting that we do tologies and by pre-classifying the registry-published services not introduce Java classes to represent ontologies, profiles, before query time. Still, [13] does not address the discovery concepts and dependencies in order to avoid loading in mem- of service compositions. ory space-consuming Java objects. Indeed, AddService and Moreover, it is worth mentioning the METEOR-S project QuerySolver perform ontological reasoning and hypergraph [19], whose objective is the realisation of a framework for handling by means of database invocations. As described in annotating, discovering and composing semantic Web services. Subsections II-A and II-C, the Java Semantic Field Application Yet, METEOR-S is a semi-automated approach requiring a (i.e., SemFiT) supports AddService to process ontologies and strong participation of the user, which is highly involved in data dependencies. the process of semi-manually discovering and/or composing Finally, we have provided SAM with a suitable Web in- services. terface, which allows clients to submit a new service to the As composing services requires to cope with different system as well as to query the discovery algorithm. Figure 5 ontologies, several other works have been recently proposed shows a screenshot of the system interface. In its left panel to address the issues of ontology merging and alignment. For the interface provides a tree view of the available ontologies instance, it is worth mentioning [11], [15], which exploit the to help the client in specifying its query. The input and output knowledge provided by matching tools that find similarities concepts selected by the client are shown in the query inputs between concepts interpreted as lexical entries, [10], which and query outputs panels on the right part of the interface. makes use of classes labels to compare the taxonomies of By clicking on the Submit query button, the client queries two ontologies, and [20], which does not analyse the classes the discovery algorithm which returns the result in the bottom properties, but takes into account the ontology instances to panel of the interface. Moreover, the client can submit a new related different classes. In order to achieve better results, service to the system by means of the Submit service button, SemFiT developed a framework for ontology crossing which after providing the service URL. takes advantages of different matching algorithms, as the AddService and QuerySolver are Java servlets which run on combination of simple matchers has demonstrated its viability Apache Tomcat. To develop the Web interface we employed in several approaches such as the COMA Framework [9]. the Google Web Toolkit, which is a Java development frame- work that allows to deploy AJAX applications and complex VI. C ONCLUDING REMARKS Web interfaces by using the Java language. Google Web Toolkit then translates the Java code to browser-compliant In this paper, we have presented a new fully-automated JavaScript and HTML. matchmaking system for discovering (compositions of) OWL- Figure 5 catches the Web interface as it appears af- S semantic Web services capable of satisfying a given client ter submitting the query described in Subsection III-A. In- request. The proposed system is the first one – at the best deed, the results panel lists the two successful composi- of our knowledge – that addresses three main issues of tions which satisfy the example query. Note the result syn- the semantic service discovery domain, namely, composition- tax [S1 (P11 , . . . , P1n1 ), . . . , Sn (Pn1 , . . . , Pnnn )] where Si de- oriented discovery, crossing different ontologies and efficiency. notes the service name and Pi1 , . . . , Pini denote the atomic We have also presented a first Web-accessible prototype of our processes of the selected profile of Si . discovery system. Our plan for future work includes the development of fresh V. R ELATED WORK indexing and/or ranking techniques (as search engines do The discovery of semantic Web services has been firstly for Web pages) in order to sensibly improve the efficiency addressed by Paolucci et al. in [17], where they proposed an of our system. A second line of our future work is to algorithm performing a functionality matching between ser- enhance our discovery system by employing a behavioural vice requests and service advertisements described as DAML- analyser module, which analyses the behavioural information S (the predecessor of OWL-S) service profiles. Yet, [17] does advertised in the (OWL-S) service descriptions to determine not deal with different ontologies, nor it considers service whether the candidate services (selected by the functional composition, which instead is the goal of [3], [12], [8]. analyser) can really be composed together and satisfy the These approaches discover compositions of OWL-S described query without dead-locking, in the line of [5]. Finally, we plan services by operating in the domain of hypergraphs, finite state to extend SemFiT by employing other existing matching tools automata and interface automata, respectively. However, [3], to achieve better results for the ontology matching problem. [12], [8] do not address the task of crossing ontologies. Our long-term goal is to develop a well-founded methodology In [2] Aversano et al. extended [17] by proposing a to support an efficient and fully-automated discovery and composition-oriented discovery algorithm capable of coping composition of Web services. Fig. 5. The Web interface of SAM. R EFERENCES [12] S. B. Mokhtar, N. Georgantas, and V. Issarny, “Ad Hoc Composition of User Tasks in Pervasive Computing Environment,” in Software [1] J. F. Aldana-Montes, I. Navas-Delgado, and M. del Mar Roldan-Garcia, Composition, LNCS 3628, T. Gschwind, U. Aßmann, and O. Nierstrasz, “Solving Queries over Semantically Integrated Biological Data Sources,” Eds. Springer-Verlag, 2005. in Int. Conf. on Web-Age Information Management. LNCS 3129, 2004. [13] S. B. Mokhtar, A. Kaul, N. Georgantas, and V. Issarny, “Towards Effi- [2] L. Aversano, G. Canfora, and A. Ciampi, “An Algorithm for Web cient Matching of Semantic Web Service Capabilities,” in Proceedings Service Discovery through Their Composition,” in IEEE International of WS-MATE 2006, 2006. Conference on Web Services (ICWS’04), L. Zhang, Ed. IEEE Computer [14] I. Navas-Delgado, I. Sanz, J. F. Aldana-Montes, and R. Berlanga, Society, 2004, pp. 332–341. “Automatic Generation of Semantic Fields for Resource Discovery in [3] B. Benatallah, M.-S. Hacid, C. Rey, and F. Toumani, “Request the Semantic Web,” in 16th Int. Conf. on Database and Expert Systems Rewriting-Based Web Service Discovery,” in The Semantic Web - ISWC Applications. LNCS 3588, 2005. 2003, LNCS 2870, G. Goos, J. Hartmanis, and J. van Leeuwen, Eds. [15] N. Noy and M. Musen, “Prompt: Algorithm and Tool for Automated Springer-Verlag, 2003, pp. 242–257. Ontology Merging and Alignment,” in Proc. of the Nat. Conf. on [4] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” in Artificial Intelligence, 2000. Scientific American, 2001. [16] OWL-S Coalition, “OWL-S 1.1,” 2004, http://www.daml.org/services/ [5] A. Brogi and S. Corfini, “Behaviour-aware discovery of Web service owl-s/1.1/. compositions,” in University of Pisa, Department of Computer Science [17] M. Paolucci, T. Kawamura, T. Payne, and K. Sycara, “Semantic Match- - Tech. Rep. TR-06-08, June 2006. making of Web Services Capabilities,” in 1th Int. Conf. on The Semantic [6] A. Brogi, S. Corfini, J. Aldana, and I. Navas, “Automated Discovery Web, LNCS 2342, I. Horrocks and J. Hendler, Eds. Springer-Verlag, of Compositions of Services Described with Separate Ontologies,” in 2002, pp. 333–347. ICSOC 2006. LNCS 4294, A. Dan and W. Lamersdorf, Eds. Springer- [18] R. Al-Halimi et al., WordNet - An Electronic Lexical Database, C. Fell- Verlag, 2006, pp. 509–514. baum, Ed. MIT Press, 1998. [7] G. Gallo, G. Longo, S. Nguyen, and S. Pallottino, “Directed hypergraphs [19] P. Rajasekaran, J. A. Miller, K. Verma, and A. P. Sheth, “Enhancing and applications,” Discrete Applied Mathematics, vol. 42, no. 2, pp. 177– Web Services Description and Discovery to Facilitate Composition,” 201, 1993. in Semantic Web Services and Web Process Composition, LNCS 3387, [8] S. Hashemian and F. Mavaddat, “A Graph-Based Approach to Web J. Cardoso and A. Sheth, Eds. Springer-Verlag, 2005, pp. 55–68. Services Composition,” in SAINT 2005, IEEE Computer Society, Ed. [20] G. Stumme and A. Madche, “Fca-merge: Bottom-up Merging of Ontolo- CS Press, 2005, pp. 183–189. gies,” in In 7th Intl. Conf. on Artificial Intelligence (IJCAI ’01), 2001, [9] D. Hong-Hai and R. Herhard, “COMA - A System for Flexible Combi- pp. 225–230. nation of Schema Matching Approches,” in Proc. of the 28th Int. Conf. [21] W3C, “The UDDI Technical White Paper,” 2000, http://www.uddi.org/. on Very Large Databases (VLDB), Hongkong, 2002. [22] ——, “Simple Object Access Protocol (SOAP) 1.2, W3C working draft, [10] A. Maedche and S. Staab, “Measuring Similarity between Ontologies,” 17 December 2001,” 2001, http://www.w3.org/TR/2001/WD-soap12- in Proc. Of the European Conference on Knowledge Acquisition and part0-20011217/. Management - EKAW-2002, LNCS/LNAI 2473, 2002, pp. 251–263. [23] ——, “Web Service Description Language (WSDL) 1.1,” 2001, [11] D. McGuinness, R. Fikes, J. Rice, and Wilder, “The Chimaera Ontology http://www.w3.org/TR/wsdl. Environment,” in Proc. of the 17th Nat. Conf. on Artificial Intelligence, 2000, pp. 1123–1124.