A Prototype for Discovering Compositions of
                   Semantic Web Services?
                                Antonio Brogi∗ , Sara Corfini∗ , José F. Aldana† , Ismael Navas†
                                     ∗ Department of Computer Science – University of Pisa, Italy

                                            Email: {brogi,corfini}@di.unipi.it
                   † Department of Computer Languages and Computing Science – University of Málaga, Spain

                                               Email: {jfam,ismael}@lcc.uma.es


   Abstract— Semantic Web services provide semantically en-              typically described in terms of different ontologies. Another
hanced service descriptions which allow the development of               important feature is the capability of discovering service
automated service discovery and composition. Yet, an efficient           compositions rather than single services. Indeed it is often
semantic-based service discovery remains a major open challenge
towards a wide acceptance of semantic Web services. In this              the case that a client query cannot be fulfilled by a single
paper we present a fully-automated matchmaking system for                service, while it may be fulfilled by a suitable composition
discovering sets of OWL-S Web services capable of satisfying a           of services. Last, but not least, efficiency is obviously an
given client request.                                                    important objective of service discovery mechanisms.
                                                                            In this perspective, we described in [6] the design of a
                        I. I NTRODUCTION
                                                                         composition-oriented methodology for discovering OWL-S
   Web services are internet-based, self-describing and plat-            SWSs. Such methodology employs the notion of Semantic
form-independent application components published using                  Field [14] to cross different ontologies and it achieves ef-
standard interface description languages (WSDL [23]) and                 ficiency by pre-computing off-line all the query-independent
universally available via standard communication protocols               tasks, viz., to determine the dependencies within/among ser-
(SOAP [22]). Web services are emerging as the new funda-                 vices as well as the relationships among ontologies. An hyper-
mental elements for developing complex software applications.            graph is employed to collect all the computed dependencies.
However, a prerequisite to reusing and composing Web ser-                The search algorithm of [6] analyses the hypergraph in order
vices is the ability to find the right service(s).                       to discover the sets of services (candidated to be composed)
   The only accepted standard for service discovery is UDDI              capable of satisfying a given client request, which specifies
[21], which performs a keywords-based search on registries               inputs and outputs of the desired service.
of WSDL services. One of the main limitations of the current                In this paper we develop the methodology proposed in [6]
service standards is that they do not provide an explicit seman-         by introducing the algorithms needed to pre-process ontolo-
tics. Indeed, two identical WSDL documents may describe two              gies, to pre-compute service dependencies as well as to dis-
completely different services. A promising approach seems                cover service compositions. We also present a first prototype
to be semantically modelling service capabilities in order to            of our system which interacts with the clients by means of a
enable a discovery based on what the services really provide.            suitable Web interface.
The application of the Semantic Web [4] model to the Web                    The rest of the paper is organised as follows. Sections II and
service area has recently promoted the development of so-                III are devoted to present the discovery system. In Section II
called Semantic Web Services. A semantic Web service (SWS)               we define the data structure we use to collect information about
is a software component which self-describes its functionali-            ontology concepts, services and their dependencies, while in
ties by annotating them with (instances of) concepts formally            Section III we describe the search algorithm for discovering
defined by means of ontologies. One of the most promising                Web service compositions. A first prototype of the discovery
languages for describing SWSs is the Ontology Web Language               system is presented in Section IV. Finally, related work is
for Web Services (OWL-S [16]), which provides each service               discussed in Section V, while some concluding remarks are
with three documents: the service profile (“what the service             drawn in Section VI.
does”), the process model (“how the service works”) and the
service grounding (“how to access the service”).
                                                                                     II. T HE DEPENDENCY HYPERGRAPH
   The development of semantics-based service discovery
mechanisms constitutes a major open challenge in this context,             In order to store the knowledge derived from the pre-
and it raises several important issues. One of them is the ability       processing of ontologies and service descriptions, we need a
of coping with different ontologies, as different services are           data structure capable of suitably representing service and data
  ? Work partially supported by the SMEPP project (EU-FP6-IST 0333563)
                                                                         dependencies. The hypergraph seems to be a good candidate
and by the project TIN2005-09098-C05-01 (Spanish Ministry of Education   as dependencies can be naturally modelled by means of
and Science).                                                            hyperedges.
   According to [7], a directed hypergraph H = (V, E) is a            between pairs of concepts and next we calculate the distance
pair, where V is a finite set of vertices and E is a set of           between pairs of ontologies (i.e., how similar two ontologies
directed hyperedges. A directed hyperedge is an ordered pair          are). Mappings between concepts can be determined by means
(X, Y ) of (possible empty) disjoint subsets of V , where X and       of different ontology matching tools, which check whether two
Y denote the tail and the head of the hyperedge, respectively.        concepts are equivalent with a given probability.
   In this context, the vertices of the hypergraph correspond            Since the results depend on the employed matching tools,
to the concepts defined in the ontologies employed by the             we have hence developed a framework for adapting existent
analysed service descriptions, while the hyperedges represent         matching tools and combine them in order to improve the
relationships among such concepts. More precisely, an hyper-          obtained results. In the framework instance used in this paper
edge has one of the following three types:                            we combine the results of individual matchers which analy-
   • E⊂ = (D, {c}, nil) – subConceptOf relationship. Let c            ses the similarity between pairs of concepts with different
      be a concept defined in an ontology O and let D ∈ O be          strategies: (1) name similarity: compares the names of the
      the set of the (direct) subconcepts of c. Then, there is a      provided concepts, (2) data type similarity: compares the
      E⊂ hyperedge from D to c.                                       types of the provided concepts, and establishes a similarity
   • E≡ = ({e}, {f }, sim) – equivalentConceptOf relation-            value depending on the possibility of producing each type
      ship. Let e, f be two concepts defined in two separate          from another type by casting, (3) WordNet similarity: checks
      ontologies and let sim be the similarity between e and          whether the two concepts are synonyms in WordNet [18], (4)
      f , i.e., the probability that e is (semantically) equivalent   path similarity: compares the path from each concept to its
      to f . If sim is above a given similarity threshold, there      root in the ontology, and provides an average between the path
      is a E≡ hyperedge from e to f labelled by sim.                  length similarity and the name similarity between all concepts
   • ES = (I, O, s) – intra-service dependency. Let s be (a           in both paths, (5) property similarity: compares the similarities
      profile of) a service and let I be the set of inputs that s     among the properties of the concepts compared, and (6) edit
      requires to produce the set O of outputs. Then, there is        distance similarities: this family of algorithms is based on the
      a ES hyperedge from I to O labelled by s.                       comparison of strings, and calculates the difference between
   It is worth noting that the proposed hypergraph automati-          them as the operations that should be performed on one string
cally models other important aspects regarding the registry-          to obtain another. As these algorithms calculate the distance
published services and ontologies. The hypergraph models              between two concepts, it is necessary to calculate the similarity
an extended notion of subConceptOf between two concepts               from the value provided.
c, d defined in two given ontologies. More specifically, c is            The calculus of mappings is done between pairs of ontolo-
subConceptOf d (hereafter, c 7→ d) if and only if there exists a      gies, hence if we have N ontologies then we have to calculate
path from c to d which consists of subConceptOf relationships         N 2 − N matrixes. Assuming that the mapping between two
(E⊂ hyperedges) and/or equivalentConceptOf relationships              ontologies is a symmetric relationship we need to calculate
(E≡ hyperedges). Moreover, the hypergraph directly repre-             (N 2 − N )/2 matrixes. Yet, our tool calculates the similarity
sents the inter-service dependencies. Indeed, there is a inter-       matrixes between the new ontology and all the previously
service dependency between two services s and t if the head           published ontologies when the new one is registered. Thus,
of a s-labelled ES hyperedge and the tail of a t-labelled ES          each insertion only requires N − 1 calculations.
hyperedge share (at least) a concept. Note that there is a inter-        Once the mappings between two different ontologies are
service dependency between s and t also if there exist two            established, the distance between the ontologies is calculated
concepts cs and ct such that cs 7→ ct , cs belongs to the head        by making use of some ontology distance measurements.
of the s-labelled ES hyperedge and ct belongs to the tail of          Given a matrix with the similarity between pairs of concepts
the t-labelled ES hyperedge.                                          of two ontologies (O1 and O2 ), we can make use of several
   In the following subsections we describe how subConceptOf          measurements based on the definitions of distance in different
relationships, equivalentConceptOf relationships and intra-           contexts. First, we define the directed distance (DD) between
service dependencies can be determined.                               them (as it is an asymmetric measure):
A. The SemFiT application                                                                           #Concepts(O1 )
                                                                       DD(O1 , O2 ) = P
   Semantic Fields [14], [1] are groups of interrelated ontolo-                            c∈Concepts(O1 ) max(mappings(c, O2 ))
gies that may be relevant to a given information request, in
which the client needs to find related ontologies, from a set of             D(O1 , O2 ) = min(DD(O1 , O2 ), DD(O2 , O1 ))
ontologies, for a given set of concepts of a specific ontology.
As searching for relevant concepts from a large number of                We can calculate the distance (which is a symmetric mea-
ontologies is a time-consuming task, one goal of Semantic             sure) between two ontologies (D) in different ways (see
Fields is to reduce the number of candidate ontologies to offer       for example the second formula) depending on the distance
the user a good solution within a reasonable period of time.          concept used. These measures allow us to establish how
   In order to build Semantic Fields, namely to find relation-        similar two ontologies are, and in this way to know whether it
ships between ontologies, we firstly compute the similarity           will be necessary to include the already calculated similarities
                                                                        searches in the ontology hierarchy for parents of the given
                                                                        concept, returning a set of vectors with concept, ontology
                                                                        and similarity value. Since the search of hierarchical
                                                                        relationships is a time consuming task, the hierarchy is
                                                                        precalculated from the OWL file by means of the Jena2
                                                                        parser and stored in an internal database of SemFiT;
                                                                     • searchChildrenForConcept(string ontologyU RI,
                                                                          string concept, int sessionId)
                                                                        searches in the ontology hierarchy for children of the
                                                                        given concept, returning a set of vectors with concept,
                Fig. 1.   Semantic Field Calculation.                   ontology and similarity value. This method makes use of
                                                                        the pre-calculated hierarchy as well.

between their concepts to solve the problem of determining         B. Computing intra-service dependencies
the user Semantic Field.
                                                                      As described above, an intra-service dependency is a hyper-
   Let us suppose that we have the set of registered ontologies
                                                                   edge connecting two sets of nodes, respectively representing
in Figure 1, in which the edge values mean the already
                                                                   the inputs and the outputs of a service. Yet, a service may
calculated distance between pairs of ontologies. Let us assume
                                                                   behave in different ways and features different functionalities.
that the user selects the EconomyDomain ontology, which
                                                                   Consider, for instance, a simple service s which inputs a and
covers his intended domain, and sets up a distance threshold
                                                                   produces as output b or c (i.e., a choice process). s can behave
of 2. In this example the Semantic Field will be composed
                                                                   in two different ways, namely, as a service s1 which inputs
of the ontologies at a distance less than this value, which are
                                                                   a and yields b and as a service s2 which inputs a and yields
those included in the dotted rectangle of Figure 1, i.e., event,
                                                                   c. Hence s exposes two different profiles, which correspond
hotel and e-commerce.
                                                                   to the different intra-service dependencies ({a}, {b}, s1 ) and
   Semantic Fields have been implemented as a configurable
                                                                   ({a}, {c}, s2 ).
tool1 (SemFiT) which can be used directly through a Web
                                                                      The intra-service dependencies of an OWL-S described ser-
Service or Web forms for registered users. The Semantic Field
                                                                   vice can be determined by analysing its process model, which
Tool suitably configured to be employed by the matchmaking
                                                                   details the complete behaviour of the service. We present next
algorithm provides the following methods:
                                                                   the recursive function DataDep, initially invoked over the root
  • startSession()                                                 composite process of a service s, which determines all the
     returns an identifier for the user session, which will be     intra-service dependencies (I, O) of s.
     necessary in other methods;
                                                                     • DataDep(atomic(A)) = {(A.inputs, A.outputs)};
  • insertOntology(string ontologyU RI, int sessionId)               • DataDep(ifThenElse(P1 , P2 ))
     inserts a new ontology (if not already present) in the            = {(I1 , O1 )|(I1 , O1 ) ∈ DataDep(P1 )} ∪
     user session. This process includes the calculation of the          {(I2 , O2 )|(I2 , O2 ) ∈ DataDep(P2 )}
     mappings and distances among the inserted ontology and               S
                                                                     • DataDep(choice(P1 , ..., Pk ))
                                                                       = ki=1 {(I, O)|(I, O) ∈ DataDep(Pi )}
     all the ontologies previously inserted. This calculation
                                                                     • DataDep(sequence(P1 , ..., Pk ))
     implies the use of the framework for ontology matching,           = DataDep(split(P1 , ..., Pk ))
     and creating a set of similarity matrixes which contain           = DataDep(split+join(P1 , ..., Pk ))
     the similarity between concepts of pairs of ontologies (a
     value between 0 and 1 indicating the probability of two
                                                                              S      S
                                                                       = DataDep(any-order(P1 , ..., Pk ))
                                                                       = {( ki=1 Ii , ki=1 Oi )|(Ii , Oi ) ∈ DataDep(Pi )}
     concepts of being the same semantic concept);                   • DataDep(iterate(P ))
                                                                       = DataDep(choice(P , sequence(P, P ), ...,
  • getOntologies(int sessionId)
     returns the ontologies which belong to the user session;                                   | {z }
                                                                                     sequence(P, ..., P )))
                                                                                                 Alt(P )
  • searchEquivalentConcepts(string concept, string
       ontologyU RI, float minSimilarity, int sessionId)              Before defining Alt(P ), we need to specify some assump-
     taking into account the user Semantic Field (the ontolo-      tions. The proposed algorithm for discovering semantic Web
     gies in which he/she is interested in) this method analyses   services works on ontology types. As a consequence, the
     the similarity matrixes (between pairs of ontologies in       algorithm checks whether an output (i.e., a type) can be
     his/her Semantic Field) for finding relationships between     produced by some available service, while it does not take
     concepts. Concepts with a similarity value greater that       care of how many times such output can be produced. This
     minSimilarity are considered equivalent concepts;             assumption allows us to expand the iterate process into a
  • searchParentsForConcept(string ontologyU RI,                   choice of sequences, where each sequence is a possible iterated
       string concept, int sessionId)                              execution of P . Alt(P ), which is defined below, computes how

  1 available at http://khaos.uma.es/semanticfields/                 2 http://jena.sourceforge.net/
many times the process P has to be executed in order to yield            6.          forall concept a in searchParentsForConcept(O, c, sessionID) do
                                                                         7.             if a ∈ V then
all the outputs producible by its atomic process children.               8.                if ∃(D, {a}, nil) ∈ E⊂ then D = D ∪ {c};
  • Alt(atomic(A)) = 1                                                   9.                else Add ({c}, {a}, nil) ∈ E⊂ to E;
  • Alt(ifThenElse(P1 , P2 )) = Alt(P1 ) + Alt(P2 )                     10.          forall concept b in searchChildrenForConcept(O, c, sessionID) do
                                                                        11.             if b ∈ V then
  • Alt(choice(P1 , ..., Pn )) = Alt(P1 ) + ... + Alt(Pn )              12.                if ∃(D, {c}, nil) ∈ E⊂ then D = D ∪ {b};
  • Alt(sequence(P1 , ..., Pn )) = Alt(any-order(P1 , ..., Pn ))        13.                else Add ({b}, {c}, nil) ∈ E⊂ to E;
    = Alt(split(P1 , ..., Pn )) = Alt(split+join(P1 , ..., Pn ))        14.          forall e, similarity in searchEquivalentsForConcept(c, O,
    = max(Alt(P1 ), ..., Alt(Pn ))                                      15.                                          minSimilarity, sessionID) do
                                                                        16.             if e ∈ V then
  • Alt(iterate(P )) = 0                                                17.                Add ({c}, {e}, similarity) ∈ E≡ to E;
   Let us consider, for example, the HotelService, whose                18.                Add ({e}, {c}, similarity) ∈ E≡ to E;
                                                                        19.   forall pairs (In , On ) in DataDep(Root(s)) do
root is a repeat-until composite process, illustrated                   20.      Add(In , On , sn ) ∈ ES to E;
in Figure 2. Alt(chooseOperation) returns two, indeed,
chooseOperation has to be executed twice in order                          For each ontology O employed by s (line 2), the AddSer-
to yield all the outputs producible by its children. Hence,             vice inserts O (if not already present) into the Semantic Field
DataDep(HotelService) returns three profiles: H1 , which                identified by sessionID (line 3). Next, for each concept c ∈ O
inputs {state, city} and produces {infoHotel},                          (line 4), it adds c to the hypergraph (line 5) and determines
H2 , which inputs {hotel, beginDate, lastDate,                          the (possibly empty) sets of the parents and children of c
creditCard, contactInformation} and produces                            by invoking the searchParentsForConcept and searchChil-
{accomodationFee, hotelInvoice}, and H3 , which                         drenForConcept methods of the SemFiT. Then, for each
inputs {state, city, hotel, beginDate, lastDate,                        parent a of c (line 6), AddService adds (if a ∈ V ) a E⊂
creditCard, contactInformation} and produces                            hyperedge (c, a, nil) to E (lines 7–9), while for each child
{infoHotel, accomodationFee, hotelInvoice}.                             concept b of c (line 10), it inserts (if b ∈ V ) a E⊂ hyperedge
   Note that DataDep is a recursive function and that                   (b, c, nil) to E (lines 11–13). Next, it determines the (possible
if an iterate process has an iterate-typed ances-                       empty) set of the equivalent concepts of c by invoking the
tor Q, DataDep first expands Q and then P . That                        searchEquivalentsForConcept method of SemFiT, which
is why we set Alt(iterate(P )) = 0 in the pseudo-                       returns those concepts above a given similarity threshold, i.e.,
code above. It is also worth observing that the defini-                 the minSimilarity parameter. For each equivalent concept e of
tion of DataDep(iterate(P )) is suitable when P has                     c (lines 14–15) AddService adds to E (if e ∈ V ) the two E≡
to be executed at least one (i.e., iterate instantiated as              hyperedges (c, e, similarity) and (e, c, similarity) (lines 16–
repeat-until) as well as when P may be skipped (i.e.,                   18). Note that all the concepts in the ontologies employed by s
iterate instantiated as repeat-while). In the latter case,              will be included in the hypergraph, independently of whether
the definition of DataDep(iterate(P )) has to be expanded               they directly occur in the specification of (some process in) s.
by adding a nop branch to the choice among all possible iterate         Finally, AddService inserts to the hypergraph a ES hyperedge
executions of P .                                                       (In , On , sn ) for each intra-service dependency returned by
                                                                        DataDep (namely, for each profile n of s) (lines 19–20).
C. Building the hypergraph
                                                                        D. Example
   After describing how data and service dependencies are
determined, we can introduce in this subsection the AddSer-                We present next an example which illustrates the behaviour
vice function, which updates the hypergraph whenever a new              of the AddService function. Let us consider an empty reg-
service s is added to the registry.                                     istry where we want to add two OWL-S described services:
   Generally speaking, AddService firstly adds to the hyper-            HotelService, which allows a client to search for and/or to
graph the concepts defined in the ontologies employed by s              reserve hotels, and ConferenceService, which allows a client
and next, it draws the hyperedges representing the subCon-              to register to academic events. Their process models, which
ceptOf relationships, the equivalentConceptOf relationships             employ three different ontologies (hotel, e-commerce and
and the intra-service dependencies between the newly added              event), are shown in Figure 2.
ontology concepts. Note that AddService does not affect                    Let us consider first the HotelService. AddService in-
the efficiency of the matching process, as the hypergraph               serts to the hypergraph all the concepts defined in the
construction is completely query independent and can be pre-            hotel and e-commerce ontologies (the light gray and
computed off-line before query time.                                    dark gray ellipses in Figure 3) together with the subCon-
   The behaviour of AddService can be summarised by the                 ceptOf and the equivalentConceptOf relationships returned by
following pseudo-code, where sessionID is the session num-              SemFiT. Note, for example, the subConceptOf relationships
ber assigned to AddService by the startSession method of                between hotel#city and hotel#physicalLocation,
SemFiT.                                                                 and between e-commerce#creditCard and e-com-
1. AddService(hypergraph H = (V, E), service s, int sessionID)
                                                                        merce#paymentMethods as well as the equivalentCon-
2.    forall ontology O ∈ / getOntologies(sessionID) referred by s do   ceptOf relationships between e-commerce#invoice and
3.       insertOntology(O, sessionID);
4.       forall concept c in O do
                                                                        hotel#invoice. At this point, the AddService inserts also
5.           Add c to V ;                                               the intra-service dependencies of HotelService computed by
                                   Fig. 2.   Process models of the HotelService and ConferenceService.


the DataDep function, i.e., the ES hyperedges H1 , H2 and              {o|∃(I, O, s) ∈ ES ∧ o ∈ O}. We also remind you that 7→
H3 in Figure 3. Next, also ConferenceService is added to               denotes the extended subConceptOf relationship.
the registry. AddService inserts to the hypergraph the con-             1. QuerySolver(hypergraph H, query Q, set composition,
cepts defined in the event ontology, the subConceptOf and               2.                               set neededOutput, set availableOutput)
                                                                        3.    if (neededOutput = ∅) then return composition
equivalentConceptOf relationships computed by SemFiT, as                4.    else
well as the intra-service dependencies of ConferenceService.            5.       out = extract(neededOutput);
                                                                        6.       S = {s|out ∈ Output(s) ∨ (∃c ∈ Output(s)|c 7→ out)};
The resulting hypergraph is shown in Figure 3.                          7.       if (S = ∅) then fail
   It is worth observing that some of the equivalentConceptOf           8.       else
                                                                        9.          forall service s in S do
relationships found by SemFiT appear due to the similarity             10.             composition0 = composition ∪ {s};
between the syntax of the compared concepts (e-commer-                 11.             availableOutput0 = availableOutput ∪ Output(s);
                                                                       12.             neededOutput0 = {x|x ∈ neededOutput ∪ Input(s)
ce#invoice – hotel#invoice, hotel#physical-                            13.                                     ∧ @a ∈ availableOutput0 :
Location – event#location and hotel#city – ev-                         14.                                     a = x ∨ a 7→ x};
                                                                       15.             QuerySolver(H, Q, composition0 ,
ent#city), while other mappings are the result of apply-               16.                                    neededOutput0 , availableOutput0 );
ing searches in wordNet (e.g., event#startDate – ho-
tel#beginDate and event#endDate – hotel#last-                             If neededOutput = ∅, i.e., there are no outputs to be
Date). In this last case, the similarity does not derive from di-      generated, QuerySolver returns the set composition of the
rect synonyms in wordNet, but from the tokens that compose             services found (line 3), which satisfies the functional client
the words. Finally, event#dateTime and hotel#date                      query. Otherwise (line 4) QuerySolver withdraws an output
are similar because of their syntax and their structure (both          out from the set neededOutput (line 5) and computes the set
have children that have same similarity).                              S of the services which produce (a sub concept of) out (line 6).
                                                                       If S is empty, that is, out cannot be generated by any registry-
             III. T HE DISCOVERY ALGORITHM                             published service, then QuerySolver fails (line 7) since the
   The discovery algorithm takes as input a client query               query cannot be fulfilled. Otherwise (line 8) for each service s
specifying the set of inputs and outputs of the desired service        which generates (a sub concept of) out (line 9), QuerySolver
(composition). As we will describe in Section IV, the formula-         adds s to composition (line 10), updates availableOutput by
tion of the query is eased by a suitable interface that displays       adding the outputs of s (line 11), and updates neededOutput
the available concepts with an expandable tree structure. Next,        by adding the inputs of s and by removing the concepts that
the search algorithm explores the hypergraph, by performing            are now available (lines 12–14). Next, QuerySolver continues
a depth-first visit, in order to discover the (compositions of)        recursively (lines 15–16).
services capable of satisfying the client request.                        The complexity of QuerySolver belongs to the N P space
   The discovery algorithm is synthesised by the following             as it determines all the possible solutions by visiting the hy-
recursive function QuerySolver which inputs five parameters:           pergraph non-deterministically. We have tested QuerySolver
the hypergraph H, the client query Q, the set composition              on a small repository containing ten services only, as there are
of the services selected so far (initially empty), the set             few available SWSs on the Web as the SWS area is emerging
neededOutput of the outputs to be generated (initially the             now. Anyway, the average reply-time of QuerySolver is 0.2
query outputs), and the set availableOutput of the outputs             seconds. We imagine that in the future many SWSs will exist,
available (initially the query inputs).                                thus, we intend to improve efficiency by operating in the
   In the pseudo-code listed below, Input(s) denotes the inputs        following directions:
of the service s, that is, the set {i|∃(I, O, s) ∈ ES ∧ i ∈              • to reduce the large number of services taken into account
I} as well as Output(s) denotes the outputs of s, namely                   by the QuerySolver by introducing a suitable service
                              Fig. 3.   The hypergraph after registering HotelService and ConferenceService.


    pre-selection phase (e.g., using UDDI to filter services not       second invocation, QuerySolver withdraws e-commer-
    belonging to certain service categories), and by selecting         ce#registrationReceipt from neededOutput, which
    services which produce a needed output with respect to             is produced by both the profiles C1 and C2 of Conference-
    some heuristics (e.g., the number and/or on the quality            Service. QuerySolver creates then the following composi-
    of the produced outputs) in order to consider the “most            tions: {H2 , C1 } and {H2 , C2 }. While the first one fulfills
    promising” services only.                                          the query, as all the needed outputs are now available, the
  • to introduce caching and indexing techniques in order to           second one fails, as e-commerce#bankAccount cannot
    sensibly decrease the QuerySolver reply-time.                      be produced by any registry-published service. When all
                                                                       instances of QuerySolver terminate, the functional analyser
A. Example                                                             returns to the client two successful compositions: {H2 , C1 }
  Let us continue the example introduced in subsection II-             and {H3 , C1 }.
D. Consider now a client wishing to plan its participation in
an international conference by registering to the conference                                 IV. I MPLEMENTATION
and by booking his hotel accomodation, and receiving the                 In this section we discuss the implementation of our dis-
conference and hotel confirmations. The client query may be            covery system, that we named SAM for Service Aggregation
composed by the following parameters:                                  Matchmaking.
  • inputs – hotel#hotel, event#internationalConfe-
    rence, e-commerce#creditCard, e-commerce#con-
    tactInformation
  • output – e-commerce#invoice, e-commerce#regi-
    strationReceipt.
   As one may note, neither HotelService nor Conference-
Service satisfies the given query by itself. Yet, the query can
be fulfilled by suitably composing the two available services.
QuerySolver takes as parameters the hypergraph depicted in
Figure 3, the query inputs (i.e., availableOutput) and the
query outputs (i.e., neededOutput), while composition is
an empty set. Suppose that QuerySolver withdraws e-com-
merce#invoice from neededOutput. QuerySolver cre-                                          Fig. 4.   The system architecture.
ates two compositions, represented by the hyperedges {H2 }
and {H3 } respectively, as the profiles {H2 } and {H3 } of Ho-            Figure 4 presents the high-level architecture of SAM.
telService both produce hotel#hotelInvoice, which is                   AddService and QuerySolver are Java applications which
a subConceptOf e-commerce#invoice (i.e., the former is                 implement the AddService function (Subsection II-C) and
linked to the latter by means of E⊂ and E≡ hyperedges). Con-           the discovery algorithm (Section III). AddService makes use
sider composition {H2 }. QuerySolver adds to neededOutput              of the CMU OWL-S API3 for parsing OWL-S descriptions.
hotel#beginDate and hotel#lastDate, since they
do not belong to the query inputs. Suppose that, at its                  3 http://www.daml.ri.cmu.edu/owlsapi/
AddService and QuerySolver both access the database, which                      with different ontologies. Still, the ontology crossing task is
is the local repository where SAM stores service descriptions,                  computed at query answering time, hence severely affecting
ontologies and the dependency hypergraph (Section II). A                        the efficiency of the discovery algorithm. An efficient match-
JDBC-ODBC bridge allows the Java applications to access                         ing of semantic Web service capabilities is the goal of [13],
the repository, implemented as a relational (MS Access, for                     which achieves efficiency by pre-processing the available on-
this first prototype) database. It is worth noting that we do                   tologies and by pre-classifying the registry-published services
not introduce Java classes to represent ontologies, profiles,                   before query time. Still, [13] does not address the discovery
concepts and dependencies in order to avoid loading in mem-                     of service compositions.
ory space-consuming Java objects. Indeed, AddService and                           Moreover, it is worth mentioning the METEOR-S project
QuerySolver perform ontological reasoning and hypergraph                        [19], whose objective is the realisation of a framework for
handling by means of database invocations. As described in                      annotating, discovering and composing semantic Web services.
Subsections II-A and II-C, the Java Semantic Field Application                  Yet, METEOR-S is a semi-automated approach requiring a
(i.e., SemFiT) supports AddService to process ontologies and                    strong participation of the user, which is highly involved in
data dependencies.                                                              the process of semi-manually discovering and/or composing
   Finally, we have provided SAM with a suitable Web in-                        services.
terface, which allows clients to submit a new service to the                       As composing services requires to cope with different
system as well as to query the discovery algorithm. Figure 5                    ontologies, several other works have been recently proposed
shows a screenshot of the system interface. In its left panel                   to address the issues of ontology merging and alignment. For
the interface provides a tree view of the available ontologies                  instance, it is worth mentioning [11], [15], which exploit the
to help the client in specifying its query. The input and output                knowledge provided by matching tools that find similarities
concepts selected by the client are shown in the query inputs                   between concepts interpreted as lexical entries, [10], which
and query outputs panels on the right part of the interface.                    makes use of classes labels to compare the taxonomies of
By clicking on the Submit query button, the client queries                      two ontologies, and [20], which does not analyse the classes
the discovery algorithm which returns the result in the bottom                  properties, but takes into account the ontology instances to
panel of the interface. Moreover, the client can submit a new                   related different classes. In order to achieve better results,
service to the system by means of the Submit service button,                    SemFiT developed a framework for ontology crossing which
after providing the service URL.                                                takes advantages of different matching algorithms, as the
   AddService and QuerySolver are Java servlets which run on                    combination of simple matchers has demonstrated its viability
Apache Tomcat. To develop the Web interface we employed                         in several approaches such as the COMA Framework [9].
the Google Web Toolkit, which is a Java development frame-
work that allows to deploy AJAX applications and complex                                        VI. C ONCLUDING REMARKS
Web interfaces by using the Java language. Google Web
Toolkit then translates the Java code to browser-compliant                         In this paper, we have presented a new fully-automated
JavaScript and HTML.                                                            matchmaking system for discovering (compositions of) OWL-
   Figure 5 catches the Web interface as it appears af-                         S semantic Web services capable of satisfying a given client
ter submitting the query described in Subsection III-A. In-                     request. The proposed system is the first one – at the best
deed, the results panel lists the two successful composi-                       of our knowledge – that addresses three main issues of
tions which satisfy the example query. Note the result syn-                     the semantic service discovery domain, namely, composition-
tax [S1 (P11 , . . . , P1n1 ), . . . , Sn (Pn1 , . . . , Pnnn )] where Si de-   oriented discovery, crossing different ontologies and efficiency.
notes the service name and Pi1 , . . . , Pini denote the atomic                 We have also presented a first Web-accessible prototype of our
processes of the selected profile of Si .                                       discovery system.
                                                                                   Our plan for future work includes the development of fresh
                         V. R ELATED WORK                                       indexing and/or ranking techniques (as search engines do
   The discovery of semantic Web services has been firstly                      for Web pages) in order to sensibly improve the efficiency
addressed by Paolucci et al. in [17], where they proposed an                    of our system. A second line of our future work is to
algorithm performing a functionality matching between ser-                      enhance our discovery system by employing a behavioural
vice requests and service advertisements described as DAML-                     analyser module, which analyses the behavioural information
S (the predecessor of OWL-S) service profiles. Yet, [17] does                   advertised in the (OWL-S) service descriptions to determine
not deal with different ontologies, nor it considers service                    whether the candidate services (selected by the functional
composition, which instead is the goal of [3], [12], [8].                       analyser) can really be composed together and satisfy the
These approaches discover compositions of OWL-S described                       query without dead-locking, in the line of [5]. Finally, we plan
services by operating in the domain of hypergraphs, finite state                to extend SemFiT by employing other existing matching tools
automata and interface automata, respectively. However, [3],                    to achieve better results for the ontology matching problem.
[12], [8] do not address the task of crossing ontologies.                       Our long-term goal is to develop a well-founded methodology
   In [2] Aversano et al. extended [17] by proposing a                          to support an efficient and fully-automated discovery and
composition-oriented discovery algorithm capable of coping                      composition of Web services.
                                                            Fig. 5.   The Web interface of SAM.


                              R EFERENCES                                        [12] S. B. Mokhtar, N. Georgantas, and V. Issarny, “Ad Hoc Composition
                                                                                      of User Tasks in Pervasive Computing Environment,” in Software
 [1] J. F. Aldana-Montes, I. Navas-Delgado, and M. del Mar Roldan-Garcia,             Composition, LNCS 3628, T. Gschwind, U. Aßmann, and O. Nierstrasz,
     “Solving Queries over Semantically Integrated Biological Data Sources,”          Eds. Springer-Verlag, 2005.
     in Int. Conf. on Web-Age Information Management. LNCS 3129, 2004.           [13] S. B. Mokhtar, A. Kaul, N. Georgantas, and V. Issarny, “Towards Effi-
 [2] L. Aversano, G. Canfora, and A. Ciampi, “An Algorithm for Web                    cient Matching of Semantic Web Service Capabilities,” in Proceedings
     Service Discovery through Their Composition,” in IEEE International              of WS-MATE 2006, 2006.
     Conference on Web Services (ICWS’04), L. Zhang, Ed. IEEE Computer           [14] I. Navas-Delgado, I. Sanz, J. F. Aldana-Montes, and R. Berlanga,
     Society, 2004, pp. 332–341.                                                      “Automatic Generation of Semantic Fields for Resource Discovery in
 [3] B. Benatallah, M.-S. Hacid, C. Rey, and F. Toumani, “Request                     the Semantic Web,” in 16th Int. Conf. on Database and Expert Systems
     Rewriting-Based Web Service Discovery,” in The Semantic Web - ISWC               Applications. LNCS 3588, 2005.
     2003, LNCS 2870, G. Goos, J. Hartmanis, and J. van Leeuwen, Eds.            [15] N. Noy and M. Musen, “Prompt: Algorithm and Tool for Automated
     Springer-Verlag, 2003, pp. 242–257.                                              Ontology Merging and Alignment,” in Proc. of the Nat. Conf. on
 [4] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” in               Artificial Intelligence, 2000.
     Scientific American, 2001.                                                  [16] OWL-S Coalition, “OWL-S 1.1,” 2004, http://www.daml.org/services/
 [5] A. Brogi and S. Corfini, “Behaviour-aware discovery of Web service               owl-s/1.1/.
     compositions,” in University of Pisa, Department of Computer Science        [17] M. Paolucci, T. Kawamura, T. Payne, and K. Sycara, “Semantic Match-
     - Tech. Rep. TR-06-08, June 2006.                                                making of Web Services Capabilities,” in 1th Int. Conf. on The Semantic
 [6] A. Brogi, S. Corfini, J. Aldana, and I. Navas, “Automated Discovery              Web, LNCS 2342, I. Horrocks and J. Hendler, Eds. Springer-Verlag,
     of Compositions of Services Described with Separate Ontologies,” in              2002, pp. 333–347.
     ICSOC 2006. LNCS 4294, A. Dan and W. Lamersdorf, Eds. Springer-             [18] R. Al-Halimi et al., WordNet - An Electronic Lexical Database, C. Fell-
     Verlag, 2006, pp. 509–514.                                                       baum, Ed. MIT Press, 1998.
 [7] G. Gallo, G. Longo, S. Nguyen, and S. Pallottino, “Directed hypergraphs     [19] P. Rajasekaran, J. A. Miller, K. Verma, and A. P. Sheth, “Enhancing
     and applications,” Discrete Applied Mathematics, vol. 42, no. 2, pp. 177–        Web Services Description and Discovery to Facilitate Composition,”
     201, 1993.                                                                       in Semantic Web Services and Web Process Composition, LNCS 3387,
 [8] S. Hashemian and F. Mavaddat, “A Graph-Based Approach to Web                     J. Cardoso and A. Sheth, Eds. Springer-Verlag, 2005, pp. 55–68.
     Services Composition,” in SAINT 2005, IEEE Computer Society, Ed.            [20] G. Stumme and A. Madche, “Fca-merge: Bottom-up Merging of Ontolo-
     CS Press, 2005, pp. 183–189.                                                     gies,” in In 7th Intl. Conf. on Artificial Intelligence (IJCAI ’01), 2001,
 [9] D. Hong-Hai and R. Herhard, “COMA - A System for Flexible Combi-                 pp. 225–230.
     nation of Schema Matching Approches,” in Proc. of the 28th Int. Conf.       [21] W3C, “The UDDI Technical White Paper,” 2000, http://www.uddi.org/.
     on Very Large Databases (VLDB), Hongkong, 2002.                             [22] ——, “Simple Object Access Protocol (SOAP) 1.2, W3C working draft,
[10] A. Maedche and S. Staab, “Measuring Similarity between Ontologies,”              17 December 2001,” 2001, http://www.w3.org/TR/2001/WD-soap12-
     in Proc. Of the European Conference on Knowledge Acquisition and                 part0-20011217/.
     Management - EKAW-2002, LNCS/LNAI 2473, 2002, pp. 251–263.                  [23] ——, “Web Service Description Language (WSDL) 1.1,” 2001,
[11] D. McGuinness, R. Fikes, J. Rice, and Wilder, “The Chimaera Ontology             http://www.w3.org/TR/wsdl.
     Environment,” in Proc. of the 17th Nat. Conf. on Artificial Intelligence,
     2000, pp. 1123–1124.