<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Path-Based Semantic Annotation for Web Service Discovery</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julius Köpke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dominik Joham</string-name>
          <email>dominik.joham@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johann Eder</string-name>
          <email>johann.eder@aau.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Informatics-Systems, Alpen-Adria-Universität Klagenfurt</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <fpage>81</fpage>
      <lpage>88</lpage>
      <abstract>
        <p>Annotation paths are a new method for semantic annotation which overcomes the limited expressiveness of semantic annotations by concept references as defined in the SAWSDL standard. In this work we present a preliminary evaluation of the feasibility of annotation paths for web service discovery. The experiments suggest that annotation paths can capture the semantics of XML schemas and web service descriptions more precisely and appear to be a promising approach for improving web service discovery.</p>
      </abstract>
      <kwd-group>
        <kwd>Web Service Matching</kwd>
        <kwd>Service Discovery</kwd>
        <kwd>Semantic Annotation</kwd>
        <kwd>SAWSDL</kwd>
        <kwd>Annotation Paths</kwd>
        <kwd>XML-Schema Matching</kwd>
        <kwd>Semantic Matching</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Web service discovery aims at (semi-)automating the search for suitable web
services. A web service discovery system accepts a service request (a specification
of the needed web service) and a set of web service descriptions (advertisements)
as input and returns a list of web service descriptions ranked by relevance to
the request. There are many different approaches, ranging from the structural or
lexical comparison of requests and advertisements to approaches that are based
on the explicit definition of the semantics using ontologies [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. We specifically
address the usage of external knowledge provided by semantic annotations with
a reference ontology using SAWSDL [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] annotations. The W3C
recommendation SAWSDL (Semantic Annotations for WSDL and XML-Schema) specifies
a light-weight approach for the annotation of web services with arbitrary
semantic models (e.g. ontologies). SAWSDL introduces additional attributes for
XML-Schema and WSDL documents. ModelReferences refer to ontology
concepts, and Lifting- and Lowering-Mappings refer to arbitrary scripts that
transform the input and output XML data to and from instances of some semantic
model. ModelReferences are proposed for service discovery, while Lifting- and
Lowering-Mappings are proposed for service invocation and only apply to the
annotation of inputs and outputs defined by XML-Schema.
      </p>
      <p>
        Our previous work focused on the annotation of XML-schemas with reference
ontologies in order to automate the generation of executable schema mappings
for document transformations [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">8–10, 7</xref>
        ]. We could show that the expressiveness of
SAWSDL is not sufficient for the generation of schema mappings when general
reference ontologies are directly used for the annotation. Therefore, we have
proposed an extended annotation method that is based on annotation paths
rather than single concept annotations. Since this method already showed its
usefulness for XML-document transformations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] we assume that annotation
paths can also improve web service discovery. The general hypothesis is that if
the annotation method allows a more precise definition of the semantics then
the precision of service matching for service discovery can be improved. Existing
approaches for SAWSDL based service discovery such as [
        <xref ref-type="bibr" rid="ref3 ref5">5, 3</xref>
        ] can partly solve
the problem of imprecise semantic annotations by using additional dimensions
such as structure or textual similarity.
      </p>
      <p>To give a first answer to this hypothesis, we discuss the usage of annotation
paths for web service discovery and report some preliminary results.</p>
    </sec>
    <sec id="sec-2">
      <title>Annotation Path Method</title>
      <p>
        We first use an example to show the limitations of simple concept references for the
annotation of arbitrary XML-schemas or web service descriptions with existing
reference ontologies. We then present the general concept of annotation paths
(for details we refer to [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]).
      </p>
      <sec id="sec-2-1">
        <title>Example</title>
        <p>
          The SAWSDL [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] standard addresses semantic annotations for both web service
descriptions and XML-schemas, which are related since WSDL descriptions
use XML-Schema to define the inputs and outputs of operations. The
XML-Schema document shown in Figure 2 is annotated using simple concept references
referring to the ontology shown in Figure 1 using the sawsdl:modelReference
attribute.
        </p>
        <p>The annotated document in Figure 2 exhibits the following problems:</p>
        <list list-type="bullet">
          <list-item>
            <p>The elements BuyerZipcode and BuyerStreet cannot be annotated because
the zip-code is modeled as a data-type property and not as a concept
in the ontology.</p>
          </list-item>
          <list-item>
            <p>The BuyerCountry element is annotated with the concept Country. This
does not fully express the semantics because we do not know that the
element should contain the country of the buying party. In addition, the
SellerCountry element has exactly the same annotation, so the two elements cannot be
distinguished.</p>
          </list-item>
          <list-item>
            <p>The attribute Price is annotated with the concept Price. Unfortunately, this
does not capture the semantics. We do not know the subject of the price (an
item) and we do not know the currency.</p>
          </list-item>
        </list>
        <p>We have always used exactly one concept for the annotation in the example.
SAWSDL also supports lists of concepts in the modelReference attribute,
but it does not allow specifying the relations between the concepts in this list.
Therefore, this does not help to solve the shown problems. In the examples
above we have only annotated data-carrying elements. If we additionally
annotated the parent elements (in this case the order element), we could add
a bit more semantic information: it would be clear that the annotations of the
child elements of the order element are to be seen in the context of an order.
Unfortunately, this would not resolve the ambiguity between the BuyerCountry
and the SellerCountry elements. In general, it would require a strong structural
relatedness between the ontology and the annotated XML-Schema or service
description, which we cannot guarantee when many different schemas or
services are annotated with a single reference ontology. In addition, SAWSDL does
not define any relations between the annotations of parent and
child elements. One solution for these imprecise annotations is the usage of a
more specific reference ontology that contains concepts which fully match the
semantics of each annotated element. For example, it would need to contain the
concepts InvoiceBuyerCountry and InvoiceBuyerZipCode. However, enhancing a
general reference ontology with all possible combinations of concepts leads to a
combinatorial explosion.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Annotation Path Method</title>
        <p>
          We propose a new annotation method based on annotation path expressions
that are sequences of steps referring to concepts and properties of a reference
ontology. The first step of an annotation path is always a concept. The last
step of an annotation path can be a concept or a data-type property. Between
two concept steps there is always an object property step. Concept steps can
have constraints denoted in square brackets. Fig. 3 gives some examples
of annotation paths; more details can be found in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
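        <p>The step rules stated above can be illustrated with a small parser. The following Python fragment is only a sketch under stated assumptions, not the authors' implementation: the step syntax (a name, optionally followed by a constraint in square brackets), the class Step, and the function parse_annotation_path are hypothetical, and steps are classified purely by their position because no ontology is consulted.</p>
        <preformat>
```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed step syntax: Name or Name[constraint]. This pattern is an
# illustration and is not taken from the grammar of the annotation path method.
STEP = re.compile(r"^([A-Za-z_][\w-]*)(?:\[([^\]]+)\])?$")


@dataclass(frozen=True)
class Step:
    kind: str                      # "concept", "object_property" or "datatype_property"
    name: str
    constraint: Optional[str] = None


def parse_annotation_path(path: str) -> list[Step]:
    """Split a path into steps and classify each step by its position."""
    parts = [p for p in path.strip("/").split("/") if p]
    if not parts:
        raise ValueError("empty annotation path")
    last = len(parts) - 1
    steps = []
    for i, part in enumerate(parts):
        m = STEP.match(part)
        if m is None:
            raise ValueError(f"malformed step: {part!r}")
        name, constraint = m.groups()
        if i % 2 == 0:
            # The first step and every second step after it are concepts.
            kind = "concept"
        else:
            # Between two concepts there is an object property; a property
            # in the final position is treated as a data-type property.
            kind = "datatype_property" if i == last else "object_property"
            if constraint is not None:
                raise ValueError("only concept steps may carry constraints")
        steps.append(Step(kind, name, constraint))
    return steps


steps = parse_annotation_path("Order/billTo/Buyer[Mr Smith]/hasCountry/Country")
```
        </preformat>
        <p>A real implementation would check each step against the reference ontology instead of relying on step positions.</p>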
        <p>Annotation paths can automatically be represented as OWL2
concepts, which can be used to extend the reference ontology. For example, the
path Order/billTo/Buyer[Mr Smith]/hasCountry/Country is represented as a
subclass of Country that has an inverse hasCountry relation to a Buyer whose
name is Mr. Smith and who has an inverse billTo relation to an Order. This can be
represented by the OWL expression Country and inv (hasCountry) some (Buyer
and {Mr Smith} and inv (billTo) some (Order)).</p>
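        <p>The translation of a path into such a class expression can be sketched as a fold over the steps, starting at the first concept. The Python fragment below is a hedged illustration only: the function names path_to_owl and concept_expr are hypothetical, the output is a plain string that mimics the notation of the expression above, and paths ending in a data-type property are deliberately rejected because their translation is not described here.</p>
        <preformat>
```python
import re

# Assumed step syntax: Name or Name[constraint] (an illustration only).
STEP = re.compile(r"^([A-Za-z_][\w-]*)(?:\[([^\]]+)\])?$")


def concept_expr(step: str) -> str:
    """Render one concept step, adding the constraint as an enumeration."""
    m = STEP.match(step)
    if m is None:
        raise ValueError(f"malformed step: {step!r}")
    name, constraint = m.groups()
    return name if constraint is None else f"{name} and {{{constraint}}}"


def path_to_owl(path: str) -> str:
    """Build a nested class expression for a path that ends in a concept."""
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) % 2 == 0:
        raise ValueError("this sketch only handles paths that end in a concept step")
    expr = concept_expr(parts[0])
    # Walk the (property, concept) pairs; each pair wraps the expression
    # built so far in an inverse existential restriction.
    for prop, concept in zip(parts[1::2], parts[2::2]):
        expr = f"{concept_expr(concept)} and inv ({prop}) some ({expr})"
    return expr


owl = path_to_owl("Order/billTo/Buyer[Mr Smith]/hasCountry/Country")
```
        </preformat>
        <p>For the example path, the resulting string is the class expression given in the paragraph above.</p>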
        <p>The extraction of annotations from a schema first requires rewriting the schema
in order to cope with reused elements. The resulting schema may contain
additional annotations. Since schema elements can refer to other schema elements
and types, the full annotation path has to be concatenated from the annotation
paths of the elements. For example, an XML element DeliveryAddress is itself
annotated with the annotation path /Order/deliverTo/Address. It has a type
definition address. The address type itself contains various elements. One of them
is street, which is annotated with /Address/hasStreet. In order to construct the
complete semantics of the street element, which is a child element of
DeliveryAddress, we get the additional path /Order/deliverTo/Address/hasStreet.</p>
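        <p>The concatenation in this example can be sketched as follows. The Python fragment is a minimal illustration: the function concat_paths is hypothetical, and the join rule (the child path must start with the concept on which the parent path ends) is an assumption inferred from the single example above. Constraints in square brackets are not treated specially.</p>
        <preformat>
```python
def concat_paths(parent: str, child: str) -> str:
    """Join a parent path and a child path that overlap in one concept step."""
    p = [s for s in parent.split("/") if s]
    c = [s for s in child.split("/") if s]
    if not p or not c:
        raise ValueError("both paths need at least one step")
    # Assumed join rule: the last step of the parent equals the first
    # step of the child, and the shared step is emitted only once.
    if p[-1] != c[0]:
        raise ValueError(f"cannot join: {p[-1]!r} != {c[0]!r}")
    return "/" + "/".join(p + c[1:])


full = concat_paths("/Order/deliverTo/Address", "/Address/hasStreet")
```
        </preformat>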
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Path-Based Service Matching Prototype</title>
      <p>
        To apply the annotation path method to web service discovery, we implemented
a logic-based service matcher [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] that operates only on path-based annotations
of the inputs and outputs of operations. No other dimensions of the service
descriptions are used for matching. The assumption is that two operations with
the same inputs and outputs are likely to be the same operation. We do not
address the annotation of operations themselves. In order to rank different web
services according to a request, we automatically generate one XML-Schema
for the inputs and one XML-Schema for the outputs of each operation of the
advertisements and the request. These schemas are then matched and an overall
confidence value for the service match in the interval [0, 1] is computed. The
ranking is then based on the confidence values. The matching process of the
schemas operates in four phases:
        <list list-type="bullet">
          <list-item>
            <p>Annotation Path Extraction: The input and output schemas of each
operation are transformed, using the COMA3 [<xref ref-type="bibr" rid="ref11">11</xref>] library,
to an internal tree representation in which no types are reused. The annotation paths are rewritten as
described in the last paragraph of Sect. 2.2. Finally, all annotation paths are
extracted from the resulting tree.</p>
          </list-item>
          <list-item>
            <p>Extended Ontology Generation: The annotations are transformed to OWL
concepts and an extended reference ontology is created.</p>
          </list-item>
          <list-item>
            <p>Matching and Mapping: The XML-Schemas of the request and of each
advertisement are matched based on the annotations using a standard OWL
reasoner (Pellet). Two schema elements, s1 from the source schema and t1
from the target schema, match if the annotation concept (the
corresponding annotation path represented as an OWL concept) of s1 is equivalent
to the annotation concept of t1 or if there is a subclass or superclass
relation between the annotation concepts of s1 and t1. In case of equivalence, the confidence value of the
match is 1. In case of a subclass match, the confidence value of the match
is 0.8, weighted by the concept distance between the annotation concepts of
s1 and t1 in the extended reference ontology. In case of a superclass to
subclass match, the confidence value is 0.6, also weighted by the distance in the
ontology. After the confidence values are computed for each combination of
elements of the source and target schema, a schema mapping is created based
on the best matching elements.</p>
          </list-item>
          <list-item>
            <p>Ranking: Finally, an overall confidence value of each schema mapping is
computed by aggregating the confidence values of the mapping elements using
the min, max, or avg strategy, and the advertisements are sorted in descending order
of their overall confidence values.</p>
          </list-item>
        </list>
      </p>
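      <p>The confidence computation and the ranking phase can be sketched as follows. This Python fragment is a simplified illustration and not the prototype: a toy subclass table replaces the OWL reasoner, name equality stands in for concept equivalence, and all names (PARENT, distance_up, confidence, rank) are hypothetical. Two points are assumptions, because the text does not fix them: the base values 0.8 and 0.6 are weighted by dividing by the number of subclass edges, and the 0.8 case is taken to mean that the source concept is a subclass of the target concept.</p>
      <preformat>
```python
from statistics import mean
from typing import Optional

# Toy subclass hierarchy (child -> parent), standing in for the reasoner's
# classification of the extended reference ontology. All names are hypothetical.
PARENT = {"EuroPrice": "Price", "Price": "Thing", "Book": "Thing"}


def distance_up(sub: str, sup: str) -> Optional[int]:
    """Count subclass edges from sub up to sup; None if sup is not an ancestor."""
    steps, node = 0, sub
    while node != sup:
        if node not in PARENT:
            return None
        node, steps = PARENT[node], steps + 1
    return steps


def confidence(source: str, target: str) -> float:
    if source == target:                 # stands in for concept equivalence
        return 1.0
    d = distance_up(source, target)      # source concept is a subclass of target
    if d is not None:
        return 0.8 / d                   # assumed weighting: divide by distance
    d = distance_up(target, source)      # source concept is a superclass of target
    if d is not None:
        return 0.6 / d
    return 0.0


def rank(request, adverts, aggregate=mean):
    """Order advertisements by the aggregated best-match confidence."""
    scored = []
    for name, elements in adverts.items():
        # For every request element keep the best matching advertisement element.
        best = [max(confidence(r, a) for a in elements) for r in request]
        scored.append((name, aggregate(best)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


request = ["Book", "EuroPrice"]
adverts = {"A": ["Book", "EuroPrice"], "B": ["Book", "Price"], "C": ["Thing"]}
ranking = rank(request, adverts)         # avg strategy; pass min or max to switch
```
      </preformat>
      <p>With the avg strategy, advertisement A is ranked first, followed by B and C. Passing Python's built-in min or max as the aggregate argument switches to the other two strategies.</p>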
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
      <p>
        The goal of the evaluation is to provide preliminary evidence on whether the
annotation path method leads to better results in service discovery. Therefore, we
have evaluated [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] our service matcher, which exploits only path-based semantic
annotations, against existing SAWSDL-based service matchers. The assumption
is that if this simple service matcher can compete with state-of-the-art service
matchers that exploit far more aspects of a service and use advanced techniques
such as machine learning, then the usage of annotation paths is also promising
for service discovery.
      </p>
      <p>We have annotated a subset of the SAWSDL-TC3<sup>1</sup> data-set with our
annotation path method and have evaluated our matcher against service matchers that
took part in the International Semantic Service Selection Contests<sup>2</sup>. We have
evaluated two scenarios:</p>
      <list list-type="bullet">
        <list-item>
          <p>Scenario 1: The goal of this scenario was to evaluate how well our simple matcher
can compete against current state-of-the-art matchers based on existing
requests and advertisements of the SAWSDL-TC3 data-set. We selected
one arbitrary request (book-price) and 40
advertisements and annotated them manually using the annotation path method.
Our matcher operated on requests and advertisements annotated
with annotation paths, and the reference matchers used the original
annotations and advertisements of the TC3 data-set.</p>
        </list-item>
        <list-item>
          <p>Scenario 2: The goal of the second scenario was to assess how our matcher
competes against other matchers if the semantics cannot be expressed by
simple concept annotations. In this case, matchers operating on simple
concept annotations can only infer the missing semantics by exploiting other
dimensions such as the structure or naming of elements. The second
scenario was also evaluated using existing advertisements of the SAWSDL-TC3
data-set; we changed only the request. We now request the price of
books in Euro, excluding tax, and we restrict the input to science fiction
comics. This cannot be expressed with the used ontologies because no such
concepts exist. However, a hint for the standard SAWSDL matchers was
provided by the requested output type (EuroPriceExcludingVAT) and input
type (ScienceFictionComic).</p>
        </list-item>
      </list>
      <sec id="sec-4-1">
        <p><sup>1</sup> http://projects.semwebcentral.org/projects/sawsdl-tc/ <sup>2</sup> http://www-ags.dfki.uni-sb.de/~klusch/s3/index.html</p>
        <p>
          We have executed the evaluation with the Service Matchmaker and Execution
Environment (SME2<sup>3</sup>), which is also used for the International Semantic Service
Selection Contests. Because we used only a part of the TC3 data-set, we were not able to execute
all matchers. However, we could execute two major representatives, iSem [
          <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
          ]
and SAWSDL-MX [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The iSem matcher, when applied to SAWSDL, is a hybrid
service matcher that exploits inputs, outputs, and service names. It employs
strict and approximated logical matching, text-similarity-based matching, and
structural matching, and it automatically adjusts its aggregation and ranking
parameters using machine learning. It reached the best binary precision in the
contest of 2012. SAWSDL-MX is a typical representative of a hybrid matcher
using logic-based and syntax-based matching. The SAWSDL-TC3 data-set is
annotated with relevance grades for each combination of advertisements and requests.
A relevance grade is a value between 0 and 3, where 0 stands for not relevant and
3 stands for highly relevant. We used the original relevance grades for Scenario 1
and asked an independent expert to provide the relevance grades for Scenario 2.
We have assessed the overall performance of each matcher based on the achieved
Normalized Discounted Cumulative Gain [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] (NDCG), which is also used in the
International Semantic Service Selection Contests. The results of both scenarios
are shown in Table 1. In Scenario 1, our matcher performed more than 4 percent better than
the SAWSDL-MX matcher and was around 1 percent less precise than the nearly
perfect iSem matcher. In the second scenario, our matcher performed around 9
percent better than iSem and around 12 percent better than SAWSDL-MX.
        </p>
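        <p>For readers unfamiliar with NDCG, the following Python fragment sketches how the measure can be computed from graded relevance values such as the grades 0 to 3 used here. It is an illustration under assumptions: it uses the common logarithmic discount log2(rank + 1), which may differ in detail from the variant implemented in SME2, and it normalizes against the ideal ordering of the ranked items only. The function names dcg and ndcg are hypothetical.</p>
        <preformat>
```python
import math


def dcg(grades):
    """Discounted cumulative gain of relevance grades listed in rank order."""
    # enumerate starts at 0, so rank 1 gets the discount log2(2) = 1.
    return sum(g / math.log2(i + 2) for i, g in enumerate(grades))


def ndcg(ranked_grades):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(ranked_grades, reverse=True))
    return dcg(ranked_grades) / ideal if ideal > 0 else 0.0


perfect = ndcg([3, 2, 1, 0])   # grades already in the ideal order
swapped = ndcg([0, 3])         # the only relevant item is ranked second
```
        </preformat>
        <p>A ranking that lists the advertisements in order of decreasing relevance grade reaches the maximum value of 1, and every relevant advertisement placed too low reduces the score.</p>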
        <p>While these preliminary results do not yet allow final conclusions to be drawn,
the annotation path approach is promising for improving web service discovery.
Our simple path-based matcher clearly showed its advantage in Scenario 2,
and in Scenario 1 it competed well with existing state-of-the-art matchers,
which use far more advanced matching methods and additional aspects of service
descriptions and advertisements.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>The annotation path method for semantic annotation has been developed to
overcome limitations in the expressiveness of simple concept references. We
showed in some feasibility tests that even a simple implementation of an
annotation-path-based XML-schema matcher used for comparing web service
advertisements with service requests can successfully compete with state-of-the-art
web service discovery systems. We therefore conclude that annotation paths
are well suited for capturing the semantics of objects in much finer detail and
that the annotation path method and matchers based on it are promising
approaches to improve web service discovery. Encouraged by these results,
we plan to evaluate our annotation-path-based matcher with a larger data-set
and against additional existing matchers. Other future work is to integrate our
matcher into existing state-of-the-art matchers to gain even better results.
Another direction of future work is to evaluate not only the matching precision but
also the minimum amount of manual work needed to semi-automatically create
annotation path annotations in comparison to simple concept references.</p>
      <sec id="sec-5-1">
        <p><sup>3</sup> http://projects.semwebcentral.org/projects/sme2/</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Dominik</given-names>
            <surname>Joham</surname>
          </string-name>
          .
          <article-title>Path-based semantic annotation of web service descriptions for improved web service discovery</article-title>
          .
          <source>Master-thesis</source>
          , AAU Klagenfurt,
          <year>February 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Jaana</given-names>
            <surname>Kekäläinen</surname>
          </string-name>
          .
          <article-title>Binary and graded relevance in IR evaluations-Comparison of the effects on ranking of IR systems</article-title>
          .
          <source>INFORM PROCESS MANAG</source>
          ,
          <volume>41</volume>
          (
          <issue>5</issue>
          ):
          <fpage>1019</fpage>
          -
          <lpage>1033</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Klusch</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Kapahnke</surname>
          </string-name>
          .
          <article-title>isem: Approximated reasoning for adaptive hybrid selection of semantic services</article-title>
          .
          <source>In Semantic Computing (ICSC)</source>
          ,
          <source>2010 IEEE Fourth International Conference on</source>
          , pages
          <fpage>184</fpage>
          -
          <lpage>191</lpage>
          ,
          <year>Sept 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Klusch</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Kapahnke</surname>
          </string-name>
          .
          <article-title>The isem matchmaker: A flexible approach for adaptive hybrid semantic service selection</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          ,
          <volume>15</volume>
          (
          <issue>3</issue>
          ),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Klusch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Kapahnke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ingo</given-names>
            <surname>Zinnikus</surname>
          </string-name>
          .
          <article-title>Hybrid adaptive web service selection with sawsdl-mx and wsdl-analyzer</article-title>
          .
          <source>In The Semantic Web: Research and Applications</source>
          , volume
          <volume>5554</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>550</fpage>
          -
          <lpage>564</lpage>
          . Springer Berlin Heidelberg,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
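          6.
          <string-name>
            <given-names>Jacek</given-names>
            <surname>Kopecký</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Vitvar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Carine</given-names>
            <surname>Bournez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Joel</given-names>
            <surname>Farrell</surname>
          </string-name>
          .
          <article-title>Sawsdl: Semantic annotations for wsdl and xml schema</article-title>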
          .
          <source>IEEE Internet Comput.</source>
          ,
          <volume>11</volume>
          (
          <issue>6</issue>
          ):
          <fpage>60</fpage>
          -
          <lpage>67</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Julius</given-names>
            <surname>Köpke</surname>
          </string-name>
          .
          <article-title>Declarative Semantic Annotations for XML Document Transformations and their Maintenance</article-title>
          .
          <source>Phd-thesis</source>
          , AAU Klagenfurt,
          <year>March 2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Julius</given-names>
            <surname>Köpke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Johann</given-names>
            <surname>Eder</surname>
          </string-name>
          .
          <article-title>Semantic annotation of xml-schema for document transformations</article-title>
          .
          <source>In Proc. of OTM'10 Workshops</source>
          , volume
          <volume>6428</volume>
          <source>of LNCS</source>
          , pages
          <fpage>219</fpage>
          -
          <lpage>228</lpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Julius</given-names>
            <surname>Köpke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Johann</given-names>
            <surname>Eder</surname>
          </string-name>
          .
          <article-title>Semantic invalidation of annotations due to ontology evolution</article-title>
          .
          <source>In Proc. 2011 of OTM'</source>
          <year>2011</year>
          , volume
          <volume>7045</volume>
          <source>of LNCS</source>
          , pages
          <fpage>763</fpage>
          -
          <lpage>780</lpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
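          <string-name>
            <given-names>Julius</given-names>
            <surname>Köpke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Johann</given-names>
            <surname>Eder</surname>
          </string-name>
          .
          <article-title>Logical invalidations of semantic annotations</article-title>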
          .
          <source>In Proc. of CAiSE'12</source>
          , volume
          <volume>7328</volume>
          <source>of LNCS</source>
          , pages
          <fpage>144</fpage>
          -
          <lpage>159</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
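          <string-name>
            <given-names>Sabine</given-names>
            <surname>Maßmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Salvatore</given-names>
            <surname>Raunich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>David</given-names>
            <surname>Aumüller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Arnold</surname>
          </string-name>
          , and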
          <string-name>
            <given-names>Erhard</given-names>
            <surname>Rahm</surname>
          </string-name>
          .
          <article-title>Evolution of the coma match system</article-title>
          .
          <source>In OM</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Debajyoti</given-names>
            <surname>Mukhopadhyay</surname>
          </string-name>
          and
          <string-name>
            <given-names>Archana</given-names>
            <surname>Chougule</surname>
          </string-name>
          .
          <article-title>A survey on web service discovery approaches</article-title>
          . In David C. Wyld, Jan Zizka, and Dhinaharan Nagamalai, editors,
          <source>Advances in Computer Science, Engineering &amp; Applications</source>
          , volume
          <volume>166</volume>
          <source>of Advances in Intelligent and Soft Computing</source>
          , pages
          <fpage>1001</fpage>
          -
          <lpage>1012</lpage>
          . Springer Berlin Heidelberg,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>