Evaluating Semantic Web Service Tools using the SEALS platform

Liliana Cabral (1), Ioan Toma (2)
(1) Knowledge Media Institute, The Open University, Milton Keynes, UK
(2) STI Innsbruck, University of Innsbruck, Austria
l.s.cabral@open.ac.uk, ioan.toma@sti2.at

Proceedings of the International Workshop on Evaluation of Semantic Technologies (IWEST 2010). Shanghai, China. November 8, 2010.

Abstract. This paper describes the approach for the automatic evaluation of Semantic Web Service (SWS) tools, based on the infrastructure under development within the SEALS project. We describe the design of evaluations, considering existing test suites as well as the repository management and evaluation measure services that will enable evaluation campaign organizers and participants to evaluate SWS tools. Currently, we focus on the SWS discovery activity, which consists of finding Web Services based on their semantic descriptions. Tools for SWS discovery or matchmaking can be evaluated on retrieval performance: for a given goal, i.e. a semantic description of a service request, and a given set of service descriptions, i.e. semantic descriptions of service offers, the tool returns the match degree between the goal and each service, and the platform measures the rate of matching correctness based on a number of metrics.

Keywords: Semantic Web Services, automatic evaluation.

1 Introduction

The evaluation of Semantic Web Services is currently being pursued by a few initiatives using different evaluation methods. Although these initiatives have succeeded in creating an initial evaluation community in this area, they have been hindered by the difficulties in creating large-scale test suites and by the complexity of the manual testing involved. In principle, it is very important to create test datasets in which semantics play a major role in solving the problem scenarios; otherwise comparison with non-semantic systems will not be significant, and in general it will be very difficult to measure tools or approaches based purely on the value of semantics. Therefore, an infrastructure for the evaluation of SWS that supports the creation and sharing of evaluation artifacts and services, making them widely available and registered according to problem scenarios using an agreed terminology, can benefit both evaluation participants and organizers.

In this paper we describe the approach for the automatic evaluation of Semantic Web Services using the SEALS platform, that is, the services of the SEALS platform for SWS tools. The SEALS (Semantic Evaluation at Large Scale) project (http://www.seals-project.eu/) aims to create a lasting reference infrastructure for semantic technology evaluation (the SEALS Platform). The SEALS Platform will be an independent, open, scalable, extensible and sustainable infrastructure that will allow online evaluation of semantic technologies by providing an integrated set of evaluation services and a large collection of datasets. Semantic Web Services are one of the technologies supported by SEALS. At the time of writing, the SEALS project has completed 15 months of its 36-month duration; hence, the results presented in this paper are ongoing work and still under testing. An overview of existing SWS approaches, matchmaking algorithms or evaluation measures is outside the scope of this paper. Within SEALS, we propose an approach that is informed by, and improves upon, existing SWS tool evaluation initiatives (Section 2.1). In this sense, our approach shares the goals and objectives of these initiatives.
We describe the design of evaluations (Section 3), considering existing test suites (Section 2.2) as well as the repository management and evaluation measure services (Section 4) that will enable evaluation campaign organizers and participants to evaluate SWS tools. Currently, we focus on the SWS discovery activity, which consists of finding Web Services based on their semantic descriptions. Tools for SWS discovery or matchmaking can be evaluated on retrieval performance: for a given goal, i.e. a semantic description of a service request, and a given set of service descriptions, i.e. semantic descriptions of service offers, the tool returns the match degree between the goal and each service, and the platform measures the rate of matching correctness based on a number of metrics.

The evaluation of SWS tools uses metadata, described via ontologies, about the evaluation scenario, tools, test data and results stored in repositories. The evaluation scenario metadata states which test suites and tools participate in a specific evaluation event and provides the evaluation workflow. The test data metadata (Section 3.1) states how the test data is structured for consumption. More specifically, the metadata for SWS discovery test suites describes the set of service descriptions, the list of goals and the reference sets (experts' relevance values between a goal and a service). In addition, the evaluation of SWS tools produces two types of results: raw results, which are generated by running a tool with specific test data; and interpretations, which are the results obtained after the evaluation measures are applied over the raw results. The format of the results (Section 3.1) is also described via ontologies. In Section 5 we describe the interface of the SWS plugin API, which must be implemented by participating tools. In Section 6 we describe the evaluation workflow as well as examples of tools and available test suites for SWS discovery. The workflow accesses the SEALS repositories and services. In the first phase of the project this workflow will be available to the evaluation campaign organizers only and will be executed over tools registered in a specific SWS evaluation campaign scenario. Finally, in Section 7 we describe our conclusions and future work.

2 Related Work

Semantic Web Service (SWS) technologies enable the automation of discovery, selection, composition, mediation and execution of Web Services by means of semantic descriptions of their interfaces, capabilities and non-functional properties. SWS build on Web service standards such as WSDL, SOAP and REST (HTTP), and as such provide a layer of semantics for service interoperability. Current results of SWS research and industry efforts include a number of reference service ontologies (e.g. OWL-S, WSMO, WSMO-Lite) and semantic annotation extension mechanisms (e.g. SAWSDL, SA-REST, MicroWSMO).

The work performed in SEALS regarding SWS tools will be based upon the Semantic Web Service standardization effort that is currently ongoing within the OASIS Semantic Execution Environment Technical Committee (SEE-TC, www.oasis-open.org/committees/ex-semantics). A Semantic Execution Environment (SEE) is made up of a collection of components that are at the core of a Semantic Service-Oriented Architecture (SOA). These components provide the means for automating many of the activities associated with the use of Web Services, and thus form the basis for creating the SWS plugin APIs and services for SWS tool evaluation.
2.1 Existing SWS Evaluation Initiatives

In the following we provide information, extracted from the respective websites, about three current SWS evaluation initiatives: the SWS Challenge, the S3 Contest, and the WS Challenge (WSC).

The SWS Challenge (SWSC, http://sws-challenge.org) aims at providing a forum for discussion of SWS approaches based on a common application base. The approach is to provide a set of problems that participants solve in a series of workshops. In each workshop, participants self-select which scenario (e.g. discovery, mediation or invocation) and problems they would like to solve. Solutions to the scenarios provided by the participants are manually verified by the Challenge organising committee. The evaluation is based on the level of effort of the software engineering technique: given that a certain tool can correctly solve a problem scenario, the tool is certified on the basis of being able to solve different levels of the problem space. In each level, different inputs are given that require a change in the provided semantics. A report on the methodology of the SWSC has been published by the W3C SWS Testbed Incubator (http://www.w3.org/2005/Incubator/swsc/XGR-SWSC-20080331). One of the important goals of the SWSC is to develop a common understanding of the various technologies evaluated in the workshops. So far, the approaches range from conventional programming techniques with purely implicit semantics, to software engineering techniques for modelling the domain in order to more easily develop applications, to partial use of restricted logics, to full semantic annotation of the web services.

The Semantic Service Selection (S3) contest (http://www-ags.dfki.uni-sb.de/~klusch/s3/index.html) addresses the retrieval performance evaluation of matchmakers for Semantic Web Services. S3 is a virtual and independent contest that has run annually since 2007. It provides the means and a forum for the joint and comparative evaluation of publicly available Semantic Web Service matchmakers over given public test collections. S3 features three tracks: OWL-S matchmaker evaluation (over OWLS-TC); SAWSDL matchmaker evaluation (over SAWSDL-TC); and cross evaluation (using the JGD collection, http://fusion.cs.uni-jena.de/professur/jgd). Participation in the S3 contest consists of: a) implementing the SME2 plug-in API (http://www.semwebcentral.org/projects/sme2/) for the participant's matchmaker, together with an XML file specifying additional information about the matchmaker; and b) using the SME2 evaluation tool for testing the retrieval performance of the participant's matchmaker over a given test collection. This tool has a number of metrics available and provides comparison results in graphical format. The results are presented and openly discussed with the participants by a member of the organisational board at an event such as the SMR2 workshop (Service Matchmaking and Resource Retrieval in the Semantic Web).

The Web Service Challenge (WSC, http://ws-challenge.georgetown.edu/wsc09/technical_details.html) has run annually since 2005 and provides a platform that allows researchers in the area of web service composition to compare their systems and exchange experiences. Starting from the 2008 competition, the data formats and the contest data are based on OWL for ontologies, WSDL for services, and WSBPEL for service orchestrations. In 2009, services were also annotated with non-functional properties: the Quality of Service of a Web Service is expressed by values for its response time and throughput. The WSC awards the most efficient system and also the best architectural solution.
The contestants should find the composition with the lowest response time and the highest possible throughput. WSC uses the OWL format, but semantic evaluation is strictly limited to taxonomies consisting only of sub- and super-class relationships between semantic concepts. Semantic individuals are used to annotate input and output parameters of services. Four challenge sets are provided, and each composition system can achieve between 0 and 18 points per challenge set. Three challenge sets will have at least one feasible solution and one challenge set will have no solution at all.

2.2 Existing SWS Test Collections

The OWL-S Test Collection (OWLS-TC, http://projects.semwebcentral.org/projects/owls-tc/) is intended to be used for the evaluation of OWL-S matchmaking algorithms. OWLS-TC is used worldwide (it is among the top-10 download favourites of semwebcentral.org) and is the de facto standard test collection so far. It was initially developed at DFKI, Germany, but later corrected and extended with the contribution of many people from a number of other institutions (including, e.g., the universities of Jena, Stanford and Shanghai, and FORTH). The OWLS-TC4 version consists of 1083 semantic web services described with OWL-S 1.1, covering nine application domains (education, medical care, food, travel, communication, economy, weapons, geography and simulation). OWLS-TC4 provides 42 test queries associated with binary as well as graded relevance sets. The relevance sets were created with SWSRAT (Semantic Web Service Relevance Assessment Tool), developed at DFKI. The graded relevance is based on a scale of four values: highly relevant (value 3); relevant (value 2); potentially relevant (value 1); and non-relevant (value 0). 160 services and 18 queries contain preconditions and/or effects as part of their service descriptions.

The SAWSDL Test Collection (SAWSDL-TC) is a counterpart of OWLS-TC, that is, it has been semi-automatically derived from OWLS-TC. SAWSDL-TC is intended to support the evaluation of the performance of SAWSDL service matchmaking algorithms. The SAWSDL-TC3 version provides 1080 semantic Web services written in SAWSDL (for WSDL 1.1) and 42 test queries with associated relevance sets. Model references point exclusively to concepts described in OWL 2 DL.

The Jena Geography Dataset (JGD, http://fusion.cs.uni-jena.de/professur/jgd) is a test collection of about 200 geography services that have been gathered from web sites such as seekda.com, xmethods.com, webservicelist.com, programmableweb.com, and geonames.org. JGD is available via the OPOSSum Portal (http://fusion.cs.uni-jena.de/OPOSSum/). The services are described using natural language. In addition, the input and output parameter types have been manually linked to WordNet sense keys. The portal can also store ontologies to which service descriptions can refer. It is worth noting that the JGD collection has been used to support evaluations across formalisms. Semantic descriptions (including OWL-S and SAWSDL) for subsets of JGD have been created for evaluations within the context of the JGD cross-evaluation track at the S3 Contest.
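For illustration, the graded relevance scale used by OWLS-TC (and inherited by SAWSDL-TC) could be represented in code as follows. This is only a sketch: the enum and the binarisation rule (any grade greater than zero counts as relevant) are our own assumptions for readability, not part of the test collections or the SEALS APIs.

```java
// Possible in-memory representation of the OWLS-TC graded relevance scale.
// The numeric values mirror the grades listed above; the isBinaryRelevant()
// rule is an illustrative assumption, not a definition from the collections.
public enum RelevanceGrade {
    NON_RELEVANT(0), POTENTIALLY_RELEVANT(1), RELEVANT(2), HIGHLY_RELEVANT(3);

    private final int value;

    RelevanceGrade(int value) { this.value = value; }

    public int value() { return value; }

    // Assumed binarisation: any positive grade counts as relevant.
    public boolean isBinaryRelevant() { return value > 0; }
}
```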
3 SWS Tools Evaluation Design

Following the SEALS infrastructure, SWS evaluation descriptions, test data, tools and evaluation results (including metadata) will be stored in the respective repositories and used by the SEALS platform. The SEALS platform will basically run evaluations registered in the Evaluation Descriptions Repository. An evaluation description refers to the test data and tools participating in a specific evaluation scenario and provides the evaluation workflow for this scenario. Evaluation measures will be available as services, which can be used within evaluation workflows. The SEALS platform will also execute the SWS tool plugin, which must be implemented by tool providers (the evaluation campaign participants).

We can summarize the goals of SWS tool evaluation as follows:
- Provide a platform for the joint and comparative evaluation of publicly available Semantic Web Service tools over public test collections.
- Provide a forum for discussion of SWS approaches.
- Provide a common understanding of the various SWS technologies.
- Award tools as a result of solving common problems.
- Improve programmer productivity and system effectiveness by making semantics declarative and machine-readable.

We are interested in evaluating performance and scalability as well as the correctness of solutions to application problems. We comment below on how we consider several evaluation criteria for SWS tools.
- Performance - This is specific to the type of SWS activity. For retrieval performance in discovery activities (i.e. service matchmaking), measures such as precision and recall are usually used. More generic performance measures are execution time and throughput.
- Scalability - The scalability of SWS tools is associated with the ability to perform an activity (e.g. discovery) involving an increasing number of service descriptions. This can be measured together with performance (above); however, it is also related to the scalability of the repositories.
- Correctness - This is related to the ability of a tool to respond correctly to different inputs or to changes in the application problem by changing the semantic descriptions. This criterion is related to the mediation and invocation of SWS. Messages resulting from the invocation or interaction of services should be checked against a reference set.
- Conformance - We are not concerned with measuring the conformance of a tool to a predefined standard. Instead, we will use a reference SWS architecture in order to define a SWS plugin API and a measurement API.
- Interoperability - As we are interested in evaluating SWS usage activities rather than the interchange of SWS descriptions, we are not concerned with measuring interoperability between tools.
- Usability - Although it might be useful to know which SWS tools have an easy-to-use user interface or development environment, we consider that, given the small number of front-ends for SWS development at this point in time, such a comparison would be more easily done using feedback forms. Therefore, we will not be concerned with measuring the usability of SWS tools.

3.1 Metadata

In SEALS, artifacts such as test suites and results are stored in the SEALS repositories. Every artifact is described using metadata in RDF/OWL. Generic metadata such as artifact version, name and description are associated with every artifact; however, the metadata can be specialized for different types of tools. In particular, in this section we describe the ontologies used to represent test suites and results for SWS discovery (see also [2]). The ontologies will be made publicly available at http://www.seals-project.eu/ontologies/.
The terminology used to describe a SWS discovery test data suite is provided by the DiscoveryTestSuite ontology, represented graphically in Figure 1. The DiscoveryTestSuite ontology extends the generic TestSuite ontology defined in [3]. The main class DiscoveryTestSuite represents a discovery test suite and is a subclass of the TestSuite class. DiscoveryTestSuite can be described by properties of TestSuite, such as hasTestSuiteVersion, modelled here using the DiscoveryTestSuiteVersion class. The DiscoveryTest class represents a discovery test and is a subclass of the Test class. The property belongsToDiscoveryTSV indicates the discovery test suite version to which the discovery test belongs. The MatchTest class provides the goals and services for a match test using the properties usesGoalDocument and usesServiceDocument. A discovery test includes a set of match tests. The property belongsToDiscoveryTest indicates the discovery test to which the match test belongs. The ServiceDocument class represents a service description document, described by the properties hasServiceName (the name of the service), hasRepresentationLanguage (the language in which the service description is represented) and isLocatedAt (the URI of the document). The GoalDocument class represents a goal document, which can be described by similar properties.

[Figure 1: Graphical representation of the DiscoveryTestSuite ontology, showing the classes DiscoveryTestSuite, DiscoveryTestSuiteVersion, DiscoveryTest, MatchTest, GoalDocument and ServiceDocument (specializing tso:TestSuite, tso:TestSuiteVersion and tso:Test) and the properties relating them.]

To describe a reference test suite we defined the DiscoveryReferenceTestSuite ontology (not shown here), which corresponds to and extends the ontology in Figure 1 by adding the property hasRelevanceValue to the ServiceDocument class.

In SEALS, the evaluation of SWS tools produces two types of results: raw results, which are generated by running a tool with specific test data; and interpretations, which are the results obtained after the evaluation measures are applied over the raw results. In particular, for the evaluation of SWS discovery, raw results are represented according to the DiscoveryResults ontology, as shown in Figure 2. A discovery result contains the data produced by checking which services match a goal. The main class DiscoveryResult is a subclass of TestRawResult; together they specify to which discovery test suite and tool the match result belongs and, in addition, indicate whether any problems occurred. The DiscoveryResultData class is a subclass of TestRawResultData and represents the match result data, described by the properties hasGoalDescriptionURI, hasServiceDescriptionURI, hasMatchDegree (the match degree, e.g. None, Plugin, Exact, Subsumption) and hasConfidence.

[Figure 2: Graphical representation of the DiscoveryResults ontology]
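To make the structure of this metadata concrete, the sketch below builds a single MatchTest instance following the class and property names of Figure 1. It is an illustration only: the class and property names come from the ontology described above, but the namespace URI, the instance identifiers, the document URIs and the use of the Jena RDF library are our own assumptions; the metadata actually published by SEALS may be organised differently.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

// Builds one MatchTest instance of the DiscoveryTestSuite ontology (illustrative).
// Older Jena releases use the com.hp.hpl.jena package prefix instead of org.apache.jena.
public class DiscoveryTestSuiteMetadataExample {

    public static void main(String[] args) {
        // Namespace assumed for illustration; the published ontology may use another one.
        String ns = "http://www.seals-project.eu/ontologies/DiscoveryTestSuite.owl#";
        Model model = ModelFactory.createDefaultModel();

        Property usesGoalDocument = model.createProperty(ns, "usesGoalDocument");
        Property usesServiceDocument = model.createProperty(ns, "usesServiceDocument");
        Property isLocatedAt = model.createProperty(ns, "isLocatedAt");
        Property hasServiceName = model.createProperty(ns, "hasServiceName");
        Property hasRepresentationLanguage = model.createProperty(ns, "hasRepresentationLanguage");

        // A goal document and a service document (URIs are invented examples).
        Resource goal = model.createResource(ns + "goal_BookPriceSearch")
                .addProperty(RDF.type, model.createResource(ns + "GoalDocument"))
                .addProperty(isLocatedAt, "http://example.org/goals/book_price_service.owls");
        Resource service = model.createResource(ns + "service_BookShop")
                .addProperty(RDF.type, model.createResource(ns + "ServiceDocument"))
                .addProperty(hasServiceName, "BookShopService")
                .addProperty(hasRepresentationLanguage, "OWL-S 1.1")
                .addProperty(isLocatedAt, "http://example.org/services/bookshop_service.owls");

        // One match test pairing the goal with the service description.
        model.createResource(ns + "matchTest_1")
                .addProperty(RDF.type, model.createResource(ns + "MatchTest"))
                .addProperty(usesGoalDocument, goal)
                .addProperty(usesServiceDocument, service);

        model.write(System.out, "RDF/XML-ABBREV");
    }
}
```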
Interpretations, which are produced by interpreting the raw results of Semantic Web Service discovery tools, are represented according to the DiscoveryInterpretation ontology, as shown in Figure 3.

[Figure 3: Graphical representation of the DiscoveryInterpretation ontology]

This ontology has a format very similar to that of the DiscoveryResults ontology. The important difference is that the DiscoveryInterpretation class refers to the DiscoveryResult from which the interpretation results were generated. The DiscoveryInterpretationData class represents the discovery measurements for a given goal and set of service descriptions. Currently, it is described by the properties hasPrecisionValue and hasRecallValue, which are decimal values representing the results of the respective measurements.

4 SWS Evaluation Services

In this section we describe some of the services available from the SEALS platform for the evaluation of SWS tools. Currently, these services have been implemented as Java APIs, but in the future they will be available as Web Services. These services will be made publicly available at http://www.seals-project.eu/services/.

First, in SEALS we use repositories to store and retrieve the test data, tools and results of an evaluation, namely the Test Data Repository, the Tools Repository and the Results Repository. Dedicated services called repository managers handle the interaction with the repositories and process the metadata and data defined for SWS tool evaluation. More generic services (e.g. retrieveTestDataSet, registerRawResult, registerInterpretation) are used to access or store files (RDF or ZIP files) using REST clients, and more specific services (e.g. extractGoals, extractServiceDescriptions) are used to extract metadata content.

Second, we have developed a number of services in order to compute measurements for SWS discovery. Evaluation measures for SWS discovery will in general follow the same principles and techniques as the more established Information Retrieval (IR) evaluation research area. Therefore we will use some common terminology and refer to common measures (for a reference see [4]). In the Java API, DiscoveryMeasurement is the main class, which returns metric results for a given discovery result and reference set corresponding to the same goal. This class also returns overall measures for a list of goals. The class DiscoveryResult is part of the SWS plugin API (Section 5). The class DiscoveryReferenceSet contains the list of service judgments (class DiscoveryJudgement) for a specific goal, each of which includes the service description URI and the relevance value. The relevance value is measured against the match degree returned by a tool. The class MetricsResult contains a list of computed measure values, such as precision and recall, as well as some intermediate results, such as the number of relevant services returned for a goal.
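As an illustration of how precision and recall can be derived from a raw result and a reference set, consider the following simplified computation for a single goal. It is not the DiscoveryMeasurement implementation; in particular, the method signature and the choice to count every match degree other than 'NONE' as retrieved, and every reference relevance value greater than zero as relevant, are assumptions made for this sketch.

```java
import java.util.Map;

// Simplified precision/recall computation for one goal (illustrative only).
public class SimpleRetrievalMeasures {

    /**
     * @param toolDegrees     service URI -> match degree returned by the tool
     * @param referenceValues service URI -> relevance value from the reference set
     * @return array {precision, recall}
     */
    public static double[] precisionAndRecall(Map<String, String> toolDegrees,
                                              Map<String, Integer> referenceValues) {
        int retrieved = 0, relevant = 0, relevantRetrieved = 0;

        // A service counts as "retrieved" if the tool returned a degree other than 'NONE'.
        for (Map.Entry<String, String> match : toolDegrees.entrySet()) {
            if (!"NONE".equals(match.getValue())) {
                retrieved++;
                Integer value = referenceValues.get(match.getKey());
                if (value != null && value > 0) {
                    relevantRetrieved++;
                }
            }
        }

        // A service counts as "relevant" if its reference relevance value is positive (assumed rule).
        for (Integer value : referenceValues.values()) {
            if (value > 0) {
                relevant++;
            }
        }

        double precision = retrieved == 0 ? 0.0 : (double) relevantRetrieved / retrieved;
        double recall = relevant == 0 ? 0.0 : (double) relevantRetrieved / relevant;
        return new double[] { precision, recall };
    }
}
```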
5 SWS Plugin API

In SEALS we provide the SWS plugin API, which must be implemented by the tool providers participating in the SWS tool evaluation. As mentioned in Section 2, the SWS plugin API (available from the campaign website) has been derived from the SEE API (see also [1], [2]) and works as a wrapper for SWS tools, providing a common interface for evaluation. The Discovery interface has three methods (init(), loadServices(), discover()) and defines a class for returning discovery results. The methods are called in different steps of the evaluation workflow (Section 6.1). The method init() is called once after the tool is deployed, so that the tool can be initialized. The method loadServices() is called once for every dataset during the evaluation loop, so that the list of services given as arguments can be loaded. The method discover() is called once for every goal in the dataset during the evaluation loop, so that the tool can find the set of services that match the goal given as argument. The return type is defined by the class DiscoveryResult. The class DiscoveryResult contains the goal and the list of service matches (class Match). The class Match contains the service description URI, the order (rank), the match degree ('NONE', 'EXACT', 'PLUGIN', 'SUBSUMPTION') and the confidence value. It is expected that services that do not match are returned with match degree 'NONE' (assumed value 0). The assumed value for 'EXACT' is 1.0, and for 'PLUGIN' and 'SUBSUMPTION' it is 0.25.
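To give a feel for what implementing this interface involves, the following minimal sketch fills in the three methods for a hypothetical matchmaker. The class names DiscoveryResult and Match follow the description above, but their fields, the method signatures and the stand-in result classes are assumptions made for readability; the actual interface is the one released with the plugin API.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of a discovery plugin for a hypothetical matchmaker (illustrative only).
public class MyMatchmakerPlugin {

    private List<String> serviceDescriptions = new ArrayList<String>();

    // Called once after the tool is deployed, so that the tool can initialize itself.
    public void init() {
        // e.g. load domain ontologies, start the reasoner
    }

    // Called once per dataset: receives the service descriptions to be loaded.
    public void loadServices(List<String> serviceUris) {
        this.serviceDescriptions = serviceUris;
        // e.g. parse and index the semantic service descriptions
    }

    // Called once per goal: returns the degree of match between the goal and each service.
    public DiscoveryResult discover(String goalUri) {
        DiscoveryResult result = new DiscoveryResult(goalUri);
        int rank = 1;
        for (String serviceUri : serviceDescriptions) {
            String degree = matchmake(goalUri, serviceUri);
            double confidence = "NONE".equals(degree) ? 0.0 : 1.0; // tool-specific
            result.addMatch(new Match(serviceUri, rank++, degree, confidence));
        }
        return result;
    }

    // Placeholder for the tool's own matchmaking algorithm; must return
    // 'NONE', 'EXACT', 'PLUGIN' or 'SUBSUMPTION'.
    private String matchmake(String goalUri, String serviceUri) {
        return "NONE";
    }

    // Simplified stand-ins for the result classes shipped with the plugin API.
    public static class Match {
        final String serviceUri; final int rank; final String degree; final double confidence;
        Match(String serviceUri, int rank, String degree, double confidence) {
            this.serviceUri = serviceUri; this.rank = rank;
            this.degree = degree; this.confidence = confidence;
        }
    }

    public static class DiscoveryResult {
        final String goalUri;
        final List<Match> matches = new ArrayList<Match>();
        DiscoveryResult(String goalUri) { this.goalUri = goalUri; }
        void addMatch(Match match) { matches.add(match); }
    }
}
```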
6 SWS Discovery Evaluation Scenario

In the first SEALS evaluation campaign we will run the SWS discovery evaluation scenario (http://www.seals-project.eu/seals-evaluation-campaigns/semantic-web-services). Basically, a participant will register a tool via the Web interface provided on the SEALS website (http://www.seals-project.eu/registertool) and will then be able to upload and edit the tool as part of an evaluation campaign scenario. Participants are also required to implement the SWS tool plugin API as presented in Section 5. The organizers will make instructions about the scenario available to the participants and will perform the evaluation automatically by executing the workflow provided in the next section. The results will be available in the Results Repository.

6.1 SWS Discovery Evaluation Workflow

In this section we describe the evaluation workflow for SWS discovery. The workflow accesses the SEALS repositories in order to obtain the appropriate artifacts, as well as the available services for testing tools and applying measures. The artifacts retrieved from the repositories can also be metadata from which we extract the appropriate information. The overall steps in the workflow are: find the evaluation description of a specific campaign scenario; extract the information about the tools and datasets from this description; then, in a loop, execute each tool with the provided dataset; compute metrics (e.g. precision and recall) based on the provided reference set and the raw results obtained; and finally store both raw results and interpretations.

In the following we describe the service operations in the workflow in more detail. A high-level fragment of the actual Java implementation can be found in [2]. The retrieveEvaluationDescription operation accesses the Evaluation Repository and retrieves the discovery evaluation description corresponding to a given SWS discovery evaluation campaign scenario. The extractTools operation extracts, from the evaluation description metadata, the list of tools (IDs) to be evaluated. We iterate over this list of tools, first checking whether the tool is deployed. The extractTestDatasets operation extracts, from the evaluation description metadata, the list of datasets (IDs) to be used in the evaluation. We iterate over this list of URIs, first retrieving each dataset from the test data repository in the retrieveTestDataSet operation. The extractServices operation extracts, from the retrieved dataset, the list of service descriptions (URIs) to be used in the evaluation, and the extractGoals operation extracts the list of goals (URIs). The runTool operation runs the current tool with the current goal and the services from the retrieved dataset. This operation invokes the operations loadServices and discover from the SWS plugin implemented by the tool. The content of DiscoveryResult is serialized into the raw result for the current goal. This raw result is added to the list of raw results for the current dataset of the current tool in the addItemToRawResult operation. The extractReferenceSet operation extracts, from the retrieved dataset, the reference set to be used in the evaluation. The computeMeasurements operation computes all measurements (e.g. precision, recall) for the current goal using the current raw result and reference set. The content of MetricsResult is serialized into the interpretation for the current raw result. This interpretation is added to the list of interpretations for the current dataset of the current tool in the addItemToInterpretation operation. The registerRawResult operation registers the accumulated list of raw results for the current dataset and tool into the results repository, and the registerInterpretation operation registers the accumulated list of interpretations for the current dataset and tool into the results repository.

6.2 Tools and Test Datasets

In Table 1 we list a number of tools that are publicly available and are candidates for evaluation using the SEALS platform under a SWS discovery evaluation campaign.

Table 1: Candidate tools for the SEALS SWS Discovery evaluation campaign
Tool name | Provider | Webpage
OWLS-MX (with variants) | DFKI, Germany | http://projects.semwebcentral.org/projects/owls-mx/
SAWSDL-MX (with variants) | DFKI, Germany | http://projects.semwebcentral.org/projects/sawsdl-mx/
Glue2 | CEFRIEL, Italy | http://sourceforge.net/projects/glue2
IRS-III (Discovery) | KMI, The Open University, UK | http://kmi.open.ac.uk/technologies/irs
WSMX (Discovery) | STI Innsbruck, Austria | http://www.wsmx.org/

We have registered two test suites in the Test Data Repository (http://seals.sti2.at/tdrs-web/). The latest collections corresponding to OWLS-TC and SAWSDL-TC are accessible at http://seals.sti2.at/tdrs-web/testdata/persistent/OWLS-TC/4.0 and http://seals.sti2.at/tdrs-web/testdata/persistent/SAWSDL-TC/3.0, respectively. Depending on the value of the HTTP Accept header for the two URLs above, either the metadata or the data is retrieved. To retrieve the metadata of the test suite version, set the Accept header to "application/rdf+xml"; to retrieve the actual data as a ZIP file, set it to "application/zip".
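For example, both representations of a test suite can be fetched with a plain HTTP client by setting the Accept header accordingly. The sketch below uses java.net.HttpURLConnection; only the repository URL and the two content types quoted above are taken from the description of the Test Data Repository, while the output file names are arbitrary.

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Downloads the metadata (RDF/XML) and the data (ZIP) of a registered test suite version.
public class TestSuiteDownloadExample {

    public static void main(String[] args) throws Exception {
        String suiteUrl = "http://seals.sti2.at/tdrs-web/testdata/persistent/OWLS-TC/4.0";

        // Retrieve the RDF/XML metadata of the test suite version.
        download(suiteUrl, "application/rdf+xml", "owls-tc-4.0-metadata.rdf");

        // Retrieve the actual test data as a ZIP archive.
        download(suiteUrl, "application/zip", "owls-tc-4.0-data.zip");
    }

    private static void download(String url, String acceptType, String targetFile) throws Exception {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestProperty("Accept", acceptType);
        InputStream in = connection.getInputStream();
        OutputStream out = new FileOutputStream(targetFile);
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        out.close();
        in.close();
        connection.disconnect();
    }
}
```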
7 Conclusions

In this paper we have described the ongoing approach and services for SWS tool evaluation using the SEALS platform. We have described the implementation of the SWS Discovery Evaluation Workflow. As part of the workflow, we have implemented services (Java code) for accessing the test data and result repositories, as well as services (a Java API) for computing measures for SWS discovery evaluation. We have created metadata definitions (RDF/OWL) for test data and results (raw results and interpretations) and the corresponding services (which generate and add ontology instances to the repositories). The SWS plugin API is currently very similar to SME2's matchmaker plug-in in what concerns discovery (matchmaking); however, the former will be extended in order to account for other activities such as composition, mediation and invocation. There are also many similarities in purpose between our approach and the existing initiatives, in that they all intend to provide evaluation services and promote discussion on SWS technologies within the SWS community. The main difference is that SEALS is investigating the creation of common and shareable metadata specifications for test data, tools and evaluation descriptions, as well as the respective public repositories.

The work presented in this paper will be used during the SWS Tools Evaluation Campaign 2010 (http://www.seals-project.eu/seals-evaluation-campaigns/semantic-web-services). This campaign will run the SEALS Semantic Web Service Discovery Evaluation scenario, and instructions for participants will be made available. In addition, we will release the SWS plugin API, which must be implemented by the participating tools. Currently, the latest versions of the existing test collections OWLS-TC and SAWSDL-TC have been stored in the dataset repository. For future campaigns we plan to release datasets described using other languages, such as WSMO-Lite, as well as datasets for other problem scenarios, such as the ones in the SWS Challenge. For the first campaign, the uploading of participant tools will be manual and there will be no access to the evaluation repository. The SWS evaluation workflow mentioned before will be performed by the organizers over the participating tools. Future work includes developing the APIs as Web Services and implementing the evaluation workflow as a BPEL process.

Acknowledgments. This work has been partially funded by the European Commission under the SEALS project (FP7-238975).

References
1. Cabral, L., Kerrigan, M., Norton, B.: D14.1 Evaluation Design and Collection of Test Data for Semantic Web Service Tools. Technical report, SEALS Project, March 2010.
2. Cabral, L., Toma, I., Marte, A.: D14.2 Services for the Automatic Evaluation of Semantic Web Service Tools v1. Technical report, SEALS Project, July 2010.
3. Garcia-Castro, R., Esteban-Gutierrez, M., Nixon, L., Kerrigan, M., Grimm, S.: D4.2 SEALS Metadata. Technical report, SEALS Project, February 2010.
4. Küster, U., Koenig-Ries, B.: Measures for Benchmarking Semantic Web Service Matchmaking Correctness. In: Proceedings of ESWC 2010, LNCS 6089, June 2010.