<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Supporting Consumers in Providing Meaningful Multi-Criteria Judgments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Friederike Klan</string-name>
          <email>friederike.klan@uni-jena.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Birgitta König-Ries</string-name>
          <email>birgitta.koenig-ries@uni-jena.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computer Science, Friedrich-Schiller-University of Jena</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The huge amount of products and services that are available online makes it difficult for consumers to identify offers which are of interest to them. Semantic retrieval techniques for Web Services address this issue, but make the unrealistic assumptions that offer descriptions describe a service's capabilities correctly and that service requests reflect a consumer's actual requirements. As a consequence, they might produce inaccurate results. Alternative retrieval techniques such as collaborative filtering (CF) mitigate those problems, but do not perform well in situations where consumer feedback is scarce. As a solution, we propose to combine both techniques. However, we argue that the multi-faceted nature of Web Services imposes special requirements on the underlying feedback mechanism that are only partially met by existing CF solutions. The focus of this paper is on how to elicit consumer feedback that can be effectively used in the context of Web Service retrieval and how to support users in that process. Our main contribution is an algorithm that suggests which service aspects should be judged by a consumer. The approach effectively adjusts to a user's ability and willingness to provide judgments and ensures that the provided feedback is meaningful and appropriate in the context of a certain service interaction.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
      <p>Copyright is held by the author/owner(s). Workshop on the Practical Use of
Recommender Systems, Algorithms and Technologies (PRSAT 2010), held
in conjunction with RecSys 2010. September 30, 2010, Barcelona, Spain.</p>
    </sec>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>
        The huge amount and heterogeneity of information,
products and services that are available online makes it difficult
for consumers to identify offers which are of interest to them.
Hence, new techniques that support users in the product
search and selection process are required. In the past decade,
semantic technologies have been developed and leveraged to
approach this issue [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. They provide information with a
well-defined and machine-comprehensible meaning and thus
enable computers to support people in identifying relevant
content. This idea is not restricted to information, but also
applies to functionality provided via the web as services.
Semantic Web Services (SWS) provide a specific
functionality, semantically described in a machine-processable way,
over a well-defined interface. Similarly, service requesters
may semantically express their service requirements.
Having both a semantic description of a consumer's needs as
well as the published semantic descriptions of available Web
Services, suitable service offers can be automatically
discovered by comparing (matching) the given service request with
available offer descriptions. Services might then be automatically
configured, composed and finally invoked over the web.
      <p>Existing semantic matchmaking and service selection
approaches evaluate the suitability of available service offers
exclusively by comparing the published offer descriptions
with a given request description. They implicitly assume
that offer descriptions describe a service's capabilities
correctly and that service requests reflect a consumer's actual
requirements. The first assumption might have been valid in
a market with a small number of well-known and accredited
companies. However, it is no longer true in today's market,
where easy and cheap access to the Internet and the
emergence of online marketplaces that offer easy-to-set-up
online storefronts enable virtually everyone to run his own
online shop accessible to millions of buyers. The situation
becomes even more critical since, due to the huge number of
offers, fierce competition and price wars have arisen
that might cause some providers to promise more than they
are able to deliver. To our mind, the assumption that
service requests reflect a consumer's actual requirements is also
not realistic. This is due to the fact that, though SWS
approaches provide adequate means to semantically describe
service needs, they require the user to do this at a formal,
logic-based level that is not appropriate for the average
service consumer in an e-commerce setting. As a result, SWS
applications typically provide request templates for common
service needs. Those templates are then adjusted to fit a
consumer's requirements in a certain purchasing situation.
Though the resulting service requests might be a good
estimate of a consumer's service needs, they cannot exactly meet
his true requirements. As a consequence, service discovery
mechanisms that are purely based on the comparison of
semantic request and offer descriptions might produce
inaccurate results and thus lead to suboptimal service selection
decisions.</p>
      <p>
        To mitigate those problems, alternative retrieval techniques
such as collaborative filtering [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] have been developed. Those
techniques do not rely on explicit models of consumer
requirements and product properties. They evaluate product
ratings of neighboring users, i.e., those that have a similar
taste, to recommend products or services that might be of
interest to a potential consumer. Though collaborative
filtering approaches are very effective in many domains, they
lack the powerful knowledge representation and
matchmaking capabilities provided by SWS and thus do not perform well
in situations where feedback is scarce [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. As a solution, we
propose to combine both techniques. More specifically, we
suggest performing retrieval based on semantic service
descriptions and then using a collaborative feedback mechanism
to verify and refine those results. We think that such a
hybrid approach can benefit from the best of both worlds
and thus has the potential to significantly improve the
retrieval quality. Combining semantic retrieval with
collaborative feedback mechanisms is not new (see for example
[
        <xref ref-type="bibr" rid="ref11 ref8">8, 11</xref>
        ]). However, we argue that simply re-using existing
techniques, as done in other approaches, will not tap the
full potential of this type of approach. This is due to the
fact that the multi-faceted nature and the peculiarities of
SWS impose special requirements on the underlying
feedback mechanism and in particular on the properties of the
consumer feedback that is required. In this paper, we will
analyze those requirements (Sect. 2) and will show that they
are only partially met by existing collaborative filtering
solutions (Sect. 3). The focus of this paper is on how to elicit
consumer feedback that can be effectively used in the context
of SWS retrieval and how to support users in that process
(Sects. 4 and 5). Our main contribution is an algorithm
that suggests which service aspects should be judged by a
consumer (Sect. 6). The approach accounts for a user's
ability and willingness to provide judgments and ensures that
the provided feedback is meaningful and appropriate in the
context of a certain service interaction. Our evaluation
results show that the proposed procedure effectively adjusts to
a consumer's personal judgment preferences and thus
provides helpful support for the process of feedback elicitation
(Sect. 7). A detailed discussion of how to effectively use
consumer feedback to enhance SWS retrieval is published in
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. REQUIREMENTS</title>
      <p>
        Various collaborative filtering mechanisms that allow
retrieving products or services that are of interest to a
consumer [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] have been proposed. Those mechanisms are very
effective in many domains and seem very promising in
the context of our work. However, we argue that the
multi-faceted nature of SWS imposes special requirements on the
underlying feedback mechanism that are only partially met
by existing CF solutions. In the following, we will specify
those requirements.
      </p>
      <p>Consumer feedback is subjective, since it reflects a
service's suitability as perceived through a certain consumer's
eyes. Hence, feedback is biased by personal expectations and
preferences regarding the invoked service. Moreover, feedback
may refer to different services and to different request
contexts. For example, a ticket booking service might have been
used to buy group tickets for a school class or to buy a
single ticket. However, the suitability of a service might differ
depending on the request context, and hence so does the resulting
feedback. Feedback mechanisms should account
for those facts. To enable effective usage, feedback has to
be meaningful, i.e., the expectations and the context
underlying a judgment should be clear. In addition, it should be
evident whether and how feedback given under one
circumstance can be used to infer a service's suitability in
another situation.</p>
      <p>
        We would also like to emphasize the necessity for feedback
to be as detailed as possible, i.e., comprising judgments
referring to various aspects of a service interaction. This is for
several reasons. Firstly, feedback judging the quality of a
provided service as a whole is of limited significance, since as
an aggregated judgment it provides no more than a rough
estimate of a service's performance. Secondly, aggregated
feedback tends to be inaccurate. This is due to the fact that
humans are bad at integrating information about different
aspects, as they appear in a multi-faceted service
interaction, in particular if those aspects are diverse and
incomparable [
        <xref ref-type="bibr" rid="ref10 ref2">2, 10</xref>
        ]. Finally, it has been shown in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] that using
detailed consumer feedback allows a user's taste to be estimated
more accurately and thus can significantly improve
prediction accuracy. In the context of detailed, i.e., multi-criteria,
consumer feedback, meaningful also means that the
relationship between the different service aspects that might have been
judged is clear and that all relevant aspects characterizing
a certain service interaction have been judged. The latter is
due to the fact that inferred judgments based on incomplete
information might be incorrect.
      </p>
      <p>Another problem we encounter is feedback scarcity. Given
certain service requirements, a certain context and a
particular service, feedback for exactly this set-up is rare and
typically not available at all. Hence, scarce feedback has to
be exploited effectively. In particular, service experiences
related to different but similar contexts and those related
to other but similar services have to be leveraged. However,
unfolding the full potential of consumer feedback, in
particular when using multi-aspect feedback, requires that users
provide useful responses. To ensure this, the feedback
elicitation process should be assisted. In particular, care should
be taken that elicited feedback is comprehensive and
appropriate in the context of a certain service interaction.
In addition, a consumer's willingness to provide feedback
as well as his expertise in the service domain should be
accounted for. This is important, since asking a consumer for
a number of judgments he is not able and/or not willing to
provide will result in no or bad-quality feedback. Finally,
it should also be ensured that all relevant information that
is necessary for effectively exploiting consumer feedback is
recorded. This should happen transparently to the user.</p>
      <p>Since the type of service interactions to be judged and
the kind of users that provide feedback are diverse and not
known in advance, even for a specific area of application, a
hard-wired solution with predefined service aspects to judge
is inappropriate. In fact, the process of feedback elicitation
should be customizable and automatically configurable
at runtime.</p>
    </sec>
    <sec id="sec-4">
      <title>3. RELATED APPROACHES</title>
      <p>
        Aspects such as feedback scarcity and the subjectivity of
consumer feedback are typically addressed in existing
collaborative filtering solutions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Also, dealing with the
context-dependent nature of judgments has been an issue (see e.g.
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]). However, existing solutions only partially address the
question of how to effectively use judgments made in one
context to infer a service's suitability in another
context. Multi-criteria feedback has been an issue in both
academic [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and commercial recommender systems. Typically,
the set of aspects that might be judged by a consumer is
either the same for all product types or specific per product
category. However, in the first case, this set of aspects is
either very generic, i.e., not product-specific, or not
appropriate for all products. In the second case, this set has to
be specified manually for each new product. Moreover, the
single-aspect ratings are typically supplementary in the sense
that they do not have any influence on a product's overall
rating. Alternatively, some reviewing engines such as those
provided by Epinions (http://www.epinions.com) or
PowerReviews (http://www.powerreviews.com) offer more flexible
reviewing facilities based on tagging. Those systems allow
consumers to create tags describing the pros and cons of
a given product. These tags can then be reused by other
users. Tagging provides a very intuitive and flexible
mechanism that allows for product-specific judgments. However,
the high flexibility of the approach comes at the cost of the
judgments' meaningfulness. This is due to the fact that tags do
not have clear semantics. In particular, the relationship
between different tags is unknown, which makes them
incomparable. Moreover, those systems do not ensure that
all relevant aspects of a product or a service interaction are
judged. To summarize our findings, more flexible and
adaptive mechanisms to elicit and describe multi-criteria
feedback are required. In particular, the question of how to
describe this type of feedback meaningfully has hardly been
considered. To the best of our knowledge, the issue of
assisting consumers in providing comprehensive, appropriate and
meaningful feedback has not been addressed at all. Also,
aspects such as a consumer's ability and willingness to provide
judgments for specific aspects have hardly been considered
in existing solutions.
      </p>
    </sec>
    <sec id="sec-5">
      <title>4. SEMANTIC WEB SERVICE RETRIEVAL</title>
      <p>
        As a basis for further discussion, we introduce the
semantic service description language DSD (DIANE Service
Description) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and its mechanisms for automatic semantic
service matchmaking that underlie our approach. Similarly
to other service description approaches, DSD is
ontology-based and describes the functionality a service provides, as
well as the functionality required by a service consumer, by
means of the precondition(s) and the set of possible effect(s)
of a service execution. In the service request depicted in
Fig. 1, the desired effect is that a product is owned after
service execution. A single effect corresponds to a
particular service instance that can be executed. While service
offer descriptions describe the individual service instances
that are offered by a service provider, e.g. the set of mobile
phones offered by a phone seller, service request
descriptions declaratively characterize the set of service instances
that is acceptable for a consumer. In the service request
in Fig. 1, acceptable instances are mobile phones that are
cheaper than $50, are either silver or black, are of bar or
slider style and are from either Nokia or Sony Ericsson.
      </p>
      <p>[Fig. 1: service request tree with the effect that a product is owned, constrained by attributes such as price (Double &lt;= 50, Currency == usd), color, style and phoneType.]</p>
      <p>As
can be seen in the example, DSD utilizes a specific
mechanism to declaratively and hierarchically characterize
(acceptable) sets of service effects: Service effects are described
by means of their attributes, such as price or color. Each
attribute may be constrained by direct conditions on its
values and by conditions on its subattributes. For instance,
the attribute phoneType is constrained by a condition
on its subattribute manufacturer, which indicates that only
mobile phones from Nokia or Sony Ericsson are acceptable.
The direct condition &lt;= 50 on the price amount in Fig. 1
indicates that only prices lower than $50 are acceptable.
Attribute conditions induce a tree-like and increasingly
fine-grained characterization of acceptable service effects. A
DSD request does not only specify which service effects are
acceptable, but also indicates to which degree they are
acceptable. To this end, a preference value from [0, 1] is
specified for each attribute value. The default is 1.0 (totally
acceptable), but alternative values might be specified in the
direct conditions of each attribute. For example, the
preference value for the attribute manufacturer is 1.0 for Nokia
phones and 0.8 for mobile phones from Sony Ericsson.</p>
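      <p>To make this concrete, the following is a minimal, self-contained Python sketch of such a preference-annotated request tree. All names (RequestNode, preference_for) are our own illustration, not part of DSD, and direct conditions are simplified to a map from acceptable values to preference values.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class RequestNode:
    """One attribute node of a DSD-style request tree (illustrative sketch).

    `conditions` maps an acceptable value to its preference from [0, 1];
    an attribute without conditions is totally acceptable (1.0).
    """
    name: str
    conditions: Dict[str, float] = field(default_factory=dict)
    children: List["RequestNode"] = field(default_factory=list)

    def preference_for(self, value: str) -> float:
        """Preference of a concrete value; 0.0 if no condition accepts it."""
        if not self.conditions:
            return 1.0
        return self.conditions.get(value, 0.0)

# Fragment of the request in Fig. 1: only Nokia (1.0) or
# Sony Ericsson (0.8) phones are acceptable.
manufacturer = RequestNode("manufacturer",
                           {"Nokia": 1.0, "SonyEricsson": 0.8})
phone_type = RequestNode("phoneType",
                         children=[manufacturer, RequestNode("model")])
```

      <p>Here an unconstrained attribute such as model defaults to the preference 1.0, mirroring the default described above.</p>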
      <p>
        As demonstrated in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], DSD service and request
descriptions can be efficiently compared. Given a service request,
the semantic matchmaker outputs an aggregated overall
preference value from [0, 1] for each available service offer
description. This value is called the matching value and indicates how
well a considered service offer fits a consumer's
requirements encoded in the service request. Based on the
matching values, the best-fitting service offer is determined and
invoked.
      </p>
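      <p>The following toy matcher illustrates the idea of aggregating per-attribute preference values into one overall matching value from [0, 1]. It is a sketch under strong simplifications of our own: requests are nested dicts, offers are flat dicts of concrete values, and aggregation is simply the minimum, whereas DSD's actual matchmaking strategies [7] are configurable and far richer.</p>

```python
def matching_value(request: dict, instance: dict) -> float:
    """Toy matching: each request entry maps an attribute name to a pair
    (preferences, subattributes).  The matching value is the minimum of
    the preference values collected over the whole request tree."""
    score = 1.0
    for attr, (prefs, subattrs) in request.items():
        if prefs:  # direct conditions: look up the instance's value
            score = min(score, prefs.get(instance.get(attr), 0.0))
        score = min(score, matching_value(subattrs, instance))
    return score

# Fragment of the request in Fig. 1:
request = {
    "color": ({"silver": 1.0, "black": 1.0}, {}),
    "phoneType": ({}, {
        "manufacturer": ({"Nokia": 1.0, "SonyEricsson": 0.8}, {}),
    }),
}
offer = {"color": "black", "manufacturer": "SonyEricsson"}
```

      <p>For the offer above, the minimum over color (1.0) and manufacturer (0.8) yields a matching value of 0.8; an offer violating any condition drops to 0.0.</p>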
    </sec>
    <sec id="sec-6">
      <title>5. FEEDBACK ELICITATION</title>
      <p>
        In the following, we will analyze what is required to make
detailed consumer feedback meaningful, comprehensive and
appropriate for characterizing a certain service interaction. We
will demonstrate how semantic service descriptions can be
used to elicit feedback that fulfills those requirements. A
detailed discussion of how to effectively use the elicited
consumer feedback to enhance SWS retrieval is out of the scope
of this paper and is published in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>What is required to make consumer feedback
appropriate, comprehensive and meaningful.</p>
      <p>We assume that a service request at least covers all service
aspects that are important to the consumer. Potentially, all
service aspects in a request description might be rated by
a consumer. In order to be able to exploit these ratings,
we need to make sure that they are meaningful (i.e.,
contain the rating context, e.g., which product a rating refers
to) and comprehensive (i.e., cover all relevant aspects; a
quality rating without information on whether the price was
acceptable is not helpful). In addition, we need to know how different
service aspects relate to each other (e.g., how can a rating
on quality be derived from ratings on subaspects such as
usability and battery capacity?). The challenging question
is how to fulfill the identified requirements while still being
flexible in the choice of the aspects to rate.</p>
      <p>Creating appropriate, comprehensive and meaningful
consumer feedback.</p>
      <p>We propose the concept of a feedback structure to deal
with that issue. A feedback structure is a subtree of the
request tree whose leaves correspond to the aspects that
may be rated by the user. Consider the example request
depicted in Fig. 1. The dotted part of the tree indicates
a possible feedback structure for that request, where the
aspects price, battery, style, color and phoneType have to be
rated by the consumer. Note that this structure contains
all the information necessary to effectively utilize the
provided ratings. In particular, it encodes the context of
a rating in terms of the path from the request root to the
rated aspect, the other aspects that were judged and the
hierarchical relationship between the considered aspects.</p>
      <p>
        To ensure that the provided feedback is comprehensive,
the request subtrees rooted at the feedback structure's leaves
should cover all leaves of the request tree. This guarantees that all
service aspects considered in the request description are
either directly or indirectly (by an aggregated
rating) judged by the service consumer. The feedback
structure depicted in Fig. 1 fulfills this requirement and thus is
valid. Omitting, e.g., the aspect phoneType would result in
an invalid structure. Note that we are still flexible in the
choice of the attributes to be rated; e.g., we could allow the
consumer to provide a single rating for productType instead
of asking him to judge battery, style, color and phoneType
separately. The feedback structure together with the
consumer-provided ratings is propagated to other consumers
and might be used to infer knowledge about a service's
suitability for consumers with other service requirements (see
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for details).
      </p>
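      <p>The validity condition can be sketched in a few lines of Python. The representation (a dict mapping each attribute to its child attributes) and all function names are our own illustration.</p>

```python
def leaves(tree: dict, node: str) -> set:
    """All leaf attributes of the request subtree rooted at `node`;
    `tree` maps each attribute to the list of its child attributes."""
    children = tree.get(node, [])
    if not children:
        return {node}
    return set().union(*(leaves(tree, c) for c in children))

def is_valid(tree: dict, root: str, fs_leaves: set) -> bool:
    """A feedback structure (given by its leaf aspects) is valid iff the
    request subtrees rooted at those leaves cover all request leaves."""
    if not fs_leaves:
        return False
    covered = set().union(*(leaves(tree, a) for a in fs_leaves))
    return leaves(tree, root) <= covered

# The request from Fig. 1, reduced to its attribute hierarchy:
tree = {
    "product": ["price", "productType"],
    "productType": ["battery", "style", "color", "phoneType"],
    "phoneType": ["manufacturer", "model"],
}
```

      <p>With this tree, the structure with leaves price, battery, style, color and phoneType is valid; dropping phoneType leaves manufacturer and model uncovered, while judging productType as a single aggregated aspect is valid as well.</p>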
    </sec>
    <sec id="sec-7">
      <title>6. RECOMMENDING WHAT TO JUDGE</title>
      <p>To ensure feedback quality, the feedback elicitation
process should be assisted and should account for a consumer's
judgment preferences, such as his willingness to provide
ratings as well as his expertise in the considered service domain.
However, those judgment preferences might differ from
request to request: e.g., I might be an expert in judging the
quality of personal computers, but I do not know that much
about servers. As a consequence, I am willing and able to judge
the quality of a purchased computer in the case of a PC, but
I am not willing to do that when purchasing a new server
for our working group. This aspect should be considered
during feedback elicitation. To achieve this, we propose the
following solution.</p>
      <p>Assume that, given a certain service request, an
appropriate service was selected and invoked, and now its
suitability has to be judged by the consumer. In a first step,
we utilize the provided service request to determine possible
feedback structures as defined in the previous section.
Subsequently, the structure that is most suitable for the user,
i.e., the one that, in the context of the given request, fits best to the
consumer's personal abilities and judgment preferences, is
selected and presented to the user. The required knowledge
about the user's judgment requirements is learned from
his behavior in previous judgment sessions. The presented
feedback structure represents a careful compromise between
the consumer's competing judgment requirements and might
be adjusted to his actual judgment needs. This can be done
by expanding and/or hiding subtrees of the presented
structure. For example, in the structure depicted in Fig. 1, we
might expand the leaf phoneType to judge its subaspects
manufacturer and model. Finally, the user judges all leaf
attributes of the structure, e.g. by providing a rating. Once
the consumer submits his judgments, the system takes care
of storing all relevant feedback information and session data
for future recommendations. In particular, it is recorded
which and how many service aspects were judged by the
consumer and which service request led to the judgment.
The acquired information is used later on to identify
suitable feedback structures in future judgment sessions.</p>
    </sec>
    <sec id="sec-8">
      <title>6.1 Feedback structure suitability</title>
      <p>Given a consumer's service request, typically many
different feedback structures are possible. But how can we
measure the suitability of each feedback structure in order to
identify the one that fits best to the user's personal abilities and
willingness to provide judgments? We have to consider two
aspects here. Firstly, does the feedback structure comprise leaves
corresponding to attributes that the consumer is able to judge, and
secondly, is the consumer willing to judge all those aspects?</p>
      <p>As a measure of a consumer's willingness and ability to
judge a certain service aspect, we use the frequency with
which the user judged this aspect in the past. We also
consider the request context in which an aspect was judged.
More specifically, we consider how similar the request that
led to the past judgment is to our request. Let r be the
service request that was posed by the consumer. Then the
consumer's willingness and ability to judge service aspect
a is determined by w_a(r) = Σ_{r' ∈ R_a} sim(r', r), where R_a
is the set of past service requests that led to a judgment
of a. The value sim(r', r) indicates how similar the service
requirements encoded in the past request r' are to those in the
current request r. A detailed discussion of how to compute the
semantic similarity of two requests is provided in Sect. 6.3.</p>
      <p>The suitability s_attributes(fs, r) of a given feedback
structure fs is determined by the consumer's willingness and
ability to judge its leaf aspects A_fs. We propose to compute it
as the sum of its leaf attributes' w-values:</p>
      <p>s_attributes(fs, r) = Σ_{i ∈ A_fs} w_i(r) / Σ_{j ∈ A_r} w_j(r)
(1)
The term is normalized by dividing by the sum of the
w_j-values of all attributes j ∈ A_r that are contained in the
given request r. Hence, s_attributes(fs, r) ∈ [0, 1].</p>
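      <p>A direct transcription of w_a(r) and Formula (1) in Python might look as follows. The request similarity sim is assumed to be given (Sect. 6.3), and past judgment sessions are simplified, for illustration, to pairs of (request, set of judged aspects).</p>

```python
def w(aspect, r, past, sim) -> float:
    """w_a(r): similarities of all past requests that led to a judgment
    of `aspect`, summed up.  `past` is a list of (request, judged-aspects)
    pairs; `sim` is the request similarity measure."""
    return sum(sim(r_old, r) for r_old, judged in past if aspect in judged)

def s_attributes(fs_leaves, request_attrs, r, past, sim) -> float:
    """Formula (1): the leaves' summed w-values, normalized by the sum
    over all attributes of the given request."""
    total = sum(w(a, r, past, sim) for a in request_attrs)
    if total == 0.0:
        return 0.0  # no past judgments at all (our convention)
    return sum(w(a, r, past, sim) for a in fs_leaves) / total

# Toy data: two past sessions, all requests equally similar.
past = [("r1", {"price", "color"}), ("r2", {"price"})]
sim = lambda r_old, r: 1.0
```

      <p>With the toy data, price was judged twice and color once, so a structure with the single leaf price obtains the suitability 2/3.</p>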
      <p>To measure a consumer's willingness to judge the k = |A_fs|
leaf aspects A_fs, we compare how similar the past requests
that also led to a judgment of k aspects are to the service
request r posed by the consumer. More specifically, the
suitability s_number(fs, r) of the feedback structure fs with
respect to the number of service aspects that have to be
judged is determined by
s_number(fs, r) = sim(R_k, r),
(2)
where sim(R_k, r) is the mean similarity to r of all past
service requests that led to a judgment of k aspects. In
cases where no previous request led to a number of k
service aspects to be judged, s_number(fs, r) is determined as
the mean of sim(R_k', r) and sim(R_k'', r), where k' is the
largest k' &lt; k for which a past request with k' judgments
exists and k'' is the smallest k'' &gt; k for which a past
request with k'' judgments exists. In case k'/k'' does not
exist, sim(R_k', r)/sim(R_k'', r) is assumed to be 1.0/0.0, i.e.,
by default feedback structures with a low number of service
aspects to be judged are preferred. Assuming that sim(x, y)
is a value from [0, 1], s_number(fs, r) is also from [0, 1]. The
overall suitability s(fs, r) ∈ [0, 1] of a feedback structure fs
in the context of the posed request r is
s(fs, r) = α · s_attributes(fs, r) + β · s_number(fs, r).
(3)
The parameters α and β with α, β ∈ [0, 1] and α + β = 1
determine the influence of the terms s_attributes(fs, r) and
s_number(fs, r), respectively. The values α and β might vary
from user to user. In Sect. 6.4, we will demonstrate how
those values can be learned from a consumer's past judgment
behavior.</p>
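      <p>Formulas (2) and (3), including the fallback for aspect counts never seen before, can be sketched as follows; the defaults 1.0/0.0 for a missing smaller/larger k reproduce the preference for small feedback structures described above. The data layout (pairs of request and judged-aspect set) is our own simplification.</p>

```python
def mean_sim(k, r, past, sim):
    """Mean similarity to r of all past requests that led to exactly k
    judged aspects; None if there are none."""
    sims = [sim(r_old, r) for r_old, judged in past if len(judged) == k]
    return sum(sims) / len(sims) if sims else None

def s_number(k, r, past, sim) -> float:
    """Formula (2), with interpolation between the nearest smaller and
    larger aspect counts when no past request judged exactly k aspects."""
    direct = mean_sim(k, r, past, sim)
    if direct is not None:
        return direct
    ks = {len(judged) for _, judged in past}
    below = max((x for x in ks if x < k), default=None)
    above = min((x for x in ks if x > k), default=None)
    lo = mean_sim(below, r, past, sim) if below is not None else 1.0
    hi = mean_sim(above, r, past, sim) if above is not None else 0.0
    return (lo + hi) / 2.0

def suitability(s_attr, s_num, alpha) -> float:
    """Formula (3) with beta = 1 - alpha."""
    return alpha * s_attr + (1.0 - alpha) * s_num

# Toy data: one session with 1 judged aspect (similarity 0.4 to r),
# one with 3 judged aspects (similarity 0.8 to r).
past = [("a", {"x"}), ("b", {"x", "y", "z"})]
sim = lambda r_old, r: {"a": 0.4, "b": 0.8}[r_old]
```

      <p>With this data, k = 2 is interpolated to (0.4 + 0.8)/2 = 0.6, and k = 5, having no larger neighbour, falls back to (0.8 + 0.0)/2 = 0.4.</p>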
    </sec>
    <sec id="sec-9">
      <title>6.2 Determining possible feedback structures</title>
      <p>For a given request, the number of possible feedback
structures might be high, whereas the number of those that have
the potential to be optimal (with respect to their
suitability s(fs, r) for the user) is low. Hence, we require a way
to determine potentially optimal feedback structures
effectively, i.e., without having to construct all possible
structures. In the following, we propose an algorithm that
performs this task. It constructs potentially optimal feedback
structures recursively and drops non-optimal partial
structures as soon as possible. Fig. 2 shows how the algorithm
works, exemplarily for the service request depicted in Fig. 1.
Each request node is associated with a list of entries, each
corresponding to one of the feedback structures that are
possible for the subtree rooted at that node. Let fs be
one of those structures and let [a, b] be its corresponding
entry. Then a is the number of aspects that have to be
judged in fs and b is s_attributes(fs, r), where r is the
request subtree rooted at the considered node. The algorithm
works as follows. First, it initializes each request node's
list with an entry [1, s_attributes(fs, r)], where fs is the
feedback structure comprising only the node itself and r is
the request subtree rooted at the considered node. For an
example, consider Fig. 2. The initial entry in each list is
highlighted. The number within each node indicates the
value s_attributes(fs, r), which, for the sake of this example,
is arbitrarily chosen. Starting from the request leaves
(highlighted request nodes), the algorithm recursively computes
lists for all parent nodes. Computing a node's list is done in
three steps. First, the cross product C of the child nodes'
entry sets is computed. For example, to determine possible
feedback structures for the product node (Fig. 2), we have
to determine C = {[1, 0.2], [2, 0.1]} × {[1, 0.2], [4, 0.2], [5, 0.2]}, i.e.,
the cross product of the price and productType nodes' entry
lists. Each element c of C gives rise to an entry [a, b] in the
product node's list, i.e., to a possible feedback structure fs of
this node's subtree. Since a is the number of attributes to
judge in fs, it is computed as the sum of the a-values in c.
The suitability b of fs with respect to the selection of
attributes that have to be judged is computed as the sum of its
leaf attributes' b-values (Formula 1), i.e., the sum of the b-values
in c. In a final step, we prune the computed list. This
is done by keeping only a single entry [a, b] for each different
value of a per node, where b = max{x | [a, x] is in the list}.
Note that in doing so, we keep only those feedback
structures that have the potential to be optimal and hence reduce
the length of the node list to at most l, where l is the
number of leaves of the subtree rooted at the considered node.
Finally, we end up with a list for the request root comprising
entries for all possible feedback structures for the request
that have the potential to be optimal. Those structures are
compared with respect to their suitability (Formula 3). The
most suitable one is selected and presented to the user.</p>
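      <p>The three steps (cross product, entry construction, pruning) can be condensed into a short recursive Python sketch. The representation (nodes as (name, children) pairs, per-node w-values supplied by a weight function) is our own, and, as in the paper's example, b-values are simply summed rather than normalized.</p>

```python
from itertools import product

def entries(node, weight):
    """Entry list [(a, b), ...] for the subtree rooted at `node`:
    a = number of aspects to judge, b = summed w-values of the judged
    aspects.  Keeps, per value of a, only the entry with maximal b."""
    name, children = node
    # Option 1: judge this node itself as one aggregated aspect.
    result = [(1, weight(name))]
    if children:
        # Option 2: cross product of the children's entry lists,
        # summing aspect counts and suitabilities component-wise.
        for combo in product(*(entries(c, weight) for c in children)):
            result.append((sum(a for a, _ in combo),
                           sum(b for _, b in combo)))
    # Pruning: one entry per aspect count, with the maximal b.
    best = {}
    for a, b in result:
        best[a] = max(best.get(a, float("-inf")), b)
    return sorted(best.items())

# The price subtree of Fig. 2: judging price itself scores 0.2,
# judging amount and currency separately scores 0.05 + 0.05.
price = ("price", [("amount", []), ("currency", [])])
weights = {"price": 0.2, "amount": 0.05, "currency": 0.05}
```

      <p>For this subtree the sketch yields the entry list {[1, 0.2], [2, 0.1]}, matching the price node's list in the example above.</p>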
    </sec>
    <sec id="sec-10">
      <title>6.3 Request similarity</title>
      <p>As mentioned earlier, a consumer's judgment preferences
depend on the request context, i.e., the kind of service
interaction that has to be judged. To allow for a
comparison of the request contexts in which judgments have been
made in the past with the current request, we require a
measure for the semantic similarity of two requests, i.e., the
similarity of the service requirements they encode. In this
section, we will propose such a measure. It recursively
computes the similarity sim(r, r') of two request trees r and r'
by computing the similarity of their root nodes'
ontological types (sim_type(root(r), root(r'))) and direct conditions
(sim_dc(root(r), root(r'))) and the aggregated similarity of
their root nodes' child trees (sim_attr(root(r), root(r'))). More
specifically, we define sim(r, r') to be the mean of these three
values. In the remainder of the section, we will explain the
rationale behind those three similarity values and
detail how to determine them. Possible similarity values
sim(r, r') are from the interval [0, 1], where a similarity value
of 0.0 means "not similar at all" and a value of 1.0 means
that the service requirements encoded by the two requests are
identical.</p>
<p>[Fig. 2 residue: example request tree with nodes product, price (0.3), currency (0.05), manufacturer and model, annotated with the computed entry lists, e.g. {[1,0.2], [2,0.1]} and {[1,0.05], [2,0.0]}.]</p>
      <sec id="sec-10-1">
        <title>Determining the type similarity.</title>
        <p>
The type similarity sim_type(n, n') &#8712; [0, 1] of two nodes n
and n' indicates how similar those nodes are with respect
to their ontological type. It is defined similarly to Jaccard's
index [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], which is often used to compare sample sets:
sim_type(n, n') = |A_n &#8745; A_n'| / |A_n &#8746; A_n'|
        </p>
        <p>(4)
where A_n is the set of attributes defined for the type of n
and A_n' is the set of attributes defined for the type of n'.</p>
<p>The type similarity sim_type(n, n') for the root nodes of the
requests depicted in Fig. 3 is |{battery, phoneType, color}| / |{battery, phoneType, color, style}| =
0.75.</p>
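<p>As a sketch, the type similarity can be computed directly from the two attribute sets; the concrete sets below are those assumed in the Fig. 3 example:</p>

```python
def sim_type(attrs_n, attrs_n2):
    """Type similarity (Formula 4): Jaccard index of the attribute
    sets defined for the two nodes' ontological types."""
    a, b = set(attrs_n), set(attrs_n2)
    return len(a & b) / len(a | b)

# Attribute sets assumed for the types Phone and MobilePhone (Fig. 3):
phone = {"battery", "phoneType", "color"}
mobile_phone = phone | {"style"}
print(sim_type(mobile_phone, phone))  # 0.75
```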
        <p>Determining the similarity of the direct conditions.</p>
<p>The similarity sim_dc(n, n') &#8712; [0, 1] of two nodes n and n'
indicates how similar those nodes are with respect to their
direct conditions. As mentioned in Sect. 4, direct conditions
restrict acceptable values of a service attribute. For each
kind of direct condition that might be specified for a certain
attribute, we define a separate similarity measure. For
example, for direct conditions of type IN {...}, the similarity
is determined as the number of common values divided by
the number of values that are allowed for n
or n'. For direct conditions of type &lt;= x and &gt;= x, the
similarity is calculated as min{x, y} / max{x, y}, where x is
the upper/lower bound for the values of n and y for those of
n'. Accordingly, if only one of the nodes specifies a certain
type of direct condition, the similarity is defined to be 0.0,
and if neither node specifies any direct conditions, the
similarity is defined to be 1.0.</p>
<p>As an example, consider again the requests depicted in
Fig. 3. The Color-nodes both specify a direct condition
of type IN {...}. The similarity sim_dc(n_color, n'_color) with
respect to this direct condition is 1/2 = 0.5. Neither
Battery-node specifies any direct conditions, hence
sim_dc(n_battery, n'_battery) = 1.0.</p>
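<p>A minimal sketch of these two direct-condition measures; the example condition values are hypothetical, since the Color example in the text only fixes the resulting ratio 1/2:</p>

```python
def sim_dc_in(values_n, values_n2):
    """Similarity of two IN {...} conditions: common values divided
    by the values allowed for n or n' (i.e. their union)."""
    a, b = set(values_n), set(values_n2)
    return len(a & b) / len(a | b)

def sim_dc_bound(x, y):
    """Similarity of two <= x (resp. >= x) conditions with bounds x, y."""
    return min(x, y) / max(x, y)

# Hypothetical Color conditions yielding the ratio 1/2 from the text:
print(sim_dc_in({"red"}, {"red", "blue"}))  # 0.5
print(sim_dc_bound(80, 100))                # 0.8
```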
        <p>Determining and aggregating the similarity of the root
nodes’ child trees.</p>
        <p>The similarity value simattr(n; n0) 2 [0; 1] indicates how
similar two nodes n and n0 are with respect to their child
trees. Let A be the set of attributes de ned either for
the type of n, the type of n0 or for both types and let
fsim(ra; ra0)ja 2 Ag be the similarity values for
corresponding attribute subtrees ra and ra0 of n and n0. Again,
inspired by Jaccard's index, the aggregated similarity of two
nodes' child trees is de ned as the sum of the similarity
values fsim(ra; ra0)ja 2 Ag divided by the sum of the maximal
similarity values that can be achieved for each attribute, i.e.
jAj.</p>
        <p>sim_attr(n, n') = ( &#931;_{a&#8712;A} sim(r_a, r'_a) ) / |A|   (5)</p>
        <p>
Since attributes in A are not necessarily defined for both the
type of n and the type of n', we set sim(r_a, r'_a) = 0.0 if the attribute
a is not defined for one of the types. Attributes in A might also
not be specified in one or both of the nodes. If an attribute
a is specified in neither node, we set sim(r_a, r'_a) = 1.0;
else, if a is specified in just one of the nodes, sim(r_a, r'_a)
is defined to be sim(r_a, t') resp. sim(t, r'_a), where t is a
node having the most generic type defined for the attribute a.</p>
<p>As an example, consider this procedure for the root nodes of the two request fragments
depicted in Fig. 3. The type of r's root node is MobilePhone
and that of r''s root node is Phone. Assume that the
ontology defines the attributes battery, phoneType and color
for the type Phone and an additional attribute style for
the type MobilePhone, which is a subtype of Phone. The
similarity sim_attr(n, n') of the requests' root nodes n and
n' is determined by the similarity of their corresponding
child trees for the attributes A = {battery, phoneType, color,
style}. The attributes battery and color are specified in both
requests, hence the similarity values sim(r_battery, r'_battery)
and sim(r_color, r'_color) can be computed by determining the
request similarity for the request subtrees rooted at the
Battery-nodes and the subtrees rooted at the Color-nodes.</p>
<p>The attribute style is only defined for the type MobilePhone,
hence sim(r_style, r'_style) = 0.0. The attribute phoneType is
defined for both types, MobilePhone and Phone, but only
specified in r. Hence, r'_phoneType has to be replaced by a
node t' having the most generic type defined for the
attribute phoneType. Let PhoneType be this type. This means
that the type of the node that describes the attribute
phoneType has to be PhoneType or one of its subtypes. Presume
that MobilePhoneType is a subtype of PhoneType. The
similarity sim(r_phoneType, r'_phoneType) is then determined by
computing sim(r_phoneType, t'), where r_phoneType is the subtree of r
rooted at the MobilePhoneType-node.</p>
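<p>The aggregation of Formula 5 amounts to averaging the per-attribute subtree similarities over all attributes defined for either type. A sketch for the Fig. 3 example; only sim(r_style, r'_style) = 0.0 is fixed by the text, the other per-attribute values are assumed for illustration:</p>

```python
def sim_attr(per_attribute_sims):
    """Formula 5: aggregated child-tree similarity, i.e. the sum of
    the per-attribute subtree similarities divided by |A|."""
    return sum(per_attribute_sims.values()) / len(per_attribute_sims)

# Fig. 3 example with assumed subtree similarities; style contributes
# 0.0 because the attribute is defined for MobilePhone only:
sims = {"battery": 0.8, "color": 0.5, "phoneType": 0.6, "style": 0.0}
print(sim_attr(sims))
```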
<p>As discussed earlier, the parameters &#945; and &#946; that weight
the influence of the terms s_attributes(fs, r) and s_number(fs, r)
might vary from user to user. In this section, we
demonstrate how those values can be learned from a consumer's
past judgment behavior. Initially, i.e. without having
information about a user's previous judgment behavior, we
do not know anything about those parameters' values, so &#945;
could be any value from the interval [0, 1] and &#946; = 1 &#8722; &#945;.
Hence, for the purpose of computing the suitability
s(fs, r) of possible feedback structures, we set &#945; to the
midpoint of this interval, i.e. &#945; = 0.5 = &#946;. Once having
determined the most suitable feedback structure fs, we
present it to the consumer, who has the opportunity to
change it by expanding/collapsing nodes. Finally, the consumer
provides judgments for the resulting structure's leaf
nodes. Obviously, the resulting feedback structure fs' was
more suitable to the user than the structure fs that was
recommended. Hence, we conclude that s(fs', r) should be
larger than s(fs, r). Using Formula 3, we get
&#945; &gt; (s(fs, r) &#8722; s_number(fs', r)) / (s_attributes(fs', r) &#8722; s_number(fs', r)) for
s_attributes(fs', r) &gt; s_number(fs', r), and &#945; &lt; this quotient for s_attributes(fs', r)
&lt; s_number(fs', r). Using this information, we can adjust,
i.e. shrink, the range of &#945; correspondingly. For example, if
we get &#945; &lt; 0.8, we adjust the interval to [0, 0.8). In case
the consumer's judgment behavior is inconsistent, e.g. if the
current interval is (0.5, 0.7) and we derive a contradictory
constraint such as &#945; &gt; 0.8, we simply ignore this
information. To ensure that the most recent information
has the most influence, we process session data in the order
of increasing age.</p>
        <p>7. EVALUATION</p>
        <p>In the evaluation of our approach, we wanted to find out
how fast the recommendation algorithm proposed in Sect. 6
adjusts to different judgment preferences.</p>
        <p>Test runs and results.</p>
        <p>We performed test runs with different judgment preferences
and different sets of requests that were posed during
a sequence of sessions. In a first series of tests, the requests
within each sequence of sessions were different, but chosen
from a single (computer) category, e.g. just notebook
requests. This test setting served as a baseline and was chosen
to evaluate the performance of our approach in the absence
of any context effects. We performed three kinds of tests
differing in the judgment preferences of the judging user.
In test A1, the consumer always judged a certain number
of aspects. However, the types of aspects that were judged
differed. In test A2, the user judged a different number of
attributes during each session, but required that the set of
attributes to judge contained a certain set of attributes. For
example, a user might require to always judge the price of
a product, but also be willing to rate other service aspects.
Finally, we performed a test A3, where the consumer had
specific requirements on both the number and kind of aspects
to judge. The tests A1-A3 were performed with request
sets from different categories. The plot depicted in Fig. 4
(A2) is representative for all test runs and all types of tests
in this series. It shows the results for test A2 performed
with requests from the category digital watches. As can be
seen, the adaptation of the recommendation algorithm to
the consumer's judgment preferences is very fast. The initial
edit distance decreases to 0 after just one session. This is
due to the fact that request similarity does not play a role
in those tests, and hence the values of &#945; and &#946; can be
arbitrarily chosen. The depicted behavior was observed for all
three kinds of tests (A1-A3).</p>
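<p>The interval-shrinking update for &#945; can be sketched as follows; this is a minimal illustration assuming Formula 3 has the form s(fs, r) = &#945;&#183;s_attributes(fs, r) + (1 &#8722; &#945;)&#183;s_number(fs, r):</p>

```python
def alpha_bound(s_fs, s_attr2, s_num2):
    """Constraint on alpha implied by s(fs', r) > s(fs, r), assuming
    s(fs, r) = alpha*s_attributes + (1 - alpha)*s_number (Formula 3)."""
    q = (s_fs - s_num2) / (s_attr2 - s_num2)
    return (">", q) if s_attr2 > s_num2 else ("<", q)

def shrink(interval, bound):
    """Shrink alpha's interval by one constraint; ignore inconsistent ones."""
    lo, hi = interval
    op, q = bound
    if op == "<" and lo < q < hi:
        return (lo, q)   # tighter upper bound
    if op == ">" and lo < q < hi:
        return (q, hi)   # tighter lower bound
    return interval      # contradictory constraint: ignored

print(shrink((0.0, 1.0), ("<", 0.8)))  # (0.0, 0.8)
```

<p>Processing sessions in order of increasing age means the most recent constraints shrink the interval last and thus dominate.</p>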
      </sec>
      <sec id="sec-10-2">
        <title>Test setting.</title>
<p>For that purpose, we created a set of DSD service requests
covering typical requirements of consumers looking for
computer items from different categories, such as desktop PCs,
PDAs, servers, notebooks or organizers. For our tests, we
created 48 service requests, 6 per category. Requests within
each category varied in the selection of attributes that were
specified and in the range of attribute values that were
acceptable to the user. All request types shared common
attribute types, e.g. for all kinds of requests an attribute color
and an attribute price could be specified.</p>
<p>Using these requests, we performed several tests with a
single test user. The basic procedure for each test was as
follows. Starting with no information about previous judgment
behavior, several judgment sessions were performed.
During each session, one of the 48 requests was selected. After
that, the system proposed a feedback structure using the
algorithm proposed in Sect. 6 with knowledge about the user's
judgment behavior in the previous judgment sessions. After
being provided with the recommended feedback structure,
the user had the opportunity to change this structure. For
that purpose, the consumer was allowed to expand/collapse
feedback structure nodes. By clicking on a particular node,
all its direct children were expanded/collapsed. The quality
of the proposed feedback structure was measured as the edit
distance between the proposed feedback structure and the
actual feedback structure that was used. More formally, we
counted the number of expand/collapse operations the user
had to perform to get the structure whose leaves he finally
judged. The rationale behind this measure is that the edit
distance is a direct measure of the user's effort to get to the
desired structure and thus, in our opinion, is a good measure
for the quality of the recommended structure. For each of
the tests, we looked at whether and how fast the edit distance
decreased with the number of judgment sessions.</p>
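<p>A minimal sketch of this quality measure, assuming feedback structures are represented by their sets of expanded nodes (a hypothetical encoding; each expand/collapse click toggles exactly one node):</p>

```python
def edit_distance(proposed, actual):
    """Number of expand/collapse clicks needed to turn the proposed
    feedback structure into the one actually judged: the nodes whose
    expansion state differs, i.e. the symmetric difference."""
    return len(proposed ^ actual)

# One click: collapse the "price" node of the proposed structure.
print(edit_distance({"product", "price"}, {"product"}))  # 1
```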
<p>In a second series of tests, we evaluated how fast the
proposed recommendation algorithm adjusts to a consumer's
judgment preferences if those depend on the request
context. For that purpose, we performed judgment sessions
where the user posed requests from different categories and
exhibited a different judgment behavior for each category.
We ran three types of tests. In test B1, similarly to test
A1, the user always judged a particular number of aspects.
However, this number differed for each request category. For
example, a user might always judge 3 aspects when asking
for desktop PCs, but be willing to judge 5 service aspects
when asking for notebooks. Analogously to test A2, the
user in test B2 required the set of aspects to be judged to
contain a particular set of aspects. However, this set
varied for different request categories. Finally, we performed
a test B3, where the consumer had specific requirements on
both the number and kind of aspects to judge. Those
requirements were different for each request category. Fig. 4
(B1) exemplarily shows the results for tests of type B1. In
the depicted test, we alternated sessions based on a request
for a desktop PC (continuous line), where the user judged
11 service aspects, with those based on a request for a PDA
(dotted line), where the consumer judged only one aspect.
As can be seen, the adjustment to the consumer's judgment
preferences for PDAs takes 3 sessions. This is due to the
fact that at the beginning both terms s_attributes and s_number
are equally weighted. Since for desktop PCs many aspects
are judged and since most of those aspects are also shared
by PDA requests, the term s_attributes dominates the suitability
value and thus favors improper feedback structures. This
changes when &#945; and &#946; adjust over time. Fig. 5 (B2)
exemplarily shows the results for tests of type B2. Again, we
alternated desktop PC requests with those for a PDA. While
when judging desktop PCs we had a set of two aspects that
had to be judged in any case, it was only one specific aspect
when judging PDAs. Again, it required 4 sessions to adjust
&#945; and &#946; appropriately. Finally, Fig. 5 (B3) exemplarily shows
the results for tests of type B3. In this test, we alternated
three types of requests (desktop PC, PDA and digital watch
requests). As can be seen, the algorithm proposes
appropriate feedback structures after just 1 session of each type.
This is due to the fact that, for the three request types, the
consumer's judgment behavior differed greatly in terms of the
number and types of aspects to be judged. Hence, though &#945;
and &#946; are not yet adjusted, the correct feedback structure
can be identified.</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
<p>In this paper, we demonstrated how detailed consumer
feedback that is meaningful and appropriate in the context
of a service interaction can be elicited, and how users can be
supported in that process. Our main contribution is an
algorithm that suggests service aspects that might be judged
by a consumer. Our evaluation results show that the
proposed procedure effectively adjusts to a user's ability and
willingness to provide judgments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          .
          <article-title>Incorporating contextual information in recommender systems using a multidimensional approach</article-title>
          .
          <source>ACM Transactions on Information Systems</source>
          ,
          <volume>23</volume>
          (
          <issue>1</issue>
          ):
          <fpage>103</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Dawes</surname>
          </string-name>
          .
          <article-title>The robust beauty of improper linear models in decision making</article-title>
          .
          <source>American Psychologist</source>
          ,
          <volume>34</volume>
          (
          <issue>7</issue>
          ):
          <fpage>571</fpage>
          &#8211;
          <lpage>582</lpage>
          ,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lausen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          , J. de Bruijn,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stollberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingue</surname>
          </string-name>
          .
          <source>Enabling Semantic Web Services: The Web Service Modeling Ontology</source>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kwon</surname>
          </string-name>
          .
          <article-title>New recommendation techniques for multi-criteria rating systems</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>22</volume>
          (
          <issue>3</issue>
          ),
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Jaccard</surname>
          </string-name>
          .
          <article-title>Étude comparative de la distribution florale dans une portion des Alpes et des Jura</article-title>
          .
          <source>Bulletin de la Societe Vaudoise des Sciences Naturelles</source>
          ,
          <volume>37</volume>
          :
          <fpage>547</fpage>
          &#8211;
          <lpage>579</lpage>
          ,
          <year>1901</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Klan</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>König-Ries</surname>
          </string-name>
          .
          <article-title>Enabling trust-aware semantic web service selection - a flexible and personalized approach</article-title>
          .
          <source>Jenaer Schriften zur Mathematik und Informatik</source>
          , Math/Inf/02/10,
          Friedrich-Schiller-University Jena,
          <year>August 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>U.</given-names>
            <surname>Küster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>König-Ries</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Klein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Stern</surname>
          </string-name>
          .
          <article-title>Diane - a matchmaking-centered framework for automated service discovery, composition, binding and invocation</article-title>
          .
          <source>In Proceedings of the 16th International World Wide Web Conference (WWW2007)</source>
          , Banff, Alberta, Canada, May
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>U. S.</given-names>
            <surname>Manikrao</surname>
          </string-name>
          and
          <string-name>
            <given-names>T. V.</given-names>
            <surname>Prabhakar</surname>
          </string-name>
          .
          <article-title>Dynamic selection of web services with recommendation system</article-title>
          .
          <source>In Intl. Conf. on Next Generation Web Services Practices</source>
          , pages
          <fpage>117</fpage>
          &#8211;
          <lpage>121</lpage>
          , Washington, DC,
          <year>2005</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Schafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Frankowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Herlocker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <article-title>Collaborative filtering recommender systems</article-title>
          . In P. Brusilovsky,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kobsa</surname>
          </string-name>
          , and W. Nejdl, editors,
          <source>The Adaptive Web: Methods and Strategies of Web Personalization</source>
          , volume
          <volume>4321</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>291</fpage>
          &#8211;
          <lpage>324</lpage>
          . Springer, Berlin, Heidelberg,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Slovic</surname>
          </string-name>
          .
          <article-title>Limitations of the Mind of Man: Implications for decision making in the nuclear age</article-title>
          . Los Alamos Scientific Laboratory,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H. C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Ho</surname>
          </string-name>
          .
          <article-title>Combining subjective and objective qos factors for personalized web service selection</article-title>
          .
          <source>Expert Syst. Appl.</source>
          ,
          <volume>32</volume>
          (
          <issue>2</issue>
          ):
          <fpage>571</fpage>
          &#8211;
          <lpage>584</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>