<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Inference of performance annotations in Web Service composition models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio García-Domínguez</string-name>
          <email>antonio.garciadominguez@uca.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Inmaculada Medina-Bulo</string-name>
          <email>inmaculada.medina@uca.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mariano Marcos-Bárcena</string-name>
          <email>mariano.marcos@uca.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Languages and Systems, University of Cádiz</institution>
          ,
          <addr-line>C/Chile 1, CP 11003, Cádiz</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mechanical Engineering and Industrial Design, University of Cádiz</institution>
          ,
          <addr-line>C/Chile 1, CP 11003, Cádiz</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>High-quality services must keep working reliably and efficiently, and service compositions are no exception. As they integrate several internal and external services over the network, they need to be carefully designed to meet their performance requirements. Current approaches assist developers in estimating whether the selected services can fulfill those requirements. However, they do not help developers define requirements for services lacking performance constraints and historical data. Manually estimating these constraints is a time-consuming process which might underestimate or overestimate the required performance, incurring additional costs. This work presents the first version of two algorithms which infer the missing performance constraints from a service composition model. The algorithms are designed to spread the load equally across the composition according to the probability and frequency with which each service is invoked, and to check that the performance constraints of each service are consistent with those of the composition.</p>
      </abstract>
      <kwd-group>
        <kwd>service level agreement</kwd>
        <kwd>load testing</kwd>
        <kwd>UML activity diagrams</kwd>
        <kwd>service oriented architecture</kwd>
        <kwd>service compositions</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Service-oriented architectures have been identified as an effective method to
reduce costs and increase flexibility in IT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Information is shared across the
entire organization and beyond it as a collection of services, managed according
to business needs and usually implemented as Web Services (WS).
      </p>
      <p>
        It is often necessary to combine several WS into a single WS with more
functionality, known as a service composition. Workflow languages such as WS-BPEL
2.0 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or BPMN [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] have been specifically designed for this, letting the user
specify the composition as a graph of activities.
      </p>
      <p>
        Just as any other software, service compositions need to produce the required
results in a reasonable time. This is complicated by the fact that service
compositions depend on both externally and internally developed services and the
network. For this reason, there has been considerable work in estimating the
quality of service (QoS) of a service composition from its tasks’ QoS [
        <xref ref-type="bibr" rid="ref4 ref5">4,5</xref>
        ] and
selecting the combination of services to be used [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>However, these approaches assume that the services already exist and
include annotations with their expected performance. For this reason, they are
well suited to bottom-up approaches to developing service compositions. On
the other hand, they do not fit well in a top-down approach, where the user
defines the composition and its target QoS before that of the services: some of
them might not even be implemented yet. Historical data and formal service
level agreements will not be available for these services, and for new problem
domains, the designer might not have enough experience to accurately estimate
their QoS.</p>
      <p>Inaccurate QoS estimates for workflow tasks incur increased costs. If the
estimated QoS is set too low, the workflow QoS might not be fulfilled, violating
existing service level agreements. If it is set too high, service level agreements
will be stricter than required, resulting in higher fees for external services and
higher development costs for internal services.</p>
      <p>In the absence of other information, the user could derive initial values for
these missing estimates according to a set of assumptions. For instance, the user
might want to meet the workflow QoS by splitting the load equally over all tasks
according to the probability and frequency with which they are invoked, requiring
as little performance as possible. Nevertheless, it might be difficult to calculate
these values by hand for complex compositions where only some services are
annotated.</p>
      <p>In this work we propose two algorithms designed to assist the user in this
process, following the assumptions above. The algorithms derive initial values for
the response time under a given load for all elements in a service composition
model inspired by UML activity graphs. The algorithms have been successfully
integrated into the models of an existing model-driven SOA methodology.</p>
      <p>The structure of the rest of this text is as follows: in section 2 we describe the
generic graph metamodel both inference algorithms work with. The algorithms
are described in detail in sections 3 and 4. Section 5 discusses the adaptations
required to integrate them into an existing model-driven SOA methodology.
Section 6 evaluates the algorithms and the tools. Finally, related works are listed
and some conclusions are offered, along with an outline of our future work.
</p>
    </sec>
    <sec id="sec-2">
      <title>Graph metamodel</title>
      <p>The inference algorithms are defined for a generic graph metamodel. The
metamodel is a simplification of UML activity diagrams, and uses mostly the same
notation. The new global and local performance constraints are represented as
stereotyped annotations.</p>
      <p>
        A simplified UML class diagram for the ECore [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] graph metamodel is shown
in figure 1. Graphs contain a set of FlowNodes and FlowEdges and have
a manual global PerformanceAnnotation, which indicates all paths in the
graph should finish in less than secsTimeLimit seconds while handling
concurrentRequests requests concurrently. There are several kinds of FlowNodes:
Activities encapsulate some behavior, described textually in the attribute name.
      </p>
      <p>Activities can have manual or automatic performance annotations.
Initial nodes are the starting execution points of the graph: there is exactly one
per graph. Final nodes end the current execution branch: there can be more than
one. Decision nodes select one execution branch among several, depending on
which of the conditions on their outgoing edges holds. Only the outgoing FlowEdges
from a DecisionNode may have a non-empty condition and a probability
less than 1.</p>
      <p>Fork nodes split the current execution branch into several parallel branches.
Join nodes integrate several branches back into one, whether they started off
from a decision node or a fork node. This is a simplification of UML,
which uses a different element to integrate each kind of branch: merge nodes
and join nodes, respectively.</p>
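      <p>The metamodel above can also be sketched as plain Python classes. The sketch below is an illustrative stand-in for the actual Ecore model: class and attribute names follow figure 1, while the connect helper is an assumption of ours for wiring edges to their endpoints.</p>

```python
# Illustrative Python stand-in for the Ecore graph metamodel of figure 1.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PerformanceAnnotation:
    concurrent_requests: float
    secs_time_limit: float
    manual: bool

@dataclass
class FlowNode:
    incoming: list = field(default_factory=list)   # FlowEdge objects
    outgoing: list = field(default_factory=list)

@dataclass
class Activity(FlowNode):
    name: str = ""
    annotation: Optional[PerformanceAnnotation] = None

class InitialNode(FlowNode): pass
class FinalNode(FlowNode): pass
class DecisionNode(FlowNode): pass
class ForkNode(FlowNode): pass
class JoinNode(FlowNode): pass

@dataclass
class FlowEdge:
    source: FlowNode
    target: FlowNode
    condition: str = ""
    probability: float = 1.0   # below 1 only on edges leaving a DecisionNode

def connect(src: FlowNode, dst: FlowNode, probability: float = 1.0) -> FlowEdge:
    """Create an edge and register it on both endpoints (helper, not in the metamodel)."""
    edge = FlowEdge(source=src, target=dst, probability=probability)
    src.outgoing.append(edge)
    dst.incoming.append(edge)
    return edge
```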
      <p>[Figure 1: simplified UML class diagram of the generic graph metamodel.
A Graph holds FlowNodes and FlowEdges and may have a PerformanceAnnotation
(concurrentRequests, secsTimeLimit and manual attributes). A FlowEdge has a
condition, a probability and source/target FlowNodes, which list their incoming
and outgoing edges. Activity (with a name attribute), InitialNode, FinalNode,
ForkNode, JoinNode and DecisionNode specialize FlowNode.]</p>
      <p>A sample model using this metamodel is shown in figure 2, which describes
a process for handling an order. Execution starts from the initial node, and
proceeds as follows:
1. The order is evaluated and is either accepted or rejected.
2. If rejected, close the order: we are done.
3. Otherwise, fork into 2 execution branches:
(a) Create the shipping order and send it to the shipping partner.
(b) Create the invoice, send it to the customer and receive its payment.
4. Once these 2 branches complete their execution, close the order.</p>
      <p>There are two manual performance annotations: a global one (using the
&lt;&lt;gpc&gt;&gt; stereotype) and a local one (using &lt;&lt;pc&gt;&gt;). The whole process must
be able to serve 5 concurrent requests in less than 1 second, and evaluating an
order should take at most 0.4 seconds with 5 concurrent requests.</p>
      <p>[Figure 2: sample order-handling model. A global &lt;&lt;gpc&gt;&gt; annotation sets
concurrentRequests = 5, timeLimit = 1, manual = true. “Evaluate Order” carries
a manual &lt;&lt;pc&gt;&gt; annotation (concurrentRequests = 5, timeLimit = 0.4), and
the decision edges are labeled [rejected] (p = 0.2) and [else] (p = 0.8). The
automatic (manual = false) annotations inferred by the algorithms are:
“Create Shipping Order” with concurrentRequests = 4 and timeLimit = 0.4,
“Create Invoice” and “Perform Payment” with concurrentRequests = 4 and
timeLimit = 0.2, and “Close Order” with concurrentRequests = 5 and
timeLimit = 0.2.]</p>
    </sec>
    <sec id="sec-3">
      <title>Inference of concurrent requests</title>
      <p>The first algorithm infers the number of concurrent requests that each
activity in the graph must handle within its time limit in order to meet the
performance requirements of the whole graph. It performs a pre-order breadth-first
traversal of every node and edge in the graph, starting from the initial node
and caching intermediate results. The algorithm annotates each activity with its
expected number of concurrent requests C(x). The definition of C(x) depends
on the element type:
– InitialNode: C(x) is the value of the attribute concurrentRequests of the
global performance annotation of the graph.
– FlowEdge: C(x) = P(x)C(s), where x is the edge, s is the source node of
x, P(x) is the value of the attribute probability of x and C(s) is the computed
number of concurrent requests for s. Most edges have P(x) = 1, except those
which start from a decision node.
– JoinNode: the least common ancestor LCA(P) of the set P of all the parent
nodes is computed. This least common ancestor is the point from which the
conditional or parallel branches started off. If LCA(P) is a decision node,
C(x) = Σp∈P C(p), as requests can only visit one of the branches. Otherwise,
it is a fork node, and C(x) = maxp∈P C(p), as requests can visit all of the
branches at the same time.</p>
      <p>
        To compute the LCA of several nodes, the naive method described by
Bender et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] works best for the sparse graphs which usually result from
modeling service compositions. For a graph with n vertices and e edges, its
preprocessing step requires O(n + e) operations and each query performed
may require up to O(n²) operations. This only needs to be done once for
each join node.
– Otherwise, C(x) = C(i), where i is its only incoming edge. If the node is an
activity with a manual performance annotation, the inferred value should
match its value; if they do not match, the user will need to confirm whether
the annotation should be updated.
      </p>
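      <p>As an illustration of the LCA query for join nodes, a naive ancestor-set intersection over the parent links of the graph can be sketched as follows. This is a simplified stand-in for illustration only, not the preprocessing-based method of Bender et al.; the dict encoding of parent links is an assumption of ours.</p>

```python
# Naive LCA for workflow DAGs, sketched as ancestor-set intersection over a
# dict mapping each node to the list of its parents.
def ancestors(parents, node):
    """Every ancestor of `node`, including `node` itself."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, []))
    return seen

def depth(parents, node):
    """Length of the longest parent chain from `node` back to a source node."""
    ps = parents.get(node, [])
    return 0 if not ps else 1 + max(depth(parents, p) for p in ps)

def lca(parents, nodes):
    """Deepest node that is an ancestor of every node in `nodes`."""
    common = set.intersection(*(ancestors(parents, n) for n in nodes))
    return max(common, key=lambda n: depth(parents, n))
```

For the join node of a two-branch split, lca applied to its two parents returns the decision or fork node where the branches started off.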
      <p>Using these formulas, computing C(Create Invoice) for the example shown
in figure 2 would require walking back to the initial node, finding an edge with
a probability of p = 0.8, no join nodes and a global performance annotation
of G = 5 concurrent requests. Therefore, C(Create Invoice) = pG = 4.</p>
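      <p>The whole traversal can be sketched in Python as follows. The encoding is hypothetical: a kind dictionary classifies each node, edges map each node to (target, probability) pairs, and lca_kind records whether the least common ancestor of each join node's parents is a decision or a fork (which the LCA query above would compute).</p>

```python
# Sketch of the concurrent-request inference on an acyclic workflow graph.
def topological_order(kind, edges):
    """Kahn's algorithm over the hypothetical dict encoding of the graph."""
    indegree = {n: 0 for n in kind}
    for n in kind:
        for target, _ in edges.get(n, []):
            indegree[target] += 1
    pending = [n for n in kind if indegree[n] == 0]
    order = []
    while pending:
        n = pending.pop()
        order.append(n)
        for target, _ in edges.get(n, []):
            indegree[target] -= 1
            if indegree[target] == 0:
                pending.append(target)
    return order

def infer_concurrent_requests(kind, edges, lca_kind, global_requests):
    incoming = {n: [] for n in kind}   # C(e) of each incoming edge e
    c = {}
    for n in topological_order(kind, edges):
        if kind[n] == "initial":
            c[n] = global_requests     # concurrentRequests of the global annotation
        elif kind[n] == "join":
            # sum over conditional branches, max over parallel branches
            aggregate = sum if lca_kind[n] == "decision" else max
            c[n] = aggregate(incoming[n])
        else:
            c[n] = incoming[n][0]      # single incoming edge
        for target, probability in edges.get(n, []):
            incoming[target].append(probability * c[n])   # C(x) = P(x) C(s)
    return c
```

Running it on an encoding of figure 2 yields C = 4 for "Create Invoice" and C = 5 for "Close Order", matching the annotations in the figure.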
    </sec>
    <sec id="sec-4">
      <title>Inference of time limits</title>
      <p>The algorithm required to infer the time limits for each activity in the graph is
more complicated than the previous one. It adds time limits to the activities in
each path from the initial node to any of the final nodes, ensuring the inferred
time limits meet several assumptions.</p>
      <p>This section starts with the requirements imposed on the results of the
algorithm and lists the required definitions. The rest of the section describes the
algorithm and shows an example of its application.
</p>
      <sec id="sec-4-1">
        <title>Design requirements</title>
        <p>The inferred time limits must meet the following requirements:
1. The sum of the time limits of the activities in each path from the initial node
to any of the final nodes must be less than or equal to the global time limit.
2. Available time should be split equally over the activities without any manual
performance annotations. We cannot blindly assume that one activity will
need to be more efficient than another. Ideally, we would like to split the
load evenly over all activities.
3. Time limits should not be lower than strictly required: better performing
systems tend to be more expensive to build and maintain.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Definitions</title>
        <p>To describe the algorithm, several definitions are needed. These definitions
consider paths as sets of activities, rather than sequences of nodes.</p>
        <p>– M(a) is true if the activity a has a manual time limit.
– A(a) is true if the activity a has an automatic time limit.
– U(p) = {a | a ∈ p ∧ ¬(A(a) ∨ M(a))} is the set of all activities in the path p
which do not have time limits. These are said to be unrestricted.
– F(p) = {a | a ∈ p ∧ ¬M(a)} is the set of all activities in the path p which do
not have manual time limits. These are said to be free.
– TL(G) is the manual time limit for every path in the graph G.
– TL(a) is the current time limit for the activity a.
– TL(p) = Σ{a ∈ p | M(a) ∨ A(a)} TL(a) is the sum of the time limits in the
path p.
– TM(p) = Σa ∈ (p − F(p)) TL(a) is defined as the sum of the manual time limits
in the path p.
– SA(p, G) = TL(G) − TM(p) is the slack which can be split among the free
activities in path p of the graph G.
– TE(p, G) = SA(p, G) / (1 + |F(p)|) is an estimate of the automatic time limit
each free activity in path p would obtain using only the information in that
path. More restrictive paths will have lower values of TE(p, G).</p>
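        <p>The last two definitions can be checked against the running example with a small Python sketch; the path sets and the manual-limit dictionary below are transcribed from figure 2.</p>

```python
# Worked values of SA(p, G) and TE(p, G) for the running example.
def slack(global_limit, manual, path):
    """SA(p, G): the global limit minus the manual time limits found on the path."""
    return global_limit - sum(manual.get(a, 0.0) for a in path)

def estimate(global_limit, manual, path):
    """TE(p, G) = SA(p, G) / (1 + |F(p)|), with F(p) the activities lacking manual limits."""
    free = [a for a in path if a not in manual]
    return slack(global_limit, manual, path) / (1 + len(free))

manual = {"Evaluate Order": 0.4}
p3 = {"Evaluate Order", "Create Invoice", "Perform Payment", "Close Order"}
# estimate(1.0, manual, p3) == (1 - 0.4) / (1 + 3) == 0.15
```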
      </sec>
      <sec id="sec-4-3">
        <title>Description of the algorithm</title>
        <p>The algorithm follows these steps:
1. All automatic time limits are removed.
2. The set P of all paths from the initial node to each final node is calculated.
3. For each path p ∈ P, in ascending order of TE(p, G), so that the more
restrictive paths are visited first:
(a) If SA(p, G) = 0, the condition |F(p)| &gt; 0 is evaluated. If it is true,
there is at least one free activity in p for which no time limit can be set.
The user is notified of the situation and execution is aborted. Otherwise,
processing continues with the next path.
(b) If SA(p, G) &lt; 0, then TM(p) &gt; TL(G) holds, by definition. This means
that the sum of the manual time limits in p exceeds the global time limit.
The user is notified of the situation and execution is aborted.
(c) Otherwise, SA(p, G) &gt; 0. If |F(p)| &gt; 0 ∧ |U(p)| &gt; 0, all activities in U(p)
are updated to the time limit (TL(G) − TL(p)) / |U(p)|.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Example</title>
        <p>To infer the automatic time limits of the running example shown in figure 2, we
first need to list all existing paths:
– p1 = {Evaluate Order, Close Order}.
TE(p1, G) = (1 − 0.4)/(1 + 1) = 0.3.
– p2 = {Evaluate Order, Create Shipping Order, Close Order}.
TE(p2, G) = (1 − 0.4)/(1 + 2) = 0.2.
– p3 = {Evaluate Order, Create Invoice, Perform Payment, Close Order}.
TE(p3, G) = (1 − 0.4)/(1 + 3) = 0.15.</p>
        <p>p3 has the lowest value for TE, so it is visited first. “Create Invoice”, “Perform
Payment” and “Close Order” are in U(p3), so they are updated to l = (1 − 0.4)/3 = 0.2.</p>
        <p>p2 is visited next. Only “Create Shipping Order” is in U(p2), so it is updated
to l = (1 − 0.6)/1 = 0.4.</p>
        <p>p1 is visited last. U(p1) = ∅, so nothing is updated, and we are done.</p>
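        <p>The whole procedure can be sketched in Python as follows. The path and limit encodings are illustrative assumptions, and the float comparison against zero stands in for the exact test of the algorithm; the abort conditions are rendered as exceptions.</p>

```python
# Sketch of the time-limit inference of section 4.3 over paths given as
# lists of activity names, with manual limits in a dict.
def infer_time_limits(paths, manual, global_limit):
    inferred = {}                       # automatic limits (step 1: start empty)
    def free(p):                        # F(p): activities without manual limits
        return [a for a in p if a not in manual]
    def unrestricted(p):                # U(p): activities with no limit at all
        return [a for a in p if a not in manual and a not in inferred]
    def slack(p):                       # SA(p, G)
        return global_limit - sum(manual.get(a, 0.0) for a in p)
    def estimate(p):                    # TE(p, G)
        return slack(p) / (1 + len(free(p)))

    for p in sorted(paths, key=estimate):       # most restrictive paths first
        if slack(p) < 0:
            raise ValueError("manual limits exceed the global limit")
        if slack(p) == 0 and unrestricted(p):
            raise ValueError("no slack left for some free activity")
        u = unrestricted(p)
        if u:
            assigned = sum(manual.get(a, inferred.get(a, 0.0)) for a in p)
            share = (global_limit - assigned) / len(u)   # (TL(G) - TL(p)) / |U(p)|
            for a in u:
                inferred[a] = share
    return inferred
```

Running it on the three paths above with TL(G) = 1 and the manual limit of 0.4 s on "Evaluate Order" reproduces the inferred limits: 0.2 s for "Create Invoice", "Perform Payment" and "Close Order", and 0.4 s for "Create Shipping Order".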
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Integration in a SOA methodology</title>
      <p>The algorithms described above work on a generic graph metamodel which needs
to be adapted to the metamodels used in specific technologies and methodologies.
In this section we show how the metamodel and the algorithms were adapted
and integrated into a SOA methodology: SODM+Testing.</p>
      <p>
        Existing methodologies reduce the development cost of SOAs, but do not
take testing aspects into account [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. SODM+Testing is a new methodology that
extends SODM [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] with testing tasks, techniques and models.
      </p>
      <sec id="sec-5-1">
        <title>Metamodel adaptation</title>
        <p>The original SODM service process and service composition metamodels were
a close match to the generic graph metamodel described in section 2, as they
were also based on UML activity graphs. Service process models described how
a request for a service offered by the system should be handled, and service
compositions provided details on how the tasks were split among the partners
and what messages were exchanged between them.</p>
        <p>SODM+Testing refactors these models using the generic graph metamodel
as their core. Each class in the SODM process and composition metamodels can
now be upcast to the matching class of the generic graph metamodel both
inference algorithms are based on. For instance, ProcessFlowEdge (the class of
all flow edges in the SODM service process metamodel) has become a subclass
of FlowEdge (the class of all flow edges in the generic graph metamodel).</p>
        <p>The generic core differs slightly from that in figure 1. FlowEdges are split
into ControlFlows and ObjectFlows, and performance annotations have
also been split into several types according to their scope. In addition, the
SODM+Testing service composition metamodel has been extended so
activities can contain action sub-graphs with their own local and global performance
annotations.
</p>
      </sec>
      <sec id="sec-5-2">
        <title>Tool implementation</title>
        <p>
          SODM+T is a collection of Eclipse plug-ins that assists the SODM+Testing
methodology with graphical editors for the SODM+Testing service process and
composition models. It is available under the Eclipse Public License v1.0 [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          To ensure the inference algorithms can be applied, models are validated
automatically upon saving, showing the usual error and warning markers and
Quick Fixes. Validation has been implemented using Epsilon Validation
Language (EVL) scripts [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and OCL constraints.
        </p>
        <p>
          The algorithms are implemented in the Epsilon Object Language (EOL) and
can be launched from the contextual menu of the graphical editor. The EOL
scripts include two supporting algorithms for detecting cycles in graphs and
computing the least common ancestor of two nodes [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Evaluation</title>
      <p>In this section we will evaluate each algorithm, pointing out their current
limitations and how we plan to overcome them. Later on, we will evaluate their
theoretical performance. Finally, we will discuss the current state of SODM+T.
</p>
      <sec id="sec-6-1">
        <title>Inference of concurrent requests</title>
        <p>This algorithm performs a breadth-first traversal on the graph, and assumes the
number of concurrent requests has already been computed for all parent nodes.
For this reason, it is currently limited to acyclic graphs. We intend to extend the
model to allow loops, limiting their maximum expected number of iterations.</p>
        <p>The algorithm requires edges to be manually annotated with probabilities.
In the absence of other information, these could simply be initialized so that each
branch of each decision node has the same probability of being activated.</p>
        <p>The algorithm assumes only the current workflow is being run, thereby
providing a best-case scenario. More realistic estimations would be obtained if
restrictions from all workflows in the system were aggregated.</p>
        <p>Assuming the naive LCA algorithm has been used, this algorithm completes
its execution for a graph with n nodes after O(n³) operations, as its outer loop
visits each node in the graph to perform O(n²) operations for JoinNodes and
O(1) operations for the rest. This is a conservative upper bound, as the O(n²)
time for the naive LCA algorithm is for dense graphs, which are uncommon in
service compositions.</p>
      </sec>
      <sec id="sec-6-2">
        <title>Inference of time limits</title>
        <p>As there must be a finite number of paths in the graph, the current version
requires the graph to be acyclic, just as the previous algorithm. The algorithm
could reduce paths with loops to regular sequences, where the activities in the
loop would be weighted to take into account the maximum expected number of
iterations.</p>
        <p>The algorithm only estimates maximum deterministic time limits.
Probability distributions are not computed, as it does not matter which path is actually
executed: all paths should finish in the required time.</p>
        <p>In addition, the user cannot adjust the way in which the available slack
is distributed among the tasks. For instance, the user might know that a certain
task would usually take twice as much time as some other task. Weighting the
activities as suggested above would also solve this problem.</p>
        <p>Finally, like the previous algorithm, the time limit inference algorithm is
currently limited to the best-case scenario where only the current workflow is
being run.</p>
        <p>A formal proof of the correctness of the time limit inference algorithm is
outside the scope of this paper. Nevertheless, we can show that the results produced
by the algorithm meet the three design requirements listed in § 4.1. We assume
the model is valid (acyclic and weakly connected, among other restrictions) and
that the algorithm completes its execution successfully:
1. The first requirement is met by sorting the paths from most to least
restrictive. For the i-th path pi, only the nodes for which pi provides the strictest
constraints (that is, U(pi)) are updated, so that TL(pi) ≤ TL(G).
Since the activities from the previous (and stricter) paths were not changed,
TL(pj) ≤ TL(G) for all j ≤ i. Once all paths have been processed, TL(p) ≤ TL(G)
holds for all p ∈ P.
2. The second requirement is met by simply distributing the available slack
equally using a regular division.
3. The third requirement is met by visiting the most restrictive paths first, so
activities obtain the strictest time limits as soon as possible, leaving more
slack for those that can use it.</p>
        <p>The running time of the algorithm is highly dependent on the number of
unique paths in the graph. Without loss of generality, we assume all forks and
decision nodes are binary. If there are b decision and fork nodes in total among
the n nodes, there will be 2^b unique paths to be considered.</p>
        <p>Enumerating the existing paths requires a depth-first traversal of the graph,
which may have to consider the O(n²) edges in the graph. Each path in the
graph will have to be traversed a constant number of times to compute TE(p, G),
SA(p, G), |F(p)| and other formulas which require O(n) operations.</p>
        <p>Therefore, the algorithm requires O(n² + n·2^b) time. We can see that the
number of forks and decisions has a large impact on the running time of the
algorithm. This could be alleviated by culling paths subsumed by other paths.
We are also looking into alternative formulations which do not require all paths
to be enumerated.</p>
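        <p>The exponential growth in b can be illustrated with a small path-enumeration sketch; the chain-building helper below is a hypothetical construction of b consecutive binary branches.</p>

```python
# Depth-first path enumeration in a DAG given as an adjacency dict, and a
# builder for b consecutive binary branches that re-merge after each choice.
def enumerate_paths(graph, node, path=()):
    """All paths from `node` down to the sink nodes of the DAG."""
    path = path + (node,)
    successors = graph.get(node, [])
    if not successors:
        return [path]
    return [p for s in successors for p in enumerate_paths(graph, s, path)]

def chain_of_choices(b):
    """b binary branches in sequence: each stage doubles the path count."""
    g = {}
    for i in range(b):
        g[f"m{i}"] = [f"a{i}", f"b{i}"]        # branch out
        g[f"a{i}"] = g[f"b{i}"] = [f"m{i+1}"]  # merge into the next stage
    g[f"m{b}"] = []
    return g

print(len(enumerate_paths(chain_of_choices(4), "m0")))  # 2**4 = 16 paths
```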
      </sec>
      <sec id="sec-6-3">
        <title>SODM+T</title>
        <p>SODM+T has been successfully used to model the abstract logic required to
handle a business process from a medium-sized company, generating performance
annotations for graphs with over 60 actions nested in several activities.</p>
        <p>
          However, SODM+T could be improved: future versions will address some of
the issues found during this first experience. WSDL descriptions and test cases
(for tools such as soapUI [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] or JUnitPerf [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]) need to be generated from the
models. Performance annotations should be more fine-grained, letting the user
specify only one of the attributes so the other is automatically inferred.
        </p>
        <p>
          Finally, the tools could be adapted to other metamodels more amenable to
code generation. Business Process Modeling Notation [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] models are a good
candidate, as partial automated translations to executable forms already exist [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Related work</title>
      <p>
        Several UML profiles for modeling the expected performance of a system have
been approved by the Object Management Group: SPTP [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], QFTP [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and
MARTE [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. To study the feasibility of our approach, we have kept our
constraints much simpler, but we intend to adapt them to either SPTP or QFTP
in the future.
      </p>
      <p>
        Existing works on performance estimation in workflows focus on aggregating
workflow QoS from task QoS, unlike our approach, which estimates task QoS
from workflow QoS. Silver et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] annotated each task in a workflow with
probability distributions for their running times, and collected data from 100
simulation runs to estimate the probability distribution of the running time of
the workflow and validate it using a Kolmogorov-Smirnov goodness-of-fit test.
Simulating the workflow allows for flexibly modeling the stochastic behavior of
each service, but it has a high computational overhead due to the number of
simulation runs that are required. Our work operates directly on the graph,
imposing less computational overhead, but only modeling deterministic maximum
values.
      </p>
      <p>
        Other authors convert workflows to an existing formalism and solve it
using an external tool. The most common formalisms are layered queuing
networks [
        <xref ref-type="bibr" rid="ref20">20</xref>
], stochastic Petri nets [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] and process algebra specifications [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
These formalisms are backed by in-depth research and the last two have solid
mathematical foundations in Markov chain theory. However, they introduce an
additional layer of complexity which might discourage some users from
applying these techniques. Our work operates on the models as-is, with no need for
conversions. This makes them easier to understand.
      </p>
      <p>
        Finally, some authors operate directly on the workflow models, as our
approach does. The SWR algorithm proposed by Cardoso et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] computes
workflow QoS by iteratively reducing its graph model to a single task. SWR
only works with deterministic values, like our algorithms, but its QoS model is
more detailed than ours, describing the cost, reliability and minimum, average
and maximum times for each service. We argue that though average and
minimum times might be useful for obtaining detailed estimations, they provide little
value for top-down development, which is only interested in establishing a set
of constraints which can be used for conducting load testing and monitoring for
each service. However, cost and reliability could be interesting in this context as
well.
      </p>
      <p>
        It is interesting to note that Cardoso et al. combine several data sources to
estimate the QoS of each task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]: designer experience, previous times in the
current instance, previous times in all instances of this workflow, and previous
times in all workflows. The designer is responsible for specifying the relative
importance of each data source. Results from our algorithms could be used as
yet another data source.
      </p>
      <p>
        Our metamodels have been integrated into SODM+Testing, a model-driven
SOA methodology which extends the SODM methodology [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] with testing
aspects. Many other SOA methodologies exist: for instance, Erl proposes a
high-level methodology focused on governance aspects in his book [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and IBM has
defined the comprehensive (but proprietary) SOMA methodology [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. SODM
was selected as it is model-driven and strikes a balance between scope and cost.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Conclusions and future work</title>
      <p>Service compositions allow several WS to be integrated as a single WS which
can be flexibly reconfigured as business needs change. However, setting initial
performance requirements for the integrated services in a top-down methodology
so the entire composition works as expected without incurring additional costs
is difficult. At an early stage of development, there is not enough information to
use more comprehensive techniques.</p>
      <p>This work has presented the first version of two algorithms which infer
missing information about response times under certain loads from models inspired
by UML activity graphs. Though a formal proof was outside the scope of this
paper, we have found them to meet their design requirements: they infer local
constraints which meet the global constraints, distribute the load equally over
the composition and allow for as much slack as possible.</p>
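      <p>
        As a minimal sketch of the underlying idea (not the actual algorithms, which
operate on full activity graphs rather than a plain sequence; the names and
figures below are illustrative assumptions), inferring missing local limits from
a global one could look like this:
      </p>
      <preformat>
```python
# Hedged sketch: given a global time limit for a sequence of activities,
# keep the limits that are already annotated and split the remaining
# budget equally per invocation over the unannotated activities.

def infer_time_limits(global_limit, activities):
    """activities: list of (name, invocations, known_limit_or_None).

    Returns a dict of per-invocation time limits (seconds).
    """
    known = sum(runs * limit
                for _, runs, limit in activities if limit is not None)
    unknown_runs = sum(runs
                       for _, runs, limit in activities if limit is None)
    budget = global_limit - known
    if budget < 0:
        raise ValueError("known limits already exceed the global limit")
    per_run = budget / unknown_runs if unknown_runs else 0.0
    return {name: (limit if limit is not None else per_run)
            for name, runs, limit in activities}

# A 10 s global limit: "check" runs twice with a known 1 s limit, so the
# remaining 8 s of slack is shared equally by the other two activities.
limits = infer_time_limits(10.0, [("check", 2, 1.0),
                                  ("reserve", 1, None),
                                  ("notify", 1, None)])
```
      </preformat>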
      <p>
        These algorithms and their underlying metamodel have been successfully
integrated into a model-driven SOA methodology, continuing the previous work
shown in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The tools [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] implement a graphical editor which allows the user to
define performance requirements at several levels and offers automated validation
and transformation.
      </p>
      <p>
        Nevertheless, the algorithms and tools could be improved. The underlying
metamodel should allow for partially automatic annotations and use weights
for distributing the available slack between the activities. In the future, the
annotations could be adapted to the OMG SPTP [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], QFTP [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] or MARTE [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]
UML profiles.
      </p>
      <p>The algorithms will be revised to handle cyclic graphs and to aggregate
the constraints for a service from several models. The performance of the time
limit inference algorithm depends on the number of unique paths from the
initial node to each final node in the graph. We are looking into ways to reduce
the number of paths which need to be checked.</p>
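      <p>
        For instance, a memoised traversal can count those paths without enumerating
them one by one (the graph encoding below is an illustrative assumption, not
our tools' actual representation):
      </p>
      <preformat>
```python
# Hedged sketch: counting the paths from the initial node to the final
# nodes of an acyclic composition graph. Memoisation visits each node
# once, so the count costs O(V + E) even when the number of paths is
# exponential in the size of the graph.
from functools import lru_cache

def count_paths(edges, initial, finals):
    """edges: dict mapping node -> tuple of successor nodes."""
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node in finals:
            return 1
        return sum(paths_from(succ) for succ in edges.get(node, ()))
    return paths_from(initial)

# A small diamond: initial -> a|b -> final yields two unique paths.
edges = {"initial": ("a", "b"), "a": ("final",), "b": ("final",)}
```
      </preformat>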
      <p>
        Finally, it is planned to adapt the tools to service composition models more
amenable to executable code generation, such as BPMN [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and to generate test
cases for existing testing tools such as soapUI [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] or JUnitPerf [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Erl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>SOA: Principles of Service Design</article-title>
          . Prentice Hall, Indiana, USA (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. OASIS:
          <article-title>Web Services Business Process Execution Language (WS-BPEL) 2.0</article-title>
          . http://docs.oasis-open.org/wsbpel/2.0/OS/wsbpel-v2.0-OS.html (
          <year>April 2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Object Management Group:
          <article-title>Business Process Modeling Notation (BPMN) 1.2</article-title>
          . http://www.omg.org/spec/BPMN/1.2/ (
          <year>January 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cardoso</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arnold</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kochut</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Quality of service for workflows and web service processes</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          <volume>1</volume>
          (
          <issue>3</issue>
          ) (
          <year>April 2004</year>
          )
          <fpage>281</fpage>
          -
          <lpage>308</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A probabilistic approach to modeling and estimating the QoS of web-services-based workflows</article-title>
          .
          <source>Information Sciences</source>
          <volume>177</volume>
          (
          <issue>23</issue>
          ) (
          <year>2007</year>
          )
          <fpage>5484</fpage>
          -
          <lpage>5503</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Efficient algorithms for web services selection with end-to-end QoS constraints</article-title>
          .
          <source>ACM Transactions on the Web</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ) (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. Eclipse Foundation:
          <article-title>Eclipse Modeling Framework</article-title>
          . http://eclipse.org/modeling/emf/ (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bender</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pemmasani</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skiena</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sumazin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Finding least common ancestors in directed acyclic graphs</article-title>
          .
          <source>Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'01)</source>
          (
          <year>2001</year>
          )
          <fpage>845</fpage>
          -
          <lpage>853</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>García-Domínguez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medina-Bulo</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marcos-Bárcena</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Hacia la integración de técnicas de pruebas en metodologías dirigidas por modelos para SOA</article-title>
          . In: Actas de las V Jornadas Científico-Técnicas en Servicios Web y SOA, Madrid, Spain (
          <year>October 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>de Castro</surname>
            ,
            <given-names>M.V.</given-names>
          </string-name>
          :
          <article-title>Aproximación MDA para el desarrollo orientado a servicios de sistemas de información web: del modelo de negocio al modelo de composición de servicios web</article-title>
          . PhD thesis, Universidad Rey Juan Carlos (
          <year>March 2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>García-Domínguez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Homepage of the SODM+T project</article-title>
          . https://neptuno.uca.es/redmine/projects/sodmt (
          <year>March 2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Kolovos</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paige</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rose</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polack</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>The Epsilon Book</article-title>
          . http://www.eclipse.org/gmt/epsilon (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. eviware.com: Homepage of soapUI. http://www.soapui.org/ (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          : JUnitPerf. http://clarkware.com/software/JUnitPerf.html (
          <year>October 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Küster</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heßler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Towards transformations from BPMN to heterogeneous systems</article-title>
          . In: Mecella, M., Yang, J., eds.: BPM2008 Workshop Proceedings, Milan, Italy (
          <year>September 2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. Object Management Group:
          <article-title>UML Profile for Schedulability, Performance, and Time (SPTP) 1.1</article-title>
          . http://www.omg.org/spec/SPTP/1.1/ (
          <year>January 2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. Object Management Group:
          <article-title>UML Profile for Modeling Quality of Service and Fault Tolerance Characteristics and Mechanisms (QFTP) 1.1</article-title>
          . http://www.omg.org/spec/QFTP/1.1/ (
          <year>April 2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. Object Management Group:
          <article-title>UML Profile for Modeling and Analysis of Real-Time and Embedded Systems (MARTE) 1.0</article-title>
          . http://www.omg.org/spec/MARTE/1.0/ (
          <year>November 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Silver</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maduko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rabia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Modeling and simulation of quality of service for composite web services</article-title>
          .
          <source>In: Proceedings of the 7th World Multiconference on Systemics, Cybernetics and Informatics</source>
          , International Institute of Informatics and Systems (
          <year>November 2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Petriu</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Applying the UML Performance Profile: Graph Grammar-based Derivation of LQN Models from UML Specifications</article-title>
          .
          <source>In: Proceedings of the 12th International Conference on Computer Performance Evaluation: Modelling Techniques and Tools (TOOLS 2002). Volume 2324 of Lecture Notes in Computer Science</source>
          . Springer Berlin, London, UK (
          <year>2002</year>
          )
          <fpage>159</fpage>
          -
          <lpage>177</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>López-Grao</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Merseguer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>From UML activity diagrams to Stochastic Petri nets: application to software performance engineering</article-title>
          .
          <source>SIGSOFT Softw. Eng. Notes</source>
          <volume>29</volume>
          (
          <issue>1</issue>
          ) (
          <year>2004</year>
          )
          <fpage>25</fpage>
          -
          <lpage>36</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Tribastone</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilmore</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Automatic extraction of PEPA performance models from UML activity diagrams annotated with the MARTE profile</article-title>
          .
          <source>In: Proceedings of the 7th International Workshop on Software and Performance</source>
          , Princeton, NJ, USA, ACM (
          <year>2008</year>
          )
          <fpage>67</fpage>
          -
          <lpage>78</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arsanjani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allam</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SOMA: a method for developing service-oriented solutions</article-title>
          .
          <source>IBM Systems Journal</source>
          <volume>47</volume>
          (
          <issue>3</issue>
          ) (
          <year>2008</year>
          )
          <fpage>377</fpage>
          -
          <lpage>396</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>