
Stochastic Performance Analysis of Distributed Activities

Toqeer Israr

Gregor v. Bochmann

bochmann@eecs.uottawa.ca, Department of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward Ave., Ottawa, Ontario K1N 6N5, Canada

2013

24 31

This paper analyzes stochastic performance of a distributed global activity, composed of sub-activities sequenced serially, probabilistically, or concurrently. We provide general formulas with which we calculate the performance of a composite activity based on the performance of the constituent sub-activities and the control structure. To do this, we model each sub-activity as a Partially Ordered Specification (POS), where each sub-activity is characterized by independent input events, dependent output events and the stochastic minimum delays between these events. This technique allows two or more subactivities to be combined hierarchically. Proofs of correctness for these formulas are given and a simple example is discussed throughout the paper.

Keywords: software performance, stochastic modeling, partial order, collaborations, UML Activity Diagrams, distributed services, web services

Introduction

To satisfy these needs to model activities, Bochmann et al. in [1] and [2] have proposed a new modeling paradigm based on UML Activities and their orderings, whilst we represented this model in [3] as a partially ordered set of inputs and outputs, called a Partially Ordered Specification. While the aforementioned paradigms model the activities and analyze the correctness of the required communication protocols in [1] and [2], we analyzed the performance of such activities in [3]. Using partial orders, we introduced a Partially Ordered Specification (POS) to model the temporal relationships among the sub-activities within a given activity. Inspired by the Program Evaluation and Review Technique (PERT), with POS, we calculated the fixed completion time for each component (actor) of a global activity based on fixed performance characteristics of the sub-activities. In this paper, we extend this work by assuming that delays follow probability distributions, and analyze the performance of a global scenario based on the stochastic performance properties of the constituent sub-activities.

We start off by reviewing the composition rules for stochastic distributions in Section 2. In Section 3, we discuss the modeling paradigm based on activities as well as Partial Order Systems. We also describe the rules of strong and weak sequencing as well as provide performance analysis for fixed delays from [3]. In Section 4, we propose formulas and provide proofs for calculating the performance of composite activities with distributions.

Composition of Distributions

Graph-based models are common modeling paradigms to represent system behaviours as UML Activities. An Activity consists of actions (nodes) and sequencing operators (edges) such as sequence, alternative, concurrency, and loops to define the relationship between these actions, as illustrated in Fig 1. These activities may have constant or randomly distributed time delays.

We analyzed in [3] the performance of a global activity, based on fixed time delays of constituent sub-activities. However, when activity delays are statistically varying and characterized by a time distribution, the analysis of a graph model can be quite challenging. If an event-precedence graph with random activity times is series/parallel/alternative in nature, the distribution function of the completion time of the graph can be calculated by combining the distribution functions of the individual activities using multiplication and convolution [6]. However, the following initial assumptions are made:

- Parallel activities do not need to have identical distributions.
- Infinite resources are available for the activities and hence there are no resource contention issues.

We assume the duration of an activity, Ai, has an associated delay distribution (probability density function) of fi(x). The cumulative distribution function (CDF) for the delay of activity Ai is then defined by:

$$F_i(t) = \int_0^t f_i(x)\,dx$$
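As a minimal sketch of this definition, the following Python fragment approximates Fi(t) from samples of fi(x) on a uniform time grid using the trapezoidal rule. The exponential density, the rate value, the grid step and the function name `cdf_from_pdf` are illustrative assumptions and do not come from the paper.

```python
import math

def cdf_from_pdf(pdf_values, dt):
    """Approximate F(t) = integral from 0 to t of f(x) dx with the trapezoidal rule."""
    cdf = [0.0]
    for left, right in zip(pdf_values, pdf_values[1:]):
        cdf.append(cdf[-1] + 0.5 * (left + right) * dt)
    return cdf

DT = 0.001     # grid step (assumed)
RATE = 2.0     # hypothetical rate of an exponentially distributed delay

times = [k * DT for k in range(10001)]            # t = 0 .. 10
pdf = [RATE * math.exp(-RATE * t) for t in times]
cdf = cdf_from_pdf(pdf, DT)

# cdf[k] approximates the probability that the activity finishes by time k * DT.
print(round(cdf[1000], 4))   # close to 1 - exp(-2) for t = 1
```

The resulting list is non-decreasing and starts at zero, which is the discretized form of the CDF that the composition rules in this section operate on.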

The function Fi(t), associated with each activity Ai, represents the probability that activity Ai finishes by time t. While each sub-activity is assigned a cumulative distribution function (CDF) for the completion time of the activity it models, these sub-activities may be composed to form a composite activity with a resultant CDF for the completion time of the entire set of considered activities.

We consider activities composed with the following three operators: series, alternative and concurrency. We assume all compositions start with a single activity, called the initial activity. Activities are in series when a single activity succeeds the initial activity, as shown in Figure 1a. Alternative activities exist when exactly one of several activities executes after the initial activity, each with a given probability, as shown in Figure 1b. Concurrent activities exist when multiple activities may execute simultaneously after the initial activity, as shown in Figure 1c.

Various compositions of these sub-activities A1...An, each activity Ai with independent CDF Fi(t) can be abstracted by a global activity G with a CDF, FG(t).

[Figure 1: a) series, b) alternative, c) concurrency. Each activity Ai is labeled with its CDF Fi(t); the alternative branches carry the probabilities p1, p2, ..., pn.]

2.1

Series

If any two sub-activities, Aj and Ak, with CDFs Fj(t) and Fk(t), are combined in series as shown in Figure 1a, the CDF of the combined activities is [6]:

$$F_{SUM}(t) = F_j(t) \otimes F_k(t) \qquad (1)$$

where ⊗ represents convolution, defined as:

$$F_j(t) \otimes F_k(t) = \int_0^t F_k(t - x)\, dF_j(x) \qquad (2)$$

2.2

Alternatives

If there is a probability pi for the execution of activity Ai, such as in Figure 1b, the model represents a scenario with multiple possible paths. Then the distribution for the global activity G is [6]:

$$F_G(t) = \sum_i p_i\, F_i(t) \qquad (3)$$

2.3

Concurrency

If the sub-activities in the global activity G execute concurrently, such as in Figure 1c, then the following two cases can be considered.

Suppose there is a parallel search for an item in a distributed database, where the earliest concurrent search to finish will terminate the overall search. We would be interested in the time it would take for the earliest search to finish. A scenario with the earliest or “minimum” CDF of a global activity G composed of parallel sub-activities Ai can be modeled by [6]:

$$F_G^{\min}(t) = 1 - \prod_i \left(1 - F_i(t)\right) \qquad (4)$$

Now suppose there are parallel sub-activities, Ai, composing the global activity G again, but the completion of this global activity G requires all of the sub-activities to complete. The CDF of G is then that of the maximum of the sub-activity delays [6]:

$$F_G^{\max}(t) = \prod_i F_i(t) \qquad (5)$$
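The composition rules (1) to (5) can be sketched numerically. The fragment below is an illustration under stated assumptions, not an implementation from the paper: every CDF is sampled on the same integer time grid, delays only take grid values (so the Stieltjes integral in (2) becomes a finite sum), and the function names and the two example distributions are hypothetical.

```python
import math

def series(cdf_j, cdf_k):
    """Equations (1)-(2): F_SUM[n] = sum over x <= n of F_k[n - x] * dF_j[x]."""
    # dF_j[x]: probability mass that activity j places on grid point x.
    mass_j = [cdf_j[0]] + [b - a for a, b in zip(cdf_j, cdf_j[1:])]
    return [
        sum(cdf_k[n - x] * mass_j[x] for x in range(n + 1))
        for n in range(len(cdf_j))
    ]

def alternative(probabilities, cdfs):
    """Equation (3): F_G[n] = sum over i of p_i * F_i[n]."""
    return [sum(p * f for p, f in zip(probabilities, values))
            for values in zip(*cdfs)]

def earliest(cdfs):
    """Equation (4): 1 - product over i of (1 - F_i[n])."""
    return [1.0 - math.prod(1.0 - f for f in values) for values in zip(*cdfs)]

def latest(cdfs):
    """Equation (5): product over i of F_i[n]."""
    return [math.prod(values) for values in zip(*cdfs)]

# Hypothetical delays on the grid t = 0, 1, ..., 5:
# A1 takes 1 or 2 time units (probability 0.5 each),
# A2 takes 1 time unit (probability 0.25) or 3 time units (probability 0.75).
F1 = [0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
F2 = [0.0, 0.25, 0.25, 1.0, 1.0, 1.0]

in_series = series(F1, F2)                 # A1 followed by A2
either = alternative([0.4, 0.6], [F1, F2]) # A1 with p = 0.4, A2 with p = 0.6
first_done = earliest([F1, F2])            # earliest of A1 and A2
all_done = latest([F1, F2])                # both A1 and A2 completed

print(in_series)
print(all_done)
```

The grid must be long enough to hold the largest possible total delay; otherwise the last entry of a composed CDF stays below 1. The product forms in (4) and (5) treat the sub-activity delays as independent, in line with the independent CDFs assumed earlier in this section.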

Note: For the activity distributions in (4) and (5), the Fi(t) need not be the same for different i.

3

Modeling Distributed Activities

Based on [2], we introduced in [3] a Partially Ordered Specification (POS), which allows modeling of a UML activity with a partially ordered set of input and output events, as shown in Figure 2. This modeling paradigm shows the dependencies between various events and is used in analyzing the performance of activities with various operators.

For a given activity, we suggested in [3] that the input and output of each role be modeled by a starting event and an ending event. These events, belonging to various components, form a partially ordered set, where a causal dependency relationship may exist between some of these events, shown by arcs “→” in the figure. The ending events are not ordered relative to one another, but each ending event has a dependency on its corresponding starting event (local sequencing). An initiating event, belonging to an initiating role (represented by dark circles “●”), is a specialized starting event in this partially ordered set of events, for which there are no other events in that set that precede it. The same holds for the terminating events corresponding to terminating roles.

[Figure 2: Activity represented as a POS. Roles R1 and R2 with input events i1 and i2 and output events o1 and o2; the figure marks the initiating event, the terminating events, and direct and indirect dependencies.]

Figure 2 illustrates an activity with input events i1 and i2 and output events o1 and o2. As i1 and o1 are input and output events of the same role R1, o1 must occur after i1 due to local sequencing. Furthermore, since i1 is the only initiating event, all events in the activity, including i2, must occur after i1. Due to the relationships i1 → i2 and i2 → o2, there is an indirect dependency from i1 to o2, shown by the dashed arrow “-->”.

Terminating events o1 and o2 are not ordered and may occur in parallel.

3.1

General Formulas for Standard Sequencing Operators with Fixed Delays

We introduced the Nominal Execution Time Delay (NETD), written as Δixom, between input ix and output om, where the NETD is the delay between the time instant of the occurrence of input event ix and the occurrence of a dependent output event om, provided all the other events on which om depends have occurred a long time ago [3].

Based on NETD, we derived the following general performance formula, which can then be applied to sequencing operators of sub-activities to yield the performance metrics of a global activity:

$$t_{o_m} = \max_x \left(t_{i_x} + \Delta_{i_x o_m}\right) \qquad (6)$$

where tom is the time of the output event om, tix is the time of the input event ix on which om depends, and Δixom is the NETD from input event ix to the output event om.
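Formula (6) can be illustrated with a short Python sketch. The event names, occurrence times and NETD values below are hypothetical and only serve to show how the maximum is taken over the inputs on which an output depends.

```python
def output_time(input_times, netd_to_output):
    """Formula (6): t_om = max over inputs x of (t_ix + NETD from ix to om)."""
    return max(input_times[x] + netd_to_output[x] for x in netd_to_output)

# Hypothetical occurrence times of the input events.
input_times = {"i1": 0.0, "i2": 4.0}
# Hypothetical NETDs from each input event to the output event o2.
netd_to_o2 = {"i1": 7.0, "i2": 2.0}

t_o2 = output_time(input_times, netd_to_o2)
print(t_o2)   # max(0.0 + 7.0, 4.0 + 2.0) = 7.0
```

In this example the earlier input i1 determines the output time, because its NETD to o2 is longer than the combined arrival time and NETD of i2.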

We assumed shared resources are not involved in these activities. Furthermore, we assumed that there is a dependency, and hence a delay, from each input to each output of an activity.

An activity may be comprised of sub-activities sequenced with strong or weak sequencing, parallel operators, alternatives and interruptions. In this paper, we limit our scope to strong and weak sequencing.

Strong sequencing, sometimes called global sequencing, between two activities A1 and A2 means that all sub-activities of A1 must be completed before any sub-activity of A2 may start. In contrast, weak sequencing between A1 and A2 (only) means that each system component locally applies sequencing to the local sub-activities of A1 and A2, that is, a component may start with sub-activities that belong to A2 as soon as it has completed all its local sub-activities that are part of A1. Strong sequencing implies weak sequencing, but not inversely. In particular, if a component is not involved in A1, it may start with sub-activities of A2 even before A1 begins its execution.
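The difference between the two operators can be sketched as follows. This is an illustrative fragment only: the role names, the completion times, the function name and the `origin` parameter are assumptions, and the sketch captures just the constraint imposed by the sequencing operator, ignoring any dependencies among the roles inside A2.

```python
def earliest_start_of_a2(finish_of_a1, roles_of_a2, strong, origin=0.0):
    """Return, per role of A2, the earliest time that role may start its part of A2.

    finish_of_a1: role -> time at which the role completes its local part of A1
    roles_of_a2:  roles that take part in A2
    origin:       assumed earliest possible time for a role not involved in A1
    """
    if strong:
        # Strong sequencing: every sub-activity of A1 must be completed first.
        barrier = max(finish_of_a1.values())
        return {role: barrier for role in roles_of_a2}
    # Weak sequencing: each role only waits for its own local part of A1.
    return {role: finish_of_a1.get(role, origin) for role in roles_of_a2}

finish_of_a1 = {"R1": 5.0, "R2": 8.0}   # hypothetical local completion times of A1
roles_of_a2 = ["R2", "R3"]              # R3 is not involved in A1

strong_starts = earliest_start_of_a2(finish_of_a1, roles_of_a2, strong=True)
weak_starts = earliest_start_of_a2(finish_of_a1, roles_of_a2, strong=False)

print(strong_starts)   # every role waits until all of A1 has completed
print(weak_starts)     # R3 is not constrained by A1 at all
```

Under weak sequencing the role R3 may start at the assumed time origin, which mirrors the remark that a component not involved in A1 may start A2 even before A1 begins.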

A strong sequence is modeled in Figure 3.0 between two sub-activities, A and B. As discussed, all the initiating events in activity B may occur only if all the terminating events of the previous activity, activity A, have occurred. This is represented by a Final Action event, denoted by AOF (Final Output of sub-activity A), which occurs when all the terminating events have occurred. The time of all the initiating events for the next activity, activity B, is the time of the Final Action event of the previous activity, activity A.

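[Figure: Strong Sequence (A s B) and its POS equivalent, with inputs AI1..AIk and outputs AO1..AOk' of A, the Final Action event AOF, inputs BI1..BIh and outputs BO1..BOh' of B, and the delays ΔAIxAOy, ΔAIxAOF and ΔBIsBOm.]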

We also proposed and proved that, for two strongly sequenced sub-activities, sub-activity A with k inputs and k’ outputs and sub-activity B with h inputs and h’ outputs, with known NETD for each sub-activity (ΔAIxAOy and ΔBIsBOm), the NETD for the composite activity (ΔAIxBOm) is given by the formula:

$$\Delta_{AI_xBO_m} = \max_{y=1..k'} \left(\Delta_{AI_xAO_y}\right) + \max_{s=1..h} \left(\Delta_{BI_sBO_m}\right) \qquad (7)$$

[Figure 3: Weak Sequence (A w B) and its POS equivalent, with roles a..a’ involved only in A, roles b..b’ involved only in B, and roles c..c’ common to both; the CDF labels shown are FAIxAOy, FBIgBOm and FAIxBOm.]

Similarly, for two weakly sequenced activities A and B which have roles c..c’ common to both A and B, it was shown that the NETD for the composition is:

$$\Delta_{AI_xBO_m} = \max_{s=c..c'} \left(\Delta_{AI_xAO_s} + \Delta_{BI_sBO_m}\right) \qquad (8)$$

4

General Formulas for Standard Sequencing Operators with Delay Distributions

So far, we have summarized the modeling and performance analysis of composite activities as described in [3]. The remainder of this paper is new material. In the following, we analyze the performance of activities composed of strong and weak sequences where the NETDs of activities are characterized by time distributions.

4.1

Strong Sequence

Proposition:
If two sub-activities, sub-activity A with k inputs and k’ outputs and sub-activity B with h inputs and h’ outputs, are strongly sequenced and the NETDs for each sub-activity, ΔAIxAOy and ΔBIsBOm, are known and characterized by cumulative distribution functions (CDFs) FAIxAOy(t) and FBIsBOm(t), respectively, then the CDF, FAIxBOm(t), of the NETD for the composite activity is:

$$F_{AI_xBO_m}(t) = \left(\prod_{y=1}^{k'} F_{AI_xAO_y}(t)\right) \otimes \left(\prod_{s=1}^{h} F_{BI_sBO_m}(t)\right) \qquad (9)$$

Proof:
We assume FAIxAOy(t) and FBIsBOm(t) are known, either given or measured using the testing methodology described in [3].

The delay from the Final Action event (AOF) relative to the input event x of A is the maximum of all the paths’ delays from event x to event AOF:

$$\Delta_{AI_xAO_F} = \max_{y=1..k'} \left(\Delta_{AI_xAO_y}\right) \qquad (10)$$

Since the maximum CDF is required of all the delays involved, we use (5) and obtain:

$$F_{AI_xAO_F}(t) = \prod_{y=1}^{k'} F_{AI_xAO_y}(t) \qquad (11)$$

The NETD for activity B for an input s to an output m is ΔBIsBOm. To calculate the maximum delay to produce the output m, the maximum is taken over all the given inputs of activity B:

$$\Delta_{BI_jBO_m} = \max_{s=1..h} \left(\Delta_{BI_sBO_m}\right) \qquad (12)$$

Using (5), this formula leads to:

$$F_{BI_jBO_m}(t) = \prod_{s=1}^{h} F_{BI_sBO_m}(t) \qquad (13)$$

The delays for activity A and B are represented by (11) and (13), respectively. Since activity A and activity B are in sequence, using (1), the CDF FAIxBOm(t) of the total time delay is the convolution of FAIxAOF(t) and FBIjBOm(t):

$$F_{AI_xBO_m}(t) = F_{AI_xAO_F}(t) \otimes F_{BI_jBO_m}(t) \qquad (14)$$

$$F_{AI_xBO_m}(t) = \left(\prod_{y=1}^{k'} F_{AI_xAO_y}(t)\right) \otimes \left(\prod_{s=1}^{h} F_{BI_sBO_m}(t)\right) \qquad (15)$$

4.2

Weak Sequence

Weak sequencing between two activities A and B is illustrated in Figure 3. It is assumed that roles c…c’ participate in both activities A and B, while roles a…a’ participate only in activity A and not in B, and roles b…b’ participate only in activity B and not in A.

To calculate the NETD between any two events, we assume all the remaining input events have occurred a long time ago and any executions precipitated by these input events have also completed a long time ago, as discussed in the testing methodology of [3]. Roles a…a’ are involved only in activity A and no dependency exists from activity B to the output of roles a…a’. Hence, the NETD for roles a…a’ is that of activity A, FAIxAOy. The same is true for activity B.

We are interested in the NETD between the output events of activity B relative to the input events of activity A, given the NETDs are delay distributions.

Proposition:
If two activities, A and B, are weakly sequenced, then the CDF of the NETD between the output events of activity B relative to the input events of activity A is:

$$F_{AI_xBO_m}(t) = \prod_{s=c}^{c'} \left(F_{AI_xAO_s}(t) \otimes F_{BI_sBO_m}(t)\right) \qquad (16)$$

Proof:

The NETDs of activities A and B are ΔAIxAOy and ΔBIgBOm, respectively, which have the CDFs FAIxAOy(t) and FBIgBOm(t), respectively.

From the right side of Figure 3, it is evident that the NETD of the composed activity is the maximum of the NETDs of both activities A and B in series with the common role s, i.e. s being the ending role of activity A and also the starting role of activity B.

Activity A, with input event x, in series with activity B, with output event m, is represented by ΔAIxBOm with the CDF FAIxBOm(t). As the two activities are in series, equation (1) can be used to calculate the CDF of the NETD for a single s:

$$F_{AI_xBO_m}(t) = F_{AI_xAO_s}(t) \otimes F_{BI_sBO_m}(t)$$

And to take the maximum over the inputs s = c…c’, we use (5) and obtain (16).
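Both results of this section can be sketched numerically: the strong-sequence CDF of Section 4.1 (a product of CDFs for A convolved with a product of CDFs for B) and the weak-sequence proposition (16). The fragment below is a sketch under stated assumptions rather than the authors' tool: CDFs are sampled on a common integer time grid, delays only take grid values, the per-role delays are treated as independent, and all names and example values are hypothetical.

```python
import math

def series(cdf_a, cdf_b):
    """Equation (1) on a grid: F[n] = sum over x <= n of F_b[n - x] * dF_a[x]."""
    mass_a = [cdf_a[0]] + [q - p for p, q in zip(cdf_a, cdf_a[1:])]
    return [
        sum(cdf_b[n - x] * mass_a[x] for x in range(n + 1))
        for n in range(len(cdf_a))
    ]

def latest(cdfs):
    """Equation (5): pointwise product of the given CDFs."""
    return [math.prod(values) for values in zip(*cdfs)]

def strong_sequence(cdfs_a_outputs, cdfs_b_inputs):
    """Strong sequence: (product over outputs of A) convolved with (product over inputs of B)."""
    return series(latest(cdfs_a_outputs), latest(cdfs_b_inputs))

def weak_sequence(cdfs_a_by_role, cdfs_b_by_role):
    """Equation (16): product over common roles s of (F_AIxAOs convolved with F_BIsBOm)."""
    return latest([series(cdfs_a_by_role[s], cdfs_b_by_role[s])
                   for s in cdfs_a_by_role])

# Hypothetical CDFs on the grid t = 0, 1, ..., 6 for two common roles c1 and c2.
F_A = {"c1": [0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0],   # A reaches c1 after 1 or 2 units
       "c2": [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]}   # A reaches c2 after 1 unit
F_B = {"c1": [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],   # B needs 1 unit from c1 to output m
       "c2": [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]}   # B needs 2 units from c2 to output m

strong = strong_sequence(list(F_A.values()), list(F_B.values()))
weak = weak_sequence(F_A, F_B)

print(strong)
print(weak)
```

With these example values the weakly sequenced composition always completes by t = 3, whereas the strongly sequenced one does so only with probability 0.5, since under strong sequencing B cannot start before every output of A has occurred.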

In this paper, we have considered the delay distributions of composite activities sequenced with strong and weak sequencing. Similar to the analysis done in [4] for fixed delays, we plan to consider other composition operators, such as parallel, alternatives, loops, etc., for delay distributions. We have implemented a tool that takes as input an Activity Diagram including sub-activities with defined performance characteristics and provides as output the NETDs of the global collaboration for fixed delays. We plan on extending this tool to support delay distributions. We believe that this approach to performance modeling of distributed systems is useful in many fields of application, including distributed workflow management systems, service composition for communication services, e-commerce applications, and Web Services.

1. Bochmann, G.v., Deriving component designs from global requirements, in: Proceedings of the International Workshop on Model Based Architecting and Construction of Embedded Systems (ACES), Toulouse, 2008, pp. 55-69.

2. Castejón, H.N., Braek, R., and Bochmann, G.v., Realizability of Activity-based Service Specifications, Journal of Software and Systems Modeling (to be published), 2011.

3. Israr, Bochmann, Performance Modeling of Distributed Activity Services, Proceedings of the 2nd ACM/SPEC International Conference on Performance Engineering, Karlsruhe, Germany, 2011, pp. 475-480.

4. Israr, Bochmann, Performance Modeling of Distributed Collaboration Services with Independent Inputs/Outputs, Proceedings of the 5th International Workshop on Non-functional Properties in Modeling: Analysis, Languages, Processes, Miami, USA, 2013.

5. OMG, Unified Modeling Language (UML), Version 2.1.1, February 2007.

6. Sahner, R.A. and Trivedi, K.S., Performance and reliability analysis using directed acyclic graphs, IEEE Trans. Software Eng., (1987), pp. 1105-1114.