Performance Modeling of Distributed Collaboration Services with Independent Inputs/Outputs Toqeer Israr, Gregor v Bochmann Department of Electrical Engineering and Computer Science University of Ottawa 800 King Edward Ave. Ottawa Ontario K1N 6N5 Canada {tisra051, bochmann}@eecs.uottawa.ca Abstract. This paper deals with modeling and performance analysis of dis- tributed applications, service compositions and workflow systems. From the functional perspective, the distributed application is modeled as an activity in- volving several roles, where behavior is defined in terms of compositions from several sub-activities using the standard sequencing operators found in UML Activity Diagrams. Each activity is characterized by a certain number of input and output events, and the performance of the activity is defined by the mini- mum delays that apply for a given output event in respect to each input event. We use a partial order to model these events, whose delays can be measured through testing. We also provide general formulas to calculate the performance of a composite activity from the performance of its constituent sub-activities and the control structure specifying the order of execution. Proofs of correct- ness for these formulas, along with a simple example are also given. Keywords: software performance, modeling, partial order, collaborations, UML Activity Diagrams, distributed applications, web services 1 Introduction Many commercial systems rely on multiple communicating components for parts of their business processes. These are often structured as distributed systems, with components running on different processors or in different processes. For example, a multi-tiered system might start with requests from Web clients that invoke a process in a Web Application Server, which in turn makes calls to some “third party” servers – components involved in this system would be the client, Web-Application Server and the “third party” servers. When developing such distributed reactive systems, there is a need to analyze per- formance of both from a global system perspective and a local or component-wise perspective. The global perspective specifies and analyzes the collaborative behavior of a distributed system in an abstract manner, while the local perspective identifies different system components along with their behaviour such that their interactions give rise to a behaviour satisfying the global perspective. Numerous methodologies have been employed for analyzing such systems, such as Queuing Models, Stochastic Proceedings of NiM-ALP 2013 16 Petri Nets (SPN), Performance Evaluation and Review Technique (PERT) [4], and UML Profile: Modeling and Analysis of Real Time Embedded Systems (MARTE). Most of these notations, though, assume the basic activities in the decomposition to be allocated to a single system component. However, most of the applications have ac- tivities which are modeling collaborations between several system components, for instance an interaction between a client and a server. Such distributed reactive systems are typically accompanied by performance specifications which are generally formalized by legal Service Level Agreements carrying financial penalties for non-compliance. Hence, it is becoming a commercial imperative to ensure that the participants in a workflow meet certain time criteria under all possible scenarios. McNeile in [6] modeled and analyzed end-to-end work- flow delays but their analysis assumed that all the components are available at the beginning of the workflow, yielding a single workflow delay. This is not realistic as components become available at different time. This leads to relationships and delays between outputs and inputs of various components. To this end, we introduced Partially Ordered Specification (POS) in [5], a model- ing paradigm composed of UML Activity diagrams [7] and a partially ordered set of inputs and outputs. In this paper, we revise and extend the existing work done in [5] by considering performance characteristics of composite activities with independent inputs and outputs. We analyze and derive formulas for composite activities, com- posed of sub-activities with concurrency and alternatives. We start off by reviewing modeling with partial orders. In Section 3, we review some of our previous work from [5] to describe POS and performance characteristics of basic operators. In Section 4, we propose formulas and provide proofs for calcu- lating the performance of composite collaborations, composed with alternate and concurrency operators for independent input and output events. 2 Modeling Distributed Collaboration Services 2.1 Modeling Events with Partial Orders In [5], we adapted the modeling methodology from [4] to model inputs and out- puts of UML Activities (Collaborations) [7] as partially ordered events. We extend the previous work done in [5] by modeling events in sequence, alternatives and in a concurrent manner. Modeling Events in Sequence with Partial Orders: A set of events may have a dependency on other events, and hence cannot occur until all the required events have occurred. Such is shown in Figure 1.0a, where event e4, is shown to be dependent on e1 and e3, while e2 is only dependent on e1. Modeling Alternatives with Partial Orders: As partial ordering does not allow modeling of alternative paths, inspired by the choice symbol in UCM [2], we intro- duce a new symbol in partial ordering to represent a choice in a behavior. As can be seen in Fig 1b, a UML “decisionNode” is modeled as a rectangular box with one in- coming edge and multiple outgoing edges. The UML “mergeNode” is modeled as a similar rectangular box with multiple incoming edges and one outgoing edge. Modeling Concurrency with Partial Orders: As illustrated in Fig 1c, there is a Proceedings of NiM-ALP 2013 17 single event e1 which leads to two events e2 and e3. Both events e2 and e3 must occur after e1, but they are incomparable between one another. e1 e1 e3 e1 alternative e2 e3 e2 e4 e2 e3 merge e4 a) Sequential b) Alternative c) Concurrency Fig. 1. Modeling the Ordering of Events Using Various Operators 2.2 Describing Collaborations with Partially Ordered Specifications Partially Ordered Specifications (POS) rely on UML Activity Diagrams (AD) to capture the dynamic behaviour of a system. This is comprised of actions and se- quencing operators such as sequence, alternative, concurrency, and loops to define the relationship between these actions. R1 R2 input For a given activity, we consider input events i1 i2 indirect and output events. Input (/output) events, dependency direct shown as unfilled circles in Fig 2, are dependency output events which mark the beginning (/ending) o1 o2 events of the execution of actions by a specific Fig. 2. Partial Order Events role in a given activity. Note that a given activity has an input and an output event for each involved role. These events form a partially ordered set, where a causal relationship may exist between some of these events, shown by arcs “”. The output events are not ordered relative to one another directly but each output event has a dependency on some cor- responding input events. Figure 2 illustrates an activity, with input events i1 and i2 and output events o1 and o2. As i1 and o1 are input and output events of the same role R1, o1 must occur after i1 due to local sequencing. Due to the relationship i1  i2 and i2  o2, there is an indi- rect dependency from i1 to o2, shown by the dashed arrow “-->.” Output events o1 and o2 are incomparable and may occur in any order. 2.3 Delay for a Given Activity We devised an approach to determine the dependencies amongst the input and out- put events within a given sub-activity in [5]. For a given sub-activity, according to this approach, one measures the delay between the time instance of the occurrence of input event i and a dependent output event o, provided all the other events on which o depends have occurred long time before. This delay is called Nominal Execution Time Delay (NETD), written as Δio. This leads to the following formula which yields the performance of a collabora- tion D based on dependencies between input and dependent output events: i DTo = maxiεI(D) (DTi + DΔ o) (1) Proceedings of NiM-ALP 2013 18 where To is the time of the output event o, Ti is the time of the input event i, Δio is the NETD from input event i to the dependent output event o and I(D) is the set of input events. Subscript “D” indicates all the notations are for the abstract activity D. 3 Performance Characteristics of Composite Activities So far this paper provides a somewhat summarized version of modeling and per- formance analysis of composite activities given in [5]. The remainder of this paper is new material analyzing performance of activities composed of alternative and concur- rent sub-activities with independent input and output events. 3.1 Independent Events Formula (1) was derived to calculate the time of the output events based on the time of the input event and the NETD that exists between them, provided the output events depended on input events. We can relax this condition by revising the defini- tion of (1) to include all events, by stating that the NETD between an input event i i and a non-dependent output event o’ is: AΔ o’ = –∞ (2) Then the formula in (1) is not limited to dependent events, but rather is valid for all involved events – dependent and independent events alike. Note, for an input and output event of the same role, the output event cannot occur until its input event oc- curs (also known as local sequencing). Hence for a given output event, at least one input event (input event of the same role) will always exist which will make (1) yield a positive value. 3.2 Consideration of Control Flow Paths We define control flow paths (cp) to depict a single execution of a given system. As an activity may have alternatives, interruptions and loops, this causes multiple control flow paths to exist. It is clear that different control flow paths, in general, lead to different execution time delays. For instance, the different branches of an alterna- tive may result in different delays. In general, the number of different control flow paths is unlimited. For instance, the number of times a while loop executes may be unbounded, and/or the body of a loop or alternative may include other loops or alter- natives, resulting in a recursive structure. Therefore the Nominal Execution Time Delay (NETD) defined above depends on a particular control flow path that was followed during the execution of the collabora- tion. Throughout this paper, we only consider a single particular path, say cp. Then we write (cp)AΔio, for the Nominal Execution Time Delay between output event o and input event i for the control flow path cp of collaboration A. We recognize the delays can be of a stochastic nature. However, for our work, we consider only fixed delays while considering a single control flow path. Shared Resources: We have assumed that the NETD will actually be attained during a control flow path, which may not be realistic if shared resources are involved in the processing of several inputs on which a single output depends. Hence we assume in Proceedings of NiM-ALP 2013 19 the following that there are no shared resources and each role or all concurrent activi- ties of a given role are implemented by an independent processor. 3.3 Basic Performance Characteristics Events can be combined using various operators such as sequences, alternatives, concurrency, loops etc. We can define the basic performance characteristics involv- ing partially ordered events. Sequence: Single Event (SE) Dependent on a Single Event (SE). Figure 3a shows three events in a sequence where the dependency e1 e2  e3 exists. If the delays between these events are considered to be fixed delays, then it is quite clear that the delay from e1 to e3 (Δe1e3) is the sum of the delays from e1 to e2 (Δe1e2) and e2 to e3 (Δe2e3): Δe1e3 = Δe1e2 + Δe2e3 (3) Concurrency: Multiple Events (ME) Dependent on a Single Event (SE). If there are multiple events dependent on a single event such as shown in Figure 3b, we exam- ine two sets of delays – earliest event y and latest event z amongst e1...en. The earliest (/latest) event amongst e1...en, is an event for which there are no other events which precedes (/succeeds) this event amongst e1...en. The delay to these events from event e0 can be calculated by considering the event y (/z) amongst all the events, which has the minimum (/maximum) delay from the input i: Earliest event: Δiy = minoε{e1,e2..en} (Δio) (4) Latest event: Δiz= maxoε{e1,e2..en} (Δio) (5) e0 e0 e0 e1 e2 ... en e1 p1 pn p2 e1 e2 ... en e2 e0 e1 e2 ... en a. SE  SE b) SE  ME c) ME  SE d) SE AE Fig. 3. Various Dependencies Merge: Single Event (SE) Dependent on Simultaneous Multiple Events (ME). Figure 3c shows a single event eo dependent on multiple events e1...en. Event e0 can only occur when all of its dependencies are satisfied i.e. events e1...en, all, have oc- curred. If we assume that all the events e1...en occur at the same time, then we can calculate the delay for e0 to occur by taking the maximum of all the individual NETDs: Δio = maxjε{e1,e2..en} (Δjo) (6) Alternative Events (AE) Dependent on a Single Event (SE). Figure 3d shows an alternative sequence, where the event e1...en occur with probability p1...pn, respec- tively, each having a different control flow path. However, if we consider the control flow path where a specific event em occurs, where emε{e1,..,en}, then the time delay between event i and event em, is the measured/known NETD: Δim (7) Proceedings of NiM-ALP 2013 20 4 Deriving General Formulas for Sequencing Operators Notation: The set of roles involved in activity X is denoted by R(X). The set of input and output events in activity X are denoted by I(X) and O(X) respectively. We use activity D to abstract the sequence of sub-activities A and B. Hence, the set of roles involved in activity D consists of all the roles involved in sub-activities A and B, that is, R(D) = R(A) ∪ R(B). The set of roles common to both collaboration A and B is denoted by RC(D), where RC(D) = R(A) ∩ R(B). The non-common roles of A are the set of roles involved only in sub-activity A and not in sub-activity B, denoted by RNC(A) = R(A) – RC(A). A similar is used for output roles. In [5], we proposed (and provided proofs for) definitions of Δio, for strong and weak sequencing operators for input and dependent output events. In the following, we extend this work by analyzing performance of sub-activity A and B using concur- rency and alternative operators with independent input and output events. These compositions are abstracted by activity D. 4.1 Concurrency Figure 4 shows concurrent execution of sub-activities A and B and the partial order equivalent. For the non-common roles, the roles become available for their next ac- tivity, as soon as execution is completed in the respective sub-activities. For the common roles, execution in both of the sub-activities must be completed before the role is available for its next activity, as shown in Figure 4b. NETD: The NETD for a composite activity D consisting of concurrent execution of sub-activities A and B, (cp)DΔwz, is given by the following expressions: Concurrency Table 1. Fixed Delays for Concurrency Case Condition Fixed Delays (1) (w ε IC(D) and z ε OC(D)) max((cp)AΔwz, (cp)BΔwz) (8) A B (2) (w ε INC(A) and z ε O(A)) or (cp)AΔwz (9) NC (w ε I(A) and z ε O (A)) (3) (w ε INC(B) and z ε O(B)) or (cp)BΔwz (10) a (w ε I(B) and z ε ONC(B)) Fig 4a. Concurrency (4) (w ε INC(A) and z ε ONC(B)) or –∞ (11) Control Flow (w ε INC(B) and z ε ONC(A)) Proof: We consider the proofs for all cases separately: Case (1): As mentioned and illustrated in Figure 4b, the common role must complete its execution in both sub-activities before any subsequent executions can be per- formed by that role. Since all the processes of a common role become available at the same time for both sub-activities, the time of the input is the same in both sub- activities and hence from (6) we know that the NETD is the maximum of two delays. Case (2): If the input is a non-common role of A, then the output is any role of A. As can be seen in Figure 4b, except for the delay from the input event to the output event of the common roles, the NETD is due only to the delay from the dependency input Proceedings of NiM-ALP 2013 21 event w to output event z. There is no POS Equivalent rA r’A rC r’C rB r’B other dependency which needs to be satis- fied for event z to occur. Hence, the NETD D for this case is NETD from w to z, AΔwz. iA iA’ Similarly for a non-common output. Case (3): Proof for this scenario is similar w AΔ x(t) A to the proof of Case (2). oA oA’ Case (4): As illustrated in Figure 4b, there iB iB’ w DΔ z (t) is no dependency from an input of a non- B y BΔ z (t) common role of activity A to an output oB oB’ event of a non-common role of activity B or vice-versa. As defined by (2), the NETD for events with no dependency is –∞. Fig 4b – POS of Concurrency 4.2 Alternatives Figure 5 shows an alternative operator and its POS equivalent between two sub- activities, A and B. We assume the choice, made by role c*, is done instantaneously and therefore does not directly add any delays. However, no action of the sub- activities in the body of an alternative may start to execute until this choice is made. Hence, this causes a dependency between c* and all the other roles involved in the sub-activities of the alternative body, shown by the introduction of sub-activity Choice in Figure 5b. Note: this does not have any impact on NETD of sub-activities. As discussed in section 3.3, there is a control flow path for each branch of an al- ternative, leading to different execution time delays. NETD: The NETD for a composite activity D consisting of an alternate execution of sub-activities A and B, (cp)DΔwz, is given by the following table: Table 2. Fixed Delays for Alternatives Case Condition Fixed Delays (5) (w ε IC(D) and z ε OC(D)) (ep) w AΔ z if A is executed (ep) w BΔ z otherwise A B NC w (6) (w ε I (A) and z ε O(A)) or (ep) AΔ z if A is executed (w ε I(A) and z ε ONC(A)) 0 if B is executed and w = z –∞ if B is executed and w ≠ z (7) (w ε INC(B) and z ε O(B)) or 0 if A is executed and w = z Fig5a. (w ε I(B) and z ε ONC(B)) –∞ if A is executed and w ≠ z Alternative w BΔ z otherwise (ep) Control Flow (8) (w ε INC(A) and z ε ONC(B) or –∞ if A or B is executed NC NC (w ε I (B) and z ε O (A)) Case 5: Either sub-activity A will execute or B will. From (7) we know that depend- ing on the sub-activity being executed, the NETD of the composite activity D will be that of the sub-activity being executed. Case 6: If activity A executes then the delay is that of the execution in activity A. But if activity B executes, then when w = z, there is a dependency between the events due to local sequencing but there is no delay as there is no execution performed be- Proceedings of NiM-ALP 2013 22 tween these events. Hence the delay is 0. When w ≠ z, then there is no local se- quencing dependency between the events and no delay. Hence, the delay is –∞. Case 7: This is similar to Case 6. rA r’A rC rC* r’C rB r’B Case 8: This is similar to Case 4. Choice 5 Conclusion We analyzed performance of global collaborations composed from sub- iA iA’ w collaborations with sequential, alter- AΔwx A DΔ z native and/or concurrent ordering, oA oA’ based on the delays of the constituent iB iB’ sub-activities. We believe that this B y BΔ z approach to performance modeling of oB oB’ distributed systems is useful in many fields of application, including dis- tributed work flow management sys- tems, service composition for com- oC* munication services, e-commerce Figure 5b –POS of an Alternative applications, and Web Services. We have implemented a tool that takes as input an Activity Diagram including sub-activities with defined performance characteristics and provides as output the NETDs of the global collaboration. In this paper, we have considered fixed delays. We plan on extending our work to consider different types of delays (stochastic and range of delays). We also plan to extend the here described work to include additional sequencing operators, such as strong and weak while loops [1] and interruptions. References 1. Bochmann, G.V. Deriving component designs from global requirements, in: Proceedings on International Workshop on Model Based Architecting and Construction of Embedded Systems (ACES), Toulouse, 2008, pp 55-69 2. Buhr, R.J.A., Casselman, R.S., Use CASE Maps for Object-Oriented Systems. Prentice Hall, 1996. 3. Castejón, H.N, Bræk, R., and Bochmann, G. v., On the realizability of collaborative ser- vices. Journal of Software and Systems Modeling, Vol. 10 (12 October 2011), pp. 1-21 4. Chinneck, J., Practical Optimization: A Gentle Introduction, online textbook, see http://www.sce.carleton.ca/faculty/chinneck/po.html. 5. Israr, Bochmann, Performance Modeling of Distributed Collaboration Services, Proceed- ings of the 2nd ACM/SPEC International Conference on Performance Engineering, Karlsruhe, Germany, 2011, pp 475-480 6. McNeile, A., “Using Motivation and Choreography to model Distributed Workflow”, Pro- ceedings of the 5th ACM SIGCHI Annual International Workshop on Behaviour Model- ling - Foundations and Applications, Montpellier, France, 2013 7. Object Management Group. Unified Modeling Language Superstructure, V2.1.2. OMG Available Specification, November 2007. OMG document number: ptc/2007-11-02. Proceedings of NiM-ALP 2013 23