Towards Hyperscale Process Management

Kevin Andrews¹, Sebastian Steinau¹ and Manfred Reichert¹
¹ Institute of Databases and Information Systems, Ulm University, firstname.lastname@uni-ulm.de

Abstract: Scalability of software systems has been a research topic for many years and is as relevant as ever with the dramatic increases in the digitization of business operations and data. This relevance also applies to process management systems, most of which are currently incapable of scaling horizontally, i.e., across multiple servers. This paper discusses an approach towards hyperscale workflows, using a data-centric process engine to encapsulate data and process logic into objects, which can then be stored and concurrently manipulated independently of each other. As this allows for more concurrent operations, even within a single data-intensive process instance, we want to show that an implementation of a hyperscale process engine is a feasible endeavor.

Keywords: Scalability, Process Management Technology, Object-centric Processes

1 Introduction

For decades, researchers have been examining parallelism and scalability in computer hardware and software. The topic of scalability also became instantly relevant to workflow management systems (WfMS) when they first appeared on the market, as they were built explicitly with large-scale applications in mind. First attempts to create scalable workflow management systems applied existing scalable architecture principles to WfMS. The resulting approaches, such as WIDE [CGS97], OSIRIS [Sc04], and GridFlow [Ca03], strongly focused on the system architecture point of view, largely ignoring other aspects, such as role assignments, permissions, and data flow. However, the process models these approaches, especially GridFlow, are meant to support are typically high-performance computing workflows, in which these aspects play only a secondary role.

Due to these limitations, as well as further advances in processing power over the years, most modern WfMS and process engines for human-centric "business" processes do not offer horizontal scalability. This is likely acceptable with modern computing power in most company intranet scenarios. According to Amdahl's law of scalability [Am67], which assumes a fixed workload and relates the number of processors to the achievable reduction in turnaround time, this would mean that there are no more, or only negligible, scalability problems in WfMS. However, Gustafson's law [Gu88], a reevaluation of Amdahl's law, states that the workload scales upwards with the increases in resources and performance, meaning that the time needed to execute a workload stays virtually the same despite more computational power. This effect can be seen in the trend towards cloud-based software, including cloud-based process engines. The workload placed on a cloud-based process engine today is higher than the one placed on an on-premise WfMS ten years ago, just as the hardware capabilities of a server hosting such a cloud-based process engine are higher.
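For reference, the two laws can be stated side by side; these are the standard textbook formulations rather than quotations from [Am67] and [Gu88], with p denoting the parallelizable fraction of the work and n the number of processors (Gustafson's form reappears as Eq. (1) in Section 3):

```latex
% Standard formulations (notation assumed, not quoted from [Am67]/[Gu88]):
% Amdahl's law: speedup for a fixed workload, bounded by 1/(1-p)
S_{\mathrm{Amdahl}}(n) = \frac{1}{(1 - p) + p/n}
% Gustafson's law: scaled speedup when the workload grows with the resources
S_{\mathrm{Gustafson}}(n) = (1 - p) + n \cdot p
```

The key difference is that Amdahl's speedup remains bounded by 1/(1 − p) no matter how large n grows, whereas Gustafson's scaled speedup keeps growing with n because the workload is assumed to grow along with the available resources.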
In summary, this means that improving process engine scalability can offer benefits in terms of performance and cost reduction in data centers. These advancements can help pave the way for better market penetration of cloud-based process engines and thereby also of process management technology as a whole, even in smaller companies. To facilitate this, we present the idea of replacing the typical activity-centric conceptual basis with a data-centric one. We intend to show that, besides their many other benefits [KWR11], data-centric approaches are better suited for highly concurrent work with processes that are data-heavy and involve large amounts of user interaction.

Section 2 explains some fundamentals of the data-centric process management approach that we intend to use as the basis for our research. In Section 3, we compare the scalability possibilities of activity-centric process models to our vision of parallelization with data-centric models. Finally, Section 4 gives an outlook based on the findings presented in this paper.

2 Fundamentals

This section gives a short overview of PHILharmonicFlows [KR11, KWR11], an object-aware approach to business process management. The PHILharmonicFlows approach was selected as a basis for our research on scalability as its core idea is to group data and associated processes into objects. These objects are largely independent units and can be interacted with individually. One such object exists for each business object present in a real-world business process. As can be seen in Fig. 1, a PHILharmonicFlows object consists of data, in the form of attributes, and a process model describing the object lifecycle.

Fig. 1: Example PHILharmonicFlows Object (attributes From: Date, Until: Date, Approved: Bool, Comment: String; lifecycle states Initialized, Decision Pending, Approved, Rejected)

The attributes encapsulated in the Vacation Request object are From, Until, Approved, and Comment. The lifecycle process describes the different states (Initialized, Decision Pending, Approved, and Rejected) a Vacation Request may have during process execution. As PHILharmonicFlows is data-driven, the lifecycle process for the Vacation Request can be understood as follows: The initial state of a Vacation Request object is Initialized. Once an Employee has entered data for the From and Until attributes, the state changes to Decision Pending, which allows a Manager to input data for Comment and Approved. Based on the value of Approved, the state of the Vacation Request changes to Approved or Rejected.

Based on the current state of an object, it can be coordinated with other objects corresponding to the same business process through a set of constraints, defined in a separate coordination process, the details of which are omitted for brevity. A simple example could be a constraint stating that a Vacation Request may only change its state to Approved if there are fewer than 4 other Vacation Requests already in the Approved state. A complete process consists of many different objects, each describing data and the different states the object may enter. These excerpts from PHILharmonicFlows will be used in the remainder of the paper to show that a data-centric process engine can be used for highly scalable processes.
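To make the data-driven lifecycle more concrete, the following sketch mimics the Vacation Request object from Fig. 1 in plain Python. It is not the PHILharmonicFlows implementation; the class, the attribute names, and the may_approve coordination check are illustrative assumptions derived from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Minimal sketch of a data-driven object lifecycle in the spirit of Fig. 1.
# This is NOT the actual PHILharmonicFlows engine; names and transition logic
# are illustrative assumptions based on the description in Section 2.

@dataclass
class VacationRequest:
    from_date: Optional[date] = None     # attribute "From"
    until: Optional[date] = None         # attribute "Until"
    approved: Optional[bool] = None      # attribute "Approved"
    comment: Optional[str] = None        # attribute "Comment"
    state: str = "Initialized"

    def evaluate_lifecycle(self) -> str:
        """Advance the state purely based on the attribute values currently set."""
        if self.state == "Initialized" and self.from_date and self.until:
            self.state = "Decision Pending"          # Employee filled From/Until
        if self.state == "Decision Pending" and self.approved is not None:
            self.state = "Approved" if self.approved else "Rejected"
        return self.state

def may_approve(candidate: VacationRequest,
                all_requests: List[VacationRequest]) -> bool:
    """Illustrative coordination constraint: approve only if fewer than 4 other
    Vacation Requests are already in the Approved state."""
    already_approved = sum(1 for r in all_requests
                           if r is not candidate and r.state == "Approved")
    return already_approved < 4

if __name__ == "__main__":
    req = VacationRequest(from_date=date(2024, 7, 1), until=date(2024, 7, 14))
    print(req.evaluate_lifecycle())      # -> Decision Pending
    req.approved, req.comment = True, "Enjoy your holiday"
    print(req.evaluate_lifecycle())      # -> Approved
    print(may_approve(req, [req]))       # -> True (no other approved requests)
```

The property that matters for the remainder of the paper is that each such object carries its own data and lifecycle, so many of them can be evaluated concurrently, while only the coordination constraints require a view across objects.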
3 Methodology

Our research idea stems from the examination of past attempts at creating highly scalable WfMS. As mentioned in Section 1, most of these approaches [CGS97, Ca03, Sc04] focus on industrial or scientific workflows and not on processes with a large amount of user interaction. The approach presented in [BRD03] proposes partitioning a process model based on role assignments and executing these partitions on different servers so that the network and processing load of a single process instance can be distributed. However, this approach has significant drawbacks with respect to flexibility, as the partitioning is done at build-time and is limited by the structure of the model. Consider the abstract activity-centric process model shown in Fig. 2.

Fig. 2: Partitioned Activity-centric Process Model

If the process model is partitioned in this way, i.e., according to the role assignments of the different activities, Partitions 2 and 3 can be executed in parallel, increasing scalability. However, a maximum of two tasks can be executed at the same time, limited by the number of branches in the process model. Furthermore, the depicted data flow introduces copy operations of data between nodes on control-flow splits and joins, as well as other problems, such as lost updates. In summary, the process modeler has additional non-trivial concerns when creating the process model.

We propose to alleviate these issues by relying on a data-centric approach for executing the workflows. To illustrate the idea, we use the PHILharmonicFlows concept (cf. Section 2). As shown in Fig. 1, PHILharmonicFlows allows grouping data and lifecycle information into objects. Expanding the previous example, this means that it is possible to have n Vacation Requests, as well as other objects, at run-time, each having its own attribute values and lifecycle process. In consequence, as these lifecycle processes need only be coordinated at certain state changes, they can be executed in parallel.

Fig. 3 shows an illustration of how n clients can interact with the m objects existing at run-time. The objects are largely independent of one another, except for when their states change. Furthermore, the m objects and, therefore, the data and computational resources needed to serve the n clients, can be partitioned over multiple systems.

Fig. 3: Example PHILharmonicFlows Process Instance with Clients

We compare these two basic ideas, involving either an activity- or a data-centric theoretical basis, by applying Gustafson's law to them. Gustafson's law (1) states that the theoretical speedup S(n) for n processors can be calculated using the fraction of the problem that can be executed in parallel, p, as well as the fraction that must be executed in serial, (1 − p).

S(n) = (1 − p) + n × p    (1)

While calculating factor p for a given activity-centric process model is theoretically possible by analyzing the model and determining the fraction of parallelizable tasks, n will always be limited to the average number of branches that can be executed in parallel. The exact values for p and n are not decisive, which may be illustrated for a trivial case by examining the process model from Fig. 2. The value for p is around 3/5, depending on the applied metrics, while the maximum value for n is between 1 and 2, also depending on the exact metrics. So, assuming we have 1 processor executing the process model, we obtain our sequential base speed S(1) = (1 − 3/5) + 1 × 3/5 = 1, i.e., 100%. For an approximate maximum n of 3/2, the speedup factor is S(3/2) = (1 − 3/5) + 3/2 × 3/5 = 1.3, i.e., 130% of the base speed. As n, for the process model depicted in Fig. 2, can never reach the value 2 or above, and the speedup factor for n = 2 would be 160%, it is obvious that activity-centric process models are limited in their scalability, which is, in turn, determined by the structure of the process model itself.
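The following short script reproduces these back-of-the-envelope numbers, as well as the data-centric ones discussed in the next paragraph; the p and n values are the illustrative figures from the text, not measurements.

```python
# Back-of-the-envelope check of the Gustafson speedups discussed in this section.
# The p and n values are the illustrative figures from the text, not measurements.

def gustafson_speedup(n: float, p: float) -> float:
    """S(n) = (1 - p) + n * p, cf. Eq. (1)."""
    return (1 - p) + n * p

# Activity-centric model from Fig. 2: p ~ 3/5, n limited to roughly 3/2.
print(gustafson_speedup(1.5, 0.6))    # 1.3  -> 130 % of the base speed
print(gustafson_speedup(2.0, 0.6))    # 1.6  -> 160 %, an unreachable upper bound

# Data-centric process instance with 100 objects:
print(gustafson_speedup(100, 0.1))    # 10.9 -> 1090 %
print(gustafson_speedup(100, 0.8))    # 80.2 -> 8020 %
```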
These calculations, however, are very different for approaches like PHILharmonicFlows, where n is not limited by the structure of the individual objects. As the entire business process consists of an unlimited number of objects, the limit for n is the number of object instances present in a process instance. Clearly, if only 2 objects are in use at run-time, the maximum value for n is 2; however, the maximum value for n scales upward with each additional object present in a process instance. There is, however, a limitation to the p factor, as the coordination of the objects' interactions with one another is handled by one or more coordination processes, which are notified when an object changes its state. The exact value of p depends on the number of rules defined in the coordination process, but even assuming that only 10% of the work can be executed in parallel and that there are 100 objects present in a process instance, the speedup is S(100) = (1 − 1/10) + 100 × 1/10 = 10.9, i.e., 1090%. For a more realistic value of p = 0.8, the expected speedup increases to 8020%.

Obviously, even for very inefficiently structured PHILharmonicFlows processes, there is a far greater potential speedup from scaling over multiple machines and processing cores than with an activity-centric approach. An important fact to note is that the scaling possibilities do not only increase with the hardware capabilities, but also with increased load on a single process instance, i.e., more clients and more objects created. Naturally, these are rough calculations based on simple process models, which will have to be examined in greater detail, both theoretically and empirically in different practical scenarios.

4 Outlook

Based on the idea presented in this paper, we plan on further examining the viability of data-centric approaches for creating a hyperscale process engine. Furthermore, we plan on taking the existing PHILharmonicFlows process engine implementation and calculating the optimal partitioning of its conceptual elements into individual micro-services. Ideally, we will be able to determine a partitioning schema ensuring the best values for all relevant performance indicators, such as network communication and load-balancing efficiency. Finally, once the theoretical groundwork is established, we hope to create a working prototype of a hyperscale process engine based on PHILharmonicFlows. The engine we envision should be capable of running in the cloud and servicing a large number of concurrent interactions with a single process model.
References

[Am67] Amdahl, Gene: Validity of the single processor approach to achieving large scale computing capabilities. In: AFIPS Spring Joint Computer Conference, pp. 483–485, 1967.

[BRD03] Bauer, Thomas; Reichert, Manfred; Dadam, Peter: Intra-subnet load balancing in distributed workflow management systems. International Journal of Cooperative Information Systems, 12(3):295–323, 2003.

[Ca03] Cao, Junwei; Jarvis, Stephen A.; Saini, Subhash; Nudd, Graham R.: GridFlow: Workflow management for grid computing. In: 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 198–205, 2003.

[CGS97] Ceri, Stefano; Grefen, Paul; Sanchez, Gabriel: WIDE - A distributed architecture for workflow management. In: 7th International Workshop on Research Issues in Data Engineering, IEEE, pp. 76–79, 1997.

[Gu88] Gustafson, John: Reevaluating Amdahl's law. Communications of the ACM, 31(5):532–533, 1988.

[KR11] Künzle, Vera; Reichert, Manfred: PHILharmonicFlows: towards a framework for object-aware process management. Journal of Software Maintenance and Evolution: Research and Practice, 23(4):205–244, 2011.

[KWR11] Künzle, Vera; Weber, Barbara; Reichert, Manfred: Object-aware business processes: Fundamental requirements and their support in existing approaches. International Journal of Information System Modeling and Design, 2(2):19–46, April 2011.

[Sc04] Schuler, Christoph; Weber, Roger; Schuldt, Heiko; Schek, Hans-Jörg: Scalable peer-to-peer process management - the OSIRIS approach. In: IEEE International Conference on Web Services, pp. 26–34, 2004.