Design it like Darwin 2.0 An Evolutionary Algorithm for Automatic Process Redesign Jonas Manderscheid1, Maximilian Röglinger2, Tobias Ruby1, Johannes Seyfried2 1 FIM Research Center, University of Augsburg, Germany {jonas.manderscheid, tobias.ruby}@fim-rc.de 2 Project Group Business & Information Systems Engineering of the Fraunhofer FIT, Germany {maximilian.roeglinger, johannes.seyfried}@fim-rc.de Abstract. Process redesign is an important and valuable phase of the business process management (BPM) lifecycle. However, human creativity and objective- ness regarding continuous redesign initiatives are limited and biased. To over- come these limitations, we propose computational support based on evolutionary algorithms. Our software tool extends a formerly published proof of concept. Novelties have been introduced by including a new data structure, new mutation and crossover operators as well as an extended evaluation of unambiguous pro- cess designs explicitly considering time objectives. Finally, the tool provides new computation process (re-) design support to the BPM community. Keywords: business process management, process redesign, computational in- telligence, evolutionary algorithm 1 Background and Significance to the BPM field Business process management (BPM) is an acknowledged way to facilitate value-cre- ating process improvement activities [1]. Nevertheless, process (re-) design mainly re- lies on human intuition limiting the solution space [2]. The practical toolkit for process (re-) design still lacks computational support [3], even though algorithms can outper- form humans in creating and benchmarking processes due to their speed and neutrality [4]. Consequently, computational intelligence (CI) in terms of machine learning abili- ties has been introduced to business process (re-) design in the mid-nineteens. After initial process automation approaches [5], CI’s rising popularity meanwhile facilitated computational tool support for process (re-) design [6]. The tool portfolio ranges from semi-automatic approaches to the successfully application of evolutionary algorithms (EA), as part of CI, for automatic process (re-) design [7,8]. EAs describe heuristics tightly related to Darwin’s approach of natural selection, known as survival of the fittest [9], as the idea behind EA is similar to the BPM lifecy- cle: developing improved generations of business processes. In doing so, EA start with a population of known and/or randomly created genomes (i.e., process designs). EA operators then mimic the BPM lifecycle phases and foster the evolution of this popula- tion by creating new generations of genomes based on the most valuable existing ones. The mutation operator randomly replaces sub-processes with new gateways and/or tasks, whereas the crossover operator combines proven sub-processes of existing ge- nomes (see Sect. 2). The valuation of genomes is done by a fitness function depending on predefined criteria. The evolution cycle repeats until it results in a (potentially local) optimum or it is terminated manually. Afflerbach et al. [10] provide the first EA application that allows for exclusive splits and covers elementary process design elements (i.e., tasks, input/output objects, and sequence flows). In addition, it aggregated multiple dimensions of process performance to the main economic factors of cash flows and the risk considering time, quality, and flexibility [11]. They confirm the usefulness of their EA design in a proof of concept. However, the proof of concept still lacks scalability, platform- and vendor independ- ence, well-defined interfaces, visualization and analysis functionalities as well as a di- rect integration of performance effects. We therefore present a stand-alone process (re- ) design tool Design it like Darwin 2.0 that addresses many of these shortcomings. The tool enables researchers and practitioners to create new process designs based on any set of tasks (e.g., gathered by process mining). Particularly, the import/export feature provides well-defined interfaces as further step towards a highly automated BPM lifecycle. 2 Design it like Darwin 2.0 – The Tool Regarding the tool presentation, we start with an architectural overview followed by a description of the significantly enhanced backend features. Finally, we present the new analytics and visualization functionalities. The tool is a dynamic single page C# web application that also implements an appli- cation programming interface (API). Up to 19 settings allow for the customization of the EA itself (e.g., the maximum number of generations or population size) as well as process improvement goals (e.g., weights for the performance dimensions and the de- cision-makers’ risk attitude). Input data regarding tasks (i.e., name, description, ex- pected cash flow and variance, and execution time), input/output objects, input/output dependencies, process attributes, and attribute coverage can be inserted via user inter- face (see Fig. 1) or by a JSON file upload. Fig. 1. Tool user interface (Data tab) For the backend calculation, we implemented the following key functionalities, which enhance the existing proof of concept of Afflerbach et al. [10]: Greedy Algorithm. We apply a greedy algorithm to generate new genomes while randomly combining tasks according to their input/output objects, dependencies, and gateways. This choice significantly influences the performance. The resulting genomes are feasible (i.e., process attributes constraints and data input/output dependencies are considered), but rarely optimal in the first iteration. Besides, we can force the greedy algorithm to create only feasible genomes to facilitate an automated initial process design. Tree structure. We present processes as directed trees (see Fig. 2). Tree nodes correspond to parallel, sequential, and exclusive gateways, where leaves correspond to tasks. Thereby, we explicitly consider the semantics of BPMN gateways and their semantics in EA operations (mutation and crossover) and, thereby, extend the overall solution space compared to the existing proof of concept. A1 A2 AND SEQ SEQ A3 A4 A1 A2 A3 XOR A4 A5 A5 Fig. 2. Sample BPMN process transformation into a tree Genome Operators. With the introduction of the directed tree, we also further developed the implemented EA operators according to Poli et al. [12]. The crossover operator randomly switches two nodes of two genomes, while the mutation operator generates new subtrees using the greedy algorithm to replace existing ones. We also allow for limiting the tree depth to avoid unnecessarily complex process designs with multiply duplicated gateways and/or tasks not contributing to better solutions. Varying Crossover-/Mutation-Rates. As varying crossover and/or mutation rates is critical to the success of EAs, we start with a high mutation probability and a low crossover probability that will decrease or increase respectively by 1/𝑛 per generation [13], where n represents the number of generations. Thereby, we ensure a broad exploration of the solution space at the beginning and a narrow solution space exploitation at the end, fostering the convergence of the EA. Multi-criteria Evaluation. We explicitly model the information related to the process performance dimension time (i.e., fixed completion time per task) to foster the use of exclusive and parallel gateways. We finally calculate the genomes’ fitness as an individually weighted average of the task-related cash flows and time estimations based on user preferences [10]. This calculation covers the process performance dimensions time and cost. Pain factor. We widened the solution space by infeasible (i.e., violated input-output- constraints) and overly complex (e.g., redundant gateways or parallel gateways with only one path) process designs. As the best solution is close to the boundary between the solution space of feasible and infeasible designs, we tackle the best solution from both sides. To cope with such designs, we introduced a pain factor as a negative adjustment in their fitness, punishing infeasible and overly complex solutions. Besides, we ensure at least one feasible solution per generation through the greedy algorithm (see above) or the elitist selection [12]. For the presentation of the results, we introduced new analytics and visualization features. The analytics tab displays the development of several performance indicators in real-time (e.g., the fitness development, the percentage of feasible process designs, or the evaluation runtime, see Fig. 3). Besides, the analytics tab always presents the best process design so far as BPMN graph and provides a BPMN 2.0 XML file download for further processing. Fig. 3. Tool user interface (Analytics tab) 3 Maturity and Future Work Design it like Darwin 2.0 is a ready-to-use fully automated process (re-) designer applicable to academic and industrial contexts. We invite the BPM community to challenge their own process design ideas. From an evaluation perspective, we compared the enhanced algorithm to the proof of concept in [10] based on the same travel agent process from a modified real-life scenario. The greedy algorithm generates populations with thousands of genomes as recommended for EAs based on tree structures within seconds. As the best solution is close to the boundary, the example process already reveals the value of the extended solution space (i.e., feasible and infeasible designs). Even though the overall average of feasible genomes is about 10 to 15 percent per generation owing to infeasible process designs, the tool identifies the optimal solution more quickly than the initial proof of concept. Therefore, our proposed enhancements also have significant effects on perfor- mance. To further develop our tool, we plan to extend the fraction of supported BPMN ele- ments (i.e., all BPMN gateways and events) and, therefore, the complexity of supported processes as a next step. Besides, we intend to support process with loops of recurring tasks. To do so, we must upgrade the underlying data structure to a directed graph and implement corresponding EA operators according to Walker et al. [14]. We expect par- ticular challenges when extending our tool to support also declarative process models. The tool, installation instructions, and example data is available at https://github.com/rubytobi/Design-it-like-Darwin-2.0. Additionally, we provide a screencast showing the basic workflow with the application. The tool and its source code is available under MIT License. References 1. Dabaghkashani, A.: A Success Model for Business Process Management Implementation. International Journal of Information and Electronics Engineering 2(2), 725-729 (2012). 2. Sharp, A., McDermott, P.: Workflow modeling. Tools for process improvement and appli- cations development / Alec Sharp, Patrick McDermott, 2nd edn. Artech House, Boston, Lon- don (2009). 3. Zellner, G.: A structured evaluation of business process improvement approaches. Business Process Management Journal 17(2), 203-237 (2011). 4. van der Aalst, W.M.P.: Business Process Management. A Comprehensive Survey. Interna- tional Scholarly Research Notices Software Engineering 2013, 1-37 (2013). 5. Hammer, M., Champy, J.: Reengineering the corporation: A manifesto for business revolu- tion. Business Horizons 36(5), 90-91 (1994). 6. Bernstein, A., Klein, M., Malone, T.W.: The process recombinator. A tool for generating new business process ideas. In: Proceedings of Information Systems, 178–192 (1999). 7. Vergidis, K., Saxena, D., Tiwari, A.: An evolutionary multi-objective framework for busi- ness process optimisation. Applied Soft Computing 12(8), 2638-2653 (2012). 8. Low, W.Z., Weerdt, J. de, Wynn, M.T., ter Hofstede, A.H.M., van der Aalst, W.M.P., vanden Broucke, S.: Perturbing event logs to identify cost reduction opportunities. A genetic algorithm-based approach. In: Proceedings of Evolutionary Computation, 2428–2435 (2014). 9. Fogel, D.B.: What is evolutionary computation? IEEE Spectrum (2000). 10. Afflerbach, P., Hohendorf, M., Manderscheid, J.: Design it like Darwin. A value-based ap- plication of evolutionary algorithms for proper and unambiguous business process redesign. Journal of Information Systems Frontiers (2016). 11. Limam Mansar, S., Reijers, H.A.: Best practices in business process redesign. Use and im- pact. Business Process Management Journal 13(2), 193-213 (2007). 12. Poli, R., Langdon, W.B., McPhee, N.F., Koza, J.R.: A field guide to genetic programming. Lulu Press (2008). 13. Lin, W.-Y., Lee, W.-Y., Hong, T.-P.: Adapting crossover and mutation rates in genetic al- gorithms. Journal of Information Science and Engineering 19(5), 889–903 (2003). 14. Walker, J.A., Miller, J.F.: The Automatic Acquisition, Evolution and Reuse of Modules in Cartesian Genetic Programming. Transactions of Evolutionary Computation 12(4), 397-417 (2008).