A preliminary study on Business Process-aware Large Language Models

Mario Luca Bernardi (2,†), Angelo Casciani (1,*,†), Marta Cimitile (3,†) and Andrea Marrella (1,†)

(1) Department of Computer, Control and Management Engineering, Sapienza University of Rome, Via Ariosto 25, Rome, 00185, Italy
(2) Department of Engineering, University of Sannio, Piazza Roma 21, Benevento, 82100, Italy
(3) Department of Law and Digital Society, UnitelmaSapienza, Piazza Sassari, Rome, 00185, Italy


Abstract
AI-Augmented Business Process Management Systems (ABPMSs) are innovative information systems with increased flexibility, autonomy, and conversational capability. These systems can be boosted by Large Language Models (LLMs), renowned for their ability to handle natural language processing tasks. Nevertheless, no significant empirical validations exist about their usefulness in process-driven decision support. In this study, we propose a business process-oriented LLM framework for enacting actionable conversations with workers involved in a business process, leveraging Retrieval-Augmented Generation (RAG) to enrich process-specific knowledge. The methodology has been assessed to evaluate its capacity to produce precise responses to inquiries posed by users within a public administration context. The preliminary study shows the framework’s ability to identify specific activities and sequence flows within the targeted process model, thereby providing valuable insights into its potential for improving ABPMSs.

Keywords
Business Process, Decision Support Systems, Large Language Models, Retrieval-Augmented Generation


Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
† These authors contributed equally.
Email: bernardi@unisannio.it (M. L. Bernardi); angelo.casciani@uniroma1.it (A. Casciani); marta.cimitile@unitelasapienza.it (M. Cimitile); andrea.marrella@uniroma1.it (A. Marrella)
ORCID: 0000-0002-3223-7032 (M. L. Bernardi); 0009-0003-7843-8045 (A. Casciani); 0000-0003-2403-8313 (M. Cimitile); 0000-0002-1031-0374 (A. Marrella)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

AI-Augmented Business Process Management Systems (ABPMSs) embody new human-centered information systems distinguished by significant flexibility, autonomy, and extensive conversational and self-enhancement abilities [1]. Thus, Artificial Intelligence (AI) expands conventional process-aware Decision Support Systems (DSS) to facilitate prompt and effective decision-making by elucidating the underlying factors influencing the decisions [2]. Integrating ABPMSs into human workflows may introduce shifts in workforce dynamics, potentially leading to a lack of trust [3]. One possible remedy for this challenge is the incorporation of Conversational Systems (CSs). The emergence of CSs presents a promising avenue for enhancing Business Process Management (BPM) initiatives, significantly empowering ABPMSs [4, 5]. The adoption of Large Language Models (LLMs) could drive substantial advancements in these systems [5]. LLMs represent an emerging class of machine learning models showcasing great performance in accomplishing various Natural Language Processing (NLP) tasks [6]. Thanks to their considerable advantages, practitioners are progressively utilizing LLMs across various domains, gaining significant benefits for industries and business operations while reshaping the dynamics of human interaction with management systems [7]. Notably, LLMs have been transforming several organizations towards the paradigm of the autonomous enterprise, enabling ABPMSs to hold a central position in assisting human activities and decisions across the system life cycle. Indeed, starting from business processes, LLMs should transcend local reasoning contexts, support the management of diverse scenarios, and enhance the understanding of business activities [7]. Despite the recognized potential of LLMs to assist human decisions in the business landscape [1], this topic has been little explored in the literature [7] and, as far as we are aware, an empirical validation regarding the efficacy of LLMs for process-aware decision support is missing. In this research context, our work presents an innovative methodology for business process analysis that leverages LLMs to develop a conversational process-aware DSS. We propose to adopt a process-aware Retrieval-Augmented Generation (RAG) [8] framework to extend process- and domain-specific knowledge, with the aim of improving the conversational capability of an LLM to respond to business process-related inquiries. The overall system supports the user in a wide range of process comprehension and execution tasks using natural language. Our work evaluates the proficiency of the methodology in producing precise and contextually appropriate responses to process-related questions within different settings. In particular, we investigate the efficacy of the approach in a real-world scenario within the realm of public administration.

2. Related Work

As asserted in [4], the integration of CSs holds significant potential for enhancing ABPMSs. Numerous methodologies have emerged in recent years directed at leveraging the capabilities of CSs to enhance various critical areas within BPM [5].

In the sub-field of Descriptive Process Analytics, which describes current business processes and identifies problems and potential improvements, NLP and neural architectures have proved their effectiveness in extracting process models from natural language descriptions [9, 10, 11]. Conversely, expressing business process models in natural language aids human comprehension [12, 13]. Moreover, conversational interfaces further enhance understanding and accessibility of process mining findings [14, 15].

Predictive Process Analytics concerns building predictive models to forecast the future state and performance of business processes. Specifically, current trends in this area are centered around the development of conversational interfaces to assist the what-if analysis of digital process twins [16, 17] and predictive process monitoring [18, 19, 20].

Prescriptive Process Optimization primarily focuses on improving processes, often by translating insights into actionable steps aimed at enhancing process execution. CSs designed for this BPM area mainly support automated process optimization, suggesting adjustments to optimize process performance across various indicators [21, 22]. Additionally, these systems contribute to prescriptive process monitoring, providing real-time recommendations for actions to be taken, as illustrated in [23].

Augmented Process Execution embodies the concept wherein system-driven management actively oversees business process execution, with human operators providing support as needed. In this sub-field, various conversational agents have been developed to facilitate seamless interaction between systems and human users [24, 25, 26]. Furthermore, Robotic Process Automation (RPA), which involves creating software robots to automate repetitive tasks on application user interfaces, will likely benefit from the combination with CSs. Such integration enables the automation of business processes [27, 28, 29] and aids in identifying suitable routines for automation through natural language interaction [30, 31].

3. The Business Process LLM

In this study, we present a business process-oriented LLM framework, detailed further in [32]. The steps used for answering queries pertaining to business processes are summarized in Figure 1. The overall architecture comprises two major phases: Knowledge Augmentation and Querying.

Knowledge Augmentation. The process-aware LLM pipeline starts by taking a business process model as input and splitting it into multiple chunks. This operation is undertaken to facilitate the LLM’s understanding in generating responses. In this study, we utilized a Directly-Follows Graph (DFG) representation expressed in natural language.

Chunking aims to partition broad textual content into more manageable segments, enabling the LLM to ingest only relevant context and overcoming limitations imposed by its context window. To ensure meaningful chunks and mitigate unnatural segmentation of the process model, two distinct chunking strategies were evaluated: fixed-size and recursive chunking.

Subsequently, the framework transforms the raw input chunks into model embeddings for storage in a vector index. These embeddings are dense, low-dimensional vectors designed to encapsulate the semantic information and contextual relationships necessary for the subsequent retrieval and generation operations.

Afterward, the business process model embeddings are stored within a specialized vector database to enable efficient retrieval. This retrieval procedure is enacted through semantic search that, in our case, relies on cosine similarity.

Querying. The Querying stage begins with the retrieval of the pertinent process model chunks needed for crafting precise responses to the process-related questions. In particular, this retrieval step involves fetching relevant process chunks from the vector store through semantic search utilizing cosine similarity. Following this, these segments, along with the user question, are fed into an LLM to generate an answer.

To offer contextually grounded answers based on the user query and the retrieved information, the proposed framework relies on two primary components: an LLM and its associated tokenizer. Initially, a prompt is formulated by merging the user query with the previously retrieved process context. Subsequently, the tokenizer converts the prompt into a format comprehensible by the model. Finally, the prompt is fed to the LLM to generate contextually relevant answers. In particular, our process-aware approach integrates the Llama 2 13B [33] model as the LLM.
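The two chunking strategies named in the Knowledge Augmentation phase can be illustrated with a minimal sketch. It is not the implementation used in the study: chunk sizes are counted in whitespace-separated words rather than model tokens, the recursive variant does not merge small neighbouring pieces (and drops the separators it splits on), and the function names, separator list, and the DFG sentences with their activity names are all assumptions made for illustration.

```python
from typing import List, Sequence


def fixed_size_chunks(tokens: Sequence[str], size: int, overlap: int) -> List[List[str]]:
    """Slide a window of `size` tokens, advancing by `size - overlap` each step."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):  # the window already reached the end
            break
    return chunks


def recursive_chunks(text: str, size: int,
                     separators: Sequence[str] = ("\n\n", "\n", ". ")) -> List[str]:
    """Split on the coarsest separator first and recurse with finer
    separators only for pieces that still exceed `size` words."""
    words = text.split()
    if len(words) <= size:
        return [text.strip()] if words else []
    if not separators:
        # No separator left: fall back to a hard split every `size` words.
        return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    chunks: List[str] = []
    for piece in text.split(separators[0]):
        chunks.extend(recursive_chunks(piece, size, separators[1:]))
    return chunks


# Hypothetical DFG description; the activity names are illustrative only.
dfg_text = (
    "Submit report is followed by Check report.\n"
    "Check report is followed by Approve reimbursement.\n"
    "Check report is followed by Reject report."
)
print(fixed_size_chunks(dfg_text.split(), size=8, overlap=2))
print(recursive_chunks(dfg_text, size=8))
```

In this sketch the fixed-size strategy may cut a directly-follows statement in the middle, whereas the recursive strategy keeps each line intact whenever it fits within the size limit, which is the kind of unnatural segmentation the second strategy is meant to mitigate.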
Figure 1: The business process-oriented LLM framework.
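The retrieval and prompt-construction steps of the Querying phase can be sketched as follows. This is a toy illustration under stated assumptions, not the framework's code: `embed` is a sparse term-frequency stand-in for the dense embedding model, a plain Python list replaces the vector database, the prompt wording is invented, and the process sentences and activity names are hypothetical. Tokenization and the call to the LLM are omitted.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def embed(text: str) -> Vector:
    """Toy stand-in for an embedding model: a sparse term-frequency vector."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return {term: float(n) for term, n in counts.items()}


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question: str, index: List[Tuple[str, Vector]], k: int) -> List[str]:
    """Return the k chunks whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Merge the retrieved process context with the user question."""
    lines = "\n".join(f"- {c}" for c in context)
    return (f"Process context:\n{lines}\n\n"
            f"Question: {question}\n"
            'Answer "yes" or "no" and cite the context.')


# Hypothetical chunks of a DFG expressed in natural language.
chunks = [
    "Submit expense report is followed by Check expense report.",
    "Check expense report is followed by Approve reimbursement.",
    "Approve reimbursement is followed by Notify employee.",
]
index = [(c, embed(c)) for c in chunks]  # in-memory stand-in for the vector store
question = "Is Approve reimbursement followed by Notify employee?"
top = retrieve(question, index, k=2)
print(build_prompt(question, top))
```

The parameter `k` plays the role of the number of retrieved chunks; a real deployment would pass the resulting prompt through the model's tokenizer before generation.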


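The evaluation in the next section scores binary ("yes"/"no") answers against expected ones using accuracy over true/false positives and negatives. A minimal sketch of such an oracle is given here; the function names and the five expected/predicted pairs are hypothetical and are not data from the study.

```python
from typing import Dict, List, Tuple


def confusion_counts(pairs: List[Tuple[str, str]]) -> Dict[str, int]:
    """Count TP, TN, FP, FN from (expected, predicted) yes/no pairs."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for expected, predicted in pairs:
        if expected == "yes":
            counts["TP" if predicted == "yes" else "FN"] += 1
        else:
            counts["FP" if predicted == "yes" else "TN"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


# Hypothetical (expected, predicted) answers for five binary questions.
pairs = [("yes", "yes"), ("yes", "no"), ("no", "no"), ("no", "yes"), ("no", "no")]
counts = confusion_counts(pairs)
print(counts, accuracy(counts))
```

With these made-up pairs the counts are one TP, two TN, one FP and one FN, so three of the five answers are correct.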
4. Evaluation

We performed a preliminary validation of the proposed framework by applying it to a real public administration procedure. The process model, illustrated in Figure 2, involves the reimbursement of expenses for missions, a critical procedure within a university. This administrative process entails the processing of expense reports submitted by employees and the subsequent decision to either reimburse or reject these reports. In particular, the process was analyzed using textual DFG descriptions of activities and sequence flows.

The proposed framework, being rooted in generative models, provides feedback to users in natural language. To assess its effectiveness in aiding users’ comprehension of business processes, the validation assesses the accuracy of the answers concerning the entities and relationships present in both the process model and the response of the LLM. The conclusion derived from this research effort centers on evaluating the approach’s overall effectiveness in assisting business process users and discussing its potential applications in real-world scenarios.

Figure 2: The DFG model of the reimbursement process in a university.

4.1. Evaluation Setting

All the evaluations are performed using the reimbursement process model previously introduced, represented using the DFG expressed in natural language.

Answering the queries adopted in this evaluation requires recognizing both structural and behavioral information within the model. Structural information refers to the presence of activities, events, and gateways in the process model, whereas behavioral information concerns the sequence flows linking these entities.

Specifically, to analyze the correctness of structural information, we queried the presence of specific activities within the business process model, prompting the pipeline to answer with a simple "yes" or "no" and to provide relevant contextual references if available.

When assessing behavioral features, queries checked the presence of sequence flows between specified activities in the process representation. The LLM was prompted to state their existence in a binary manner, reporting contextual references.

Striving to obtain a thorough evaluation, we analyzed all single-pass transitions, an equivalent number of sequence flows between activities present in the model but not directly connected, and the same number of flows linking tasks that do not belong to the process.

First, we assessed the performance of the RAG-based framework in comparison to the basic version of the language model for responding to the queries within the context of the reimbursement process model.

Specifically, we estimated the capability of the Llama 2 13B model and the RAG-based pipeline in addressing queries related to business processes, employing accuracy as the measure.

For this reason, we designed an evaluation approach for assessing the performance of the framework relying on binary response questions (expecting either a "yes" or a "no" as allowed answers) to allow a rigorous assessment of the provided answers. The accuracy quantifies the proportion of correct predictions generated by the LLM in answering the user’s questions out of the total responses provided. We classify predictions given by the framework as true positives (TP) when they correspond to positive expected outcomes and as true negatives (TN) when they match negative expected outcomes. Vice versa, false positives (FP) arise when the approach produces positive answers opposite to negative expectations, whereas false negatives (FN) derive from negative answers generated by the framework despite positive expected ones.

\[ \mathit{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1} \]

Subsequently, we estimated the effects of employing various chunking techniques within the process-aware LLM pipeline, alongside investigating how prompt engineering can further augment the framework’s performance. Fixed-size and recursive chunking with different sizes are tested.

In both cases, the accuracy (reported in Formula 1) of the framework in answering the queries is evaluated.

We carried out this evaluation employing an oracle that considers both the query and the corresponding binary response as input. The oracle compares the answers of the pipeline with the expected ones and computes the accuracy as the ratio of correct predictions to the total number of tests conducted in that particular assessment.

In our experimentation, we found that by retrieving the top 20 chunks, we were always able to capture a comprehensive overview of the process model, enabling the language model to generate grounded responses.

The experiments were conducted on a workstation running the Linux/Ubuntu 22.04.3 LTS operating system and equipped with an NVIDIA A100 GPU.

4.2. Evaluation results

We proceed to analyze the results obtained during the evaluation phase under various experimental conditions.

The results in terms of accuracy for the basic LLM and the RAG-based pipeline on the reimbursement DFG model described in natural language are presented in Table 1.

Table 1
Accuracy obtained for basic LLM and RAG framework.

  Methodology           Representation         Accuracy
  Basic LLM             None                   40.18%
  RAG-based framework   Natural Language DFG   72.37%

The table demonstrates a notable improvement in accuracy upon utilizing the RAG-based LLM, which is consistent with our expectations for the test. This enhancement exhibits an acceptable performance level (72.37 percent) for the framework, relying on the natural language representation to drive more informed and accurate decision-making.

Our observations revealed instances of hallucination, wherein the pure LLM would provide responses despite lacking pertinent information about the process model, occasionally asserting familiarity with certain activities even when such knowledge was absent.

Table 2 illustrates the accuracy computed using various chunking methods, including no chunking, fixed-size chunking, and recursive chunking.

Comparable outcomes are achieved through the usage of a fixed-size strategy and a recursive technique
for chunking leveraging the natural language representation. In both cases, the ideal size for the chunks is identified as 128 tokens with a 10-token overlap. We can attribute this observation to the relatively modest scale of the process model, which causes its content to be nearly encapsulated within a single chunk. Additionally, the above consideration clarifies why the absence of chunking yields analogous results.

Table 2
Accuracy obtained using different chunking strategies.

  Chunking       Accuracy
  No Chunking    79.52%
  Fixed          81.58%
  Recursive      82.89%

5. Conclusion

In conclusion, this work introduced a business process-aware LLM, an innovative framework designed to facilitate actionable conversations and support process-aware DSSs, thereby laying the ground for intelligent interaction with ABPMSs. The proposed methodology, tailored for aiding business process analysis, aims to enhance the conversational skills of LLMs in the business process context. This objective is realized through the development of a RAG-based architecture, which extends its knowledge of the structural and behavioral aspects of process models by ingesting contextual information concerning specific inquiries. Consequently, the process-aware framework is equipped to assist users in understanding and executing business processes through a natural language interface. Additionally, we assessed the performance of the process-aware LLM in providing precise and pertinent answers to the queries posed by the users across diverse evaluation scenarios.

In future research within the domain of process discovery [34], we intend to delve into the analysis of the business process execution information and explore the impact of different embedding models on the developed technique. Furthermore, investigating the integration of the framework with symbolic AI solvers to embed reasoning capabilities could present another intriguing avenue for future work.

Acknowledgments

The work of Angelo Casciani has been carried out in the range of the Italian National Doctorate on AI run by Sapienza.

References

[1] M. Dumas, F. Fournier, L. Limonad, A. Marrella, M. Montali, et al., AI-augmented Business Process Management Systems: A Research Manifesto, ACM Trans. Manage. Inf. Syst. 14 (2023). doi:10.1145/3576047.
[2] P. Agarwal, B. Gao, S. Huo, P. Reddy, et al., A Process-Aware Decision Support System for Business Processes, in: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’22, 2022, pp. 2673–2681. doi:10.1145/3534678.3539088.
[3] J. D. Lee, K. A. See, Trust in automation: Designing for appropriate reliance, Human Factors 46 (2004).
[4] D. Chapela-Campa, M. Dumas, From process mining to augmented process execution, Software and Systems Modeling (2023) 1–10.
[5] A. Casciani, M. L. Bernardi, M. Cimitile, A. Marrella, Conversational Systems for AI-Augmented Business Process Management, in: Proceedings of the 18th Research Challenges in Information Science (RCIS 2024), 2024, pp. 1–16.
[6] I. Ozkaya, Application of Large Language Models to Software Engineering Tasks: Opportunities, Risks, and Implications, IEEE Software 40 (2023) 4–8. doi:10.1109/MS.2023.3248401.
[7] D. Fahland, F. Fournier, L. Limonad, I. Skarbovsky, et al., How well can large language models explain business processes?, 2024. arXiv:2401.12846.
[8] P. Lewis, E. Perez, A. Piktus, F. Petroni, et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, in: Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS’20, 2020, pp. 1–16.
[9] K. Sintoris, K. Vergidis, Extracting business process models using natural language processing (NLP) techniques, in: Proceedings - 2017 IEEE 19th Conference on Business Informatics, CBI 2017, volume 1, Institute of Electrical and Electronics Engineers Inc., 2017, pp. 135–139.
[10] H. van der Aa, K. J. Balder, F. M. Maggi, A. Nolte, Say it in your own words: Defining declarative process models using speech recognition, BPM Forum (2020).
[11] C. Qian, L. Wen, A. Kumar, BEPT: A behavior-based process translator for interpreting and understanding process models, Int. Conf. on Information and Knowledge Management, Proceedings (2019).
[12] L. Ackermann, S. Schönig, et al., Natural language generation for declarative process models, CAiSE Workshops LNBIP 231 (2015) 3–19.
[13] Y. Fontenla-Seco, M. Lama, A. Bugarín, Process-To-Text: A Framework for the Quantitative Description of Processes in Natural Language, Trustworthy AI - Integrating Learning, Optimization and Reasoning Workshop (2020).
[14] L. Barbieri, E. Madeira, K. Stroeh, W. van der Aalst, A natural language querying interface for process mining, Journal of Intelligent Information Systems 61 (2023) 113–142.
[15] H. Yeo, E. Khorasani, V. Sheinin, I. Manotas, N. P. A. Vo, O. Popescu, P. Zerfos, Natural Language Interface for Process Mining Queries in Healthcare, in: Proceedings - 2022 IEEE Int. Conf. on Big Data, Big Data 2022, Institute of Electrical and Electronics Engineers Inc., 2022, pp. 4443–4452.
[16] D. Barón-Espitia, M. Dumas, O. González-Rojas, Coral: Conversational What-If Process Analysis, in: ICPM Demo, 2022, pp. 118–122.
[17] M. Li, R. Wang, X. Zhou, Z. Zhu, Y. Wen, R. Tan, ChatTwin: Toward Automated Digital Twin Generation for Data Center via Large Language Models, in: Proceedings of the 10th ACM Int. Conf. on Systems for Energy-Efficient Buildings, Cities, and Transportation, BuildSys ’23, Association for Computing Machinery, 2023, pp. 208–211.
[18] K. Brennig, K. Benkert, B. Löhr, O. Müller, Text-Aware Predictive Process Monitoring of Knowledge-Intensive Processes: Does Control Flow Matter?, in: Int. Conf. on Business Process Management, Springer, 2023, pp. 440–452.
[19] L. Cabrera, S. Weinzierl, S. Zilker, M. Matzner, Text-Aware Predictive Process Monitoring with Contextualized Word Embeddings, in: BPM Workshops, volume 460 LNBIP, 2022, pp. 303–314.
[20] C. Warmuth, H. Leopold, On the Potential of Textual Data for Explainable Predictive Process Monitoring, ICPM Workshops 468 LNBIP (2023) 190–202.
[21] S. Badini, S. Regondi, E. Frontoni, R. Pugliese, Assessing the capabilities of ChatGPT to improve additive manufacturing troubleshooting, Advanced Industrial and Engineering Polymer Research (2023).
[22] A. Mustansir, K. Shahzad, M. K. Malik, Towards automatic business process redesign: an NLP based approach to extract redesign suggestions, Automated Software Engineering 29 (2022).
[23] S. Zeltyn, S. Shlomov, et al., Prescriptive Process Monitoring in Intelligent Process Automation with Chatbot Orchestration, in: PMAI@IJCAI, 2022, pp. 49–60.
[24] T. Chakraborti, S. Agarwal, Y. Khazaeni, et al., D3BA: A Tool for Optimizing Business Processes Using Non-deterministic Planning, in: BPM Workshops, 2020, pp. 181–193.
[25] L. F. Lins, G. Melo, T. Oliveira, et al., PACAs: Process-Aware Conversational Agents, in: BPM Workshops, 2021, pp. 312–318.
[26] J. Bandlamudi, K. Mukherjee, P. Agarwal, et al., Towards Hybrid Automation by Bootstrapping Conversational Interfaces for IT Operation Tasks, in: AAAI, 2023, pp. 15654–15660.
[27] P. D. Hung, D. T. Trang, T. Khai, Integrating Chatbot and RPA into Enterprise Applications Based on Open, Flexible and Extensible Platforms, in: Int. Conf. on Cooperative Design, Visualization and Engineering, 2021, pp. 183–194.
[28] G. Dan, D. Claudiu, F. Alexandra, et al., Multi-Channel Chatbot and Robotic Process Automation, in: IEEE Int. Conf. on Automation, Quality and Testing, Robotics, 2022, pp. 1–6.
[29] Y. Rizk, V. Isahagian, S. Boag, et al., A Conversational Digital Assistant for Intelligent Process Automation, BPM Forum (2020).
[30] H. van der Aa, H. Leopold, Automatically identifying process automation candidates using natural language processing, Blockchain and Robotic Process Automation (2022) 77–86.
[31] Z. Zeng, W. Watson, N. Cho, et al., FlowMind: Automatic Workflow Generation with LLMs, in: ACM Int. Conf. on AI in Finance, 2023, pp. 73–81.
[32] A. Casciani, M. L. Bernardi, M. Cimitile, A. Marrella, Conversational systems for AI-augmented business process management, PREPRINT (Version 1) available at Research Square (2024). doi:10.21203/rs.3.rs-4125790/v1.
[33] H. Touvron, T. Lavril, G. Izacard, X. Martinet, et al., LLaMA: Open and Efficient Foundation Language Models, 2023. arXiv:2302.13971.
[34] M. L. Bernardi, M. Cimitile, C. Di Francescomarino, F. M. Maggi, Using discriminative rule mining to discover declarative process models with non-atomic activities, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8620 LNCS (2014) 281–295. doi:10.1007/978-3-319-09870-8_21.