Explainable Interpretation of Low-Level Process Traces via Abstract Argumentation⋆ (Discussion/Short Paper) Bettina Fazzinga1 , Sergio Flesca2 , Filippo Furfaro2 and Luigi Pontieri3,∗ 1 DIMES - University of Calabria, Via P. Bucci, 87036 Rende, Italy 2 DICES - University of Calabria, Via P. Bucci, 87036 Rende, Italy 3 ICAR-CNR, Via P. Bucci, 87036 Rende, Italy Abstract Monitoring and analyzing process traces (i.e. event sequences) is an important task in modern companies and organizations. Focusing on scenarios where there is an abstraction gap between trace events and reference business activities, we here consider the interpretation problem of translating the current event generated by an ongoing trace into the step of the activity instance it corresponds to. We briefly discuss how this problem can be encoded into an Abstract Argumentation Framework (AAF), which allows for computing and exploring event interpretations by solving instances of the AAF acceptance problem. Keywords Business Process Intelligence, Log Abstraction, Abstract Argumentation 1. Introduction Process Mining methods help better understand and enact business processes by supporting various analysis tasks over process traces, i.e. sequences of execution events, e.g. process-model discovery, log-to-model conformance analysis, runtime detection/prediction/recommendation. However, these methods usually assume each event to map to a well-understood “high-level” activity, which is not always the case in practice. As an instance, when the activities are performed in a lowly-structured way, trace events just represent low-level actions [2] with no clear reference/mapping to the activities. An example of this kind is presented below, for a toy hospital scenario where careflow process traces are gathered, which consist of events describing exams and checks performed by medical staff. Example 1. Suppose that a trace Φ = 𝑒1 , 𝑒2 , 𝑒3 , 𝑒4 is given, where 𝑒1 , 𝑒2 , 𝑒3 and 𝑒4 refer to events (precisely, event types) Blood sample taken, Blood pressure measurement, Temperature mea- surement, and Cannula insertion, respectively. Assume that it is known that the first three events can be performed during any of the high-level activities 𝐴1 =pre-hospitalization, 𝐴2 =pre-surgery, PMAI@IJCAI22: International IJCAI Workshop on Process Management in the AI era, July 23, 2022, Vienna, Austria ⋆ Extended Abstract: the original work [1] was published on journal Information Systems in January 2022. ∗ Corresponding author. Envelope-Open bettina.fazzinga@unical.it (B. Fazzinga); flesca@dimes.unical.it (S. Flesca); furfaro@dimes.unical.it (F. Furfaro); luigi.pontieri@icar.cnr.it (L. Pontieri) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 𝐴3 =post-surgery, while 𝑒4 can be performed only during activity 𝐴2 . To reconstruct/monitor patient histories, one needs to interpret such a trace in terms of activity executions. However, due to the many-to-many event-activity mapping, even if trace Φ described a completed process instance, multiple interpretations could exist for Φ, like: 𝐼1 ≡ “𝑒1 , 𝑒2 , 𝑒3 is the sequence of steps produced by an instance of 𝐴1 and 𝑒4 is the unique step of an instance of 𝐴2 ”; and 𝐼2 ≡ “𝑒1 , 𝑒2 and 𝑒3 , 𝑒4 are the step sequences produced by instances of 𝐴1 and of 𝐴2 , respectively”. □ Such an abstraction gap limits the usefulness of process discovery methods, and impedes using standard conformance-checking ones. Thus, supervised log abstraction methods were proposed recently [2], which try to interpret log traces automatically as executions of given high-level process activities, leveraging domain knowledge. However, these methods suffer from two limitations: (𝑖) they map each (low-level) event 𝑒 to just one activity, discarding alternative interpretations of 𝑒; (𝑖𝑖) they do not provide explanations for their results. In [1], we defined a framework that supports interactive explorative analyses of any low- level trace Φ, which allow the user to obtain answers and explanations for Interpretation Queries over Φ. Technically, the underlying event-interpretation problem is encoded into an Abstract Argumentation Framework (AAF ) (see Section 2), so that both interpretation queries and associated explanations are computed by solving instances of the AAF acceptance problem. This extended abstract summarizes some major features of the framework introduced in [1]. 2. Background and problem statement AAF basics An AAF can be viewed as a graph where nodes and edges model arguments and attacks from an argument to another, respectively. Given argument set 𝑆 and argument 𝛼, we say that “𝑆 attacks 𝛼” (resp., 𝛼 attacks 𝑆) if there is an argument 𝛽 in 𝑆 such that 𝛽 attacks 𝛼 (resp., 𝛼 attacks 𝛽). Given arguments 𝛼, 𝛽 such that 𝛼 attacks 𝛽, an argument 𝛾 is said to “defend 𝛽” if 𝛾 attacks 𝛼. Argument set 𝑆 is said to “defend 𝛽” if 𝑆 contains some argument defending 𝛽. Argument 𝛼 is acceptable w.r.t. 𝑆 if every argument attacking 𝛼 is attacked by 𝑆, while 𝑆 is conflict-free if there is no attack between its arguments. Different semantics have been proposed to identify “reasonable” sets of arguments, called extensions [3], in an AAF. In particular, a set 𝑆 ⊆ 𝐴 is said to be an admissible extension iff 𝑆 is conflict-free and all its arguments are acceptable w.r.t. 𝑆. An admissible extension that is maximal (w.r.t. ⊆) is a preferred extension. Acceptance of an argument 𝛼 in an AAF 𝐹 can be stated under both skeptical and credulous perspectives. In particular, adopting the preferred semantics (as in [1]), 𝛼 is credulously accepted (resp., skeptically accepted), if it belongs to at least one (resp., every) preferred extension of 𝐹. The interpretation problem Given an ongoing trace Φ, as soon as a new event 𝑒curr appears in Φ, we want to interactively support the analyst in obtaining (𝑎) answers to Interpretation Queries on the alternative valid interpretations of 𝑒curr , and (𝑏) explanations for certain interpre- tations considered not valid, where interpretation validity descends from well-structuredness properties and given domain knowledge (described later on). Roughly speaking, basic interpre- tation queries allow for asking whether 𝑒curr can be viewed as a step of an instance of a given activity 𝐴, possibly stating conditions on the activity lifecycle (e.g.,“is 𝑒curr the initial step of an instance of 𝐴?” or on the activity occurrence (e.g., “is 𝑒curr a step of the 2nd instance of 𝐴? ”). Any such a query can be evaluated under skeptical semantics, in order to know whether the proposed interpretation is the unique valid one for 𝑒curr . Moreover, enumeration queries ask for all valid interpretations of 𝑒curr satisfying the conditions required. Two kinds of domain knowledge are considered in the above-described problem: (𝑖) the list of high-level activities in terms of which events must be interpreted, along with the candidate type-level mappings between these activities and log events; (𝑖𝑖) a declarative process model consisting of activity constraints, including temporal rules of the forms 𝑀𝐶𝑗 ∶ 𝐴 ⇒𝑇 𝐵1 | … |𝐵𝑘 (must-constraint) and 𝑁 𝐶𝑖 ∶ 𝐴 ⇒𝑇 ¬𝐵 (not-constraint), for any given activities 𝐴, 𝐵, 𝐵1 , … , 𝐵𝑘 and time period 𝑇 ∈ [1..∞], and their respective time-symmetric versions (i.e., positive and negative “precedence” constraints). 𝑀𝐶𝑗 (resp., 𝑁 𝐶𝑖 ) means that an instance 𝐴 must (resp., cannot) be followed by an instance of one of the activities 𝐵1 , … , 𝐵𝑘 (resp., of activity 𝐵) in 𝑇 steps. In Example 1, one could use such a constraint to model known behaviors like that 𝐴2 is always preceded by 𝐴1 immediately; if also knowing that 𝐴1 must start with Blood sample taken and end with Blood pressure measurement, this allows for discarding interpretations like 𝐼1 . 3. Overview of the Solution Approach The core problem of computing valid (w.r.t. domain knowledge) interpretations of events in a trace Φ is modelled as a dispute via an ad hoc AAF 𝐹 (Φ) = ⟨ Arg, Att ⟩, on top of which interpretation queries and explanation requests can be answered. 𝐹 (Φ) is built incrementally by adding arguments of the following kinds (and attacks between them) as a new step of Φ arrives: 1) Interpretation arguments of the form ⟨𝑒𝑖 , 𝐴, 𝑋 , 𝑗⟩, where 𝐴 is an activity label, 𝑗 is a positive integer and 𝑋 ∈ { first, intermediate, last, first&last} indicates a life-cycle execution stage for 𝐴. Such an argument encodes the interpretation of 𝑒𝑖 as a step of type 𝑋 of the 𝑗-th execution instance of activity 𝐴. Interpretation 𝐼1 in Example 1 can be encoded via interpretation arguments ⟨𝐴1 , 𝑓 𝑖𝑟𝑠𝑡, 1⟩⟨𝐴1 , 𝑖𝑛𝑡𝑒𝑟𝑚𝑒𝑑𝑖𝑎𝑡𝑒, 1⟩⟨𝐴1 , 𝑙𝑎𝑠𝑡, 1⟩⟨𝐴2 , 𝑓 𝑖𝑟𝑠𝑡&𝑙𝑎𝑠𝑡, 1⟩. 2) Undermining arguments of five different forms: (𝑖) NotInterpreted 𝑖 , meaning that no inter- pretation has been chosen for step Φ[𝑖]; (𝑖𝑖) NotEnoughExecutions 𝛼 , where 𝛼 = ⟨𝑒𝑖 , 𝐴, 𝑋 , 𝑗⟩ with 𝑋 ∈ {first, first&last}, meaning that 𝛼’s interpretation of 𝑒𝑖 as the start of the 𝑗-th instance of 𝐴 contrasts with interpreting no previous step as the start of the (𝑗−1)-th instance of 𝐴; (𝑖𝑖𝑖) NotStarted 𝛼 , where 𝛼 = ⟨𝑒𝑖 , 𝐴, 𝑋 , 𝑗⟩ with 𝑋 ∈ {intermediate, last}, meaning that 𝛼’s interpretation of 𝑒𝑖 as a non-initial step of the 𝑗-th instance of 𝐴 contrasts with interpreting no previous step as the start of the 𝑗-th instance of 𝐴; (𝑖𝑣) NotEnded(𝑗, 𝐴), where 𝑗 is an instance counter and 𝐴 an activity, meaning that the trace has terminated and some event was interpreted as the initial step of the 𝑗-th instance of 𝐴, but no event has been interpreted as the last step of this instance; (𝑣) 𝑀𝐶𝑞𝑖 , descending from some must-constraint 𝑀𝐶𝑞 ∶ 𝐴 ⇒𝑇 𝐵1 | … |𝐵𝑘 , and meaning that the 𝑖-th event of Φ cannot be interpreted as the final step of an instance of 𝐴, for no subsequent event (in 𝑇 steps) has been interpreted as the start of an instance of some 𝐵𝑖 . Fig. 1 sketches an excerpt of the AAF resulting from processing a partial trace Φ = 𝑒1 , … , 𝑒6 . Clearly, the AAF includes several conflicting event interpretations, encoded by interpretation arguments with mutual attacks, e.g.: 𝛼 ′ and 𝛼 ″ (representing alternative interpretations of the same event), 𝛼 ′ and 𝛽 ″ (interpreting distinct events as first steps of the first instance of 𝐴), and NotInterpreted3 NotInterpreted4 NotInterpreted5 NotInterpreted6 ... ... ''=< e3, B, last, 1> ''=< e4, A, first, 1> ''= ... '''=< e6, A, intermediate, 1> '=< e5, A, last, 1> ''=... '=< e3, A, first, 1> '=< e4, A, first, 2> '=< e6, C, first, 1> NotEnoughExecutions ' NotStarted ' MC15 Figure 1: A portion of 𝐹 (Φ) (edges with two arrows represent pairs of mutual attacks) 𝛾 ′ and 𝛿 ‴ (where 𝛿 ‴ interprets 𝑒6 as a step of an instance interpreted as closed by 𝛾). As an example of how the AAF ensures activity counters to be consistent, consider NotE- noughExecutions 𝛽 ′ , which attacks 𝛽 ′ and it is counter-attacked by 𝛼 ′ . Importantly, the AAF enforces the choice of an interpretation for each trace step. In particu- lar, it contains exactly one argument NotInterpreted 𝑥 for each step 𝑒𝑥 , that is attacked by the interpretation arguments over the same step 𝑒𝑥 and that attacks the interpretation arguments over the immediately preceding and following steps 𝑒𝑥−1 and 𝑒𝑥+1 . The AAF also ensures that activity instances can be prolonged only if previously started. For example, in Fig. 1, 𝛾 ′ interprets 𝑒5 as a non initial step of the first instance of 𝐴 and is attacked by NotStarted ′𝛾 , and is defended from this attack by 𝛼 ′ and 𝛽 ″ , that interpret some previous steps as the starting steps of the same instance of 𝐴. All the constraints in the given declarative process model are enforced as well. For instance, in Fig. 1, the must-constraint 𝑀𝐶1 ∶ 𝐴 ⇒1 𝐶 (triggered by argument 𝛾 ′ over step 5) is encoded by the undermining argument 𝑀𝐶15 and the attacks (𝑀𝐶15 , 𝛾 ′ ) and (𝛿 ′ , 𝑀𝐶15 ). Further attacks are included to guarantee that ∅ is the only admissible extension of 𝐹 (Φ) iff there is no valid interpretation of Φ (i.e. trace behavior deviates from the process model). This is the case of self-attacks on all NotInterpreted 𝑖 and the special attack from NotInterpreted 1 towards each argument of type NotEnoughExecutions, NotStarted, NotEnded. It was proven in [1] that there is a biunivocal correspondence between the set of valid interpretations of Φ and the set of preferred extensions of 𝐹 (Φ), so one can reason on the interpretations of the current event 𝑒curr by deciding the acceptance of interpretation arguments over 𝑒curr . Details on the computation of the AAF 𝐹 (Φ) and of answers to interpretation queries and associated explanation requests can be found in [1]. References [1] B. Fazzinga, S. Flesca, F. Furfaro, L. Pontieri, Process mining meets argumentation: Ex- plainable interpretations of low-level event logs via abstract argumentation, Information Systems 107 (2022). doi:h t t p s : / / d o i . o r g / 1 0 . 1 0 1 6 / j . i s . 2 0 2 2 . 1 0 1 9 8 7 . [2] S. J. van Zelst, F. Mannhardt, M. de Leoni, A. Koschmider, Event abstraction in process mining: literature review and taxonomy, Granular Computing 6 (2021) 719–736. [3] P. M. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games, Art. Int. 77 (1995) 321–358.