
10.1561/2000000009

Why are Italian trials taking so long? A process mining approach

Barbara Pernici

Alessandro Campi

Marco Dilettis

Paolo Gerosa

Politecnico di Milano - DEIB, piazza Leonardo da Vinci, 32, Milano, 20133, Italy

2022


The duration of processes is a critical aspect in the Italian judicial system. In this paper we introduce an original approach based on process mining and machine learning techniques to analyse temporal aspects of processes represented as a Finite State Machine. We analyse both the variants of trials execution in terms of possible sequences of states and their duration and the impact of single events on their completion time. A case study based on civil cases registries of the Court of Appeal of Milan is discussed.

Judicial Systems, Process Mining, Trials, Temporal analysis

1. Introduction

Judicial systems are responsible for supporting the functioning of the economy by ensuring the protection of property rights and the enforcement of contracts. Empirical studies show that the inefficiency of justice, due to the length of proceedings and the lack of "legal certainty", depresses the economy and contributes to creating a climate of uncertainty and distrust which negatively affects the entrepreneurial and innovative capacity of a country.

More specifically, inefficient civil justice has a negative impact on the cost structure of firms, on the allocation and cost of credit, on the birth rate of firms, their ability to enter markets and competitiveness, on the size of production units, on domestic investments and on the ability to attract foreign investments [1]. It is estimated that delays and inefficiencies in justice generate a loss of over 16 billion euros, equal to 1 per cent of GDP, consequently slowing growth [2].

Understanding the causes of these delays is crucial for improving the efficiency and efficacy of civil justice systems. In this work, we aim to investigate the reasons behind the slow functioning of civil justice by analysing data extracted from the log of the Finite State Machine (FSM) that records all events related to a civil trial.

The FSM is a valuable resource, as it provides detailed insights into the various stages of a civil process, allowing us to track the progression of cases and identify bottlenecks or inefficiencies. By leveraging a comprehensive dataset from the Court of Appeal of Milan, we examine the key factors contributing to the slow pace of civil justice. The aim of the research is to contribute to the ongoing efforts to increase the efficiency of civil justice, in particular by reducing the duration of trials.

In Section 2 we illustrate the state of the art. Section 3 discusses the considered scenario. The proposed analysis methods and initial results are illustrated in Section 4.

2. Related work

Judicial systems in recent years have been increasingly supported by information systems that allow storing registries of events occurring during processes and linking them to the official documents and acts related to them.

The number of trials being recorded in the Italian Civil digital judicial information system (SICID) has increased, reaching 100% of cases in the Court and Court of Appeal of several cities including Milan. The increased use of information systems has allowed analysing in increasing depth some Key Performance Indicators, such as the Disposition Time (DT) and the Clearance Rate (CR) of cases. Several efforts are being conducted at the European level in the direction of monitoring the performance not only of terminated cases, but also of ongoing trials [3].

Recent data report that Italian courts, while reducing backlogs, still have much longer trial durations than courts in other European countries1.

As analysed by some authors [4, 5], it is important not only to analyse the global behaviour of processes, but also their critical situations, and the application of process mining techniques is advocated. Process mining [6] allows analysing a process through the steps of data extraction, data preparation, process discovery, conformance and compliance checking, and performance analysis. Recently, AI-augmented process mining is being investigated. In [7] waiting times are investigated, classifying different types of causes of delays such as batching or resource contention, and proposing methods to identify them on the basis of mining activity transitions. In the case of trials, other typical events that would also be necessary to investigate are blocking events and events linked to waiting times inherent to the normal execution of a process (e.g., waiting for the date fixed for the hearing).

1 https://www.coe.int/en/web/cepej/special-file-report-european-judicial-systems-cepej-evaluation-report-2022-evaluation-cycle-2020

Process mining has been applied to judicial systems in Brazilian courts [4] to derive process maps which are used to identify slow transitions and activity bottlenecks and to analyse the times of processes on the basis of different analysis dimensions, e.g., comparing paper-based and digital processes. In [5], process mining based on causality graphs is performed considering outlier cases, which allows identifying the main events that may delay the processes. However, this type of analysis does not allow identifying causes of delays that are not linked to pairs of events, and further research is needed to analyse the impact of events in general. Other recent directions in applying AI techniques to trials are presented in [8], where different deep learning techniques are applied to predict the duration of a phase. However, in these analyses, the sequences of events are not considered and the focus is on a single phase. More general methods are needed, taking into consideration the different possible variants of processes and sequences of events.

3. Scenario

3.1. Italian digital civil judicial system

The Italian information system for civil cases (SICID) is based on an FSM and supports the activities of the Chancellor and the decisions of judges by recording events, as exemplified in Fig. 1, and linking them to the relevant documents and acts. For each event, the date of the event is stored, together with its recording date, the type of event, and the states before and after the event. Privacy is of utmost importance when dealing with individuals involved in civil cases; as such, the data is provided in an anonymized form. This information is the basis for the analyses described in the rest of the paper, which focuses on temporal analyses of trial durations.

The analyses illustrated in this paper are based on more than 15,000 defined civil cases in the last five years within the Court of Appeal of Milan. The processes under consideration pertain to Ordinary second degree procedures (4O rite), for four sections, considering Litigations (Contenzioso).

Figure 1: Sample registry

3.2. Data preparation

The data pertaining to the progression of a civil process is stored within an Oracle database. Before starting the analysis, all data must be imported into our database. The raw data may include inconsistencies, errors, or missing values. To clean the data, we will identify "impossible values", such as events with future dates or those too far in the past (out of the timeframe covered by the given logged data), and incomplete data, such as civil processes missing records of essential steps. Data identified during this process, which constitutes less than one per cent of the total, will be removed and disregarded. We will also eliminate data related to civil procedures that were opened and closed on the same day, as these do not represent complete civil processes and could distort the computed durations.

Data analysis will involve clustering the data based on the tribunal section number and the subject matter of the process. This is because different sections and matters necessitate distinct procedures, and as such, varying events and timeframes are expected.

The final stage of data preparation involves calculating additional variables, such as the duration between events, the total duration of a case, and the duration of different case phases.

4. Temporal analyses

In the following, we present different types of analysis. First, we concentrate on analysing the different phases of a process, analysing their states and their duration.

We then focus on a finer perspective of the analysis, analysing events within a single state in the direction of identifying the events which have a larger impact on its duration.
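The cleaning and derived-variable steps of the data preparation stage (Section 3.2) can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout, the event names, the log timeframe and the choice to discard a whole case when one of its dates is out of range are all assumptions made for illustration, and the data is kept in memory rather than read from a database.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (case_id, section, matter, event_type, event_date).
EVENTS = [
    ("c1", 1, "contracts", "registration", date(2019, 1, 10)),
    ("c1", 1, "contracts", "first_hearing", date(2019, 6, 12)),
    ("c1", 1, "contracts", "judgement", date(2020, 3, 2)),
    ("c2", 2, "property", "registration", date(2019, 5, 6)),
    ("c2", 2, "property", "judgement", date(2019, 5, 6)),   # opened and closed on the same day
    ("c3", 1, "contracts", "registration", date(2018, 2, 1)),
    ("c3", 1, "contracts", "judgement", date(2035, 1, 1)),  # "impossible" future date
]

# Assumed timeframe covered by the log.
LOG_START, LOG_END = date(2017, 1, 1), date(2022, 12, 31)


def prepare(events, log_start=LOG_START, log_end=LOG_END):
    """Return {case_id: summary} for the cases that survive cleaning."""
    by_case = defaultdict(list)
    for case_id, section, matter, event_type, when in events:
        by_case[case_id].append((when, event_type, section, matter))

    prepared = {}
    for case_id, rows in by_case.items():
        rows.sort(key=lambda row: row[0])
        dates = [row[0] for row in rows]
        # Discard cases with dates outside the logged timeframe (assumption: whole case).
        if any(d < log_start or d > log_end for d in dates):
            continue
        # Discard cases opened and closed on the same day.
        if dates[0] == dates[-1]:
            continue
        prepared[case_id] = {
            "cluster": (rows[0][2], rows[0][3]),  # (section, matter) used for grouping
            "gaps_days": [(b - a).days for a, b in zip(dates, dates[1:])],
            "total_days": (dates[-1] - dates[0]).days,
        }
    return prepared


cases = prepare(EVENTS)
```

In this toy log only case `c1` survives; its `gaps_days` entries are the durations between consecutive events and `total_days` is the overall duration of the case, while `cluster` carries the section and matter used to group comparable cases.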

4.1. Variant analysis with process mining

The analysis of variants of a legal process using process mining tools, such as Apromore2, is a fundamental practice to understand and improve the efficiency and effectiveness of legal processes. Process mining is a data-driven technique that allows examining, analysing, and optimising processes using information extracted from event logs.

2 https://apromore.com/

Variants of a judicial process refer to the different paths that a case can follow during its evolution, which includes steps such as assignment to a section, appointment of the judge, waiting for hearings or documents, and decision phases. Analysing these variants can help identify recurring patterns, inefficiencies, and areas for improvement in the justice system.

Using Apromore, an effective open-source process mining platform, one can analyse in depth the variants of a legal process and, based on the information gained from variant and performance analysis, propose changes to the execution of juridical processes to improve efficiency, reduce waiting times and increase user satisfaction.

As a first type of analysis, we focus on states and the transitions between states, ignoring internal events within states. By examining the sequences of states rather than individual events, the complexity arising from the multitude of events and their potential combinations is avoided. In this way, a clearer and more comprehensible representation of the process is achieved and the main variants can be identified.

From the analysis of the variants, three main paths emerge that a case can follow, covering 66% of the cases. The first variant, which accounts for about 48% of cases, corresponds to the base variant of the ordinary second degree process, which includes all the typical stages of such a process, from its initial registration to the publication of the judgement. The second and third variants, with frequencies around 10% and 8% respectively, represent shortened paths, having respectively one and two states fewer than the main variant. These shortened routes may be the result of simplified legal procedures, agreements between the parties or other special circumstances. The rest of the variants, although less frequent, represent more particular cases that can offer useful information on specific situations or exceptions to the standard. It is also possible to compare processes which have a similar sequence of events, in terms of their execution time, as well as to analyse their individual phases, and identify both general critical issues and the most critical phases.

4.2. Analysis of the impact of events

After analysing each state of the process, we can proceed to investigate the most critical states in more detail to understand which events have a greater influence on the total duration of a state.

Measuring the impact of a single event on the duration of a state is challenging. Since the interdependencies between different events are not known in advance, the data cannot be treated as a time series because the duration of an event cannot simply be calculated as the date of the following event minus the date of the event itself.

To overcome these challenges, we use machine learning models, following [9], to create a regression model that predicts the time of a state based on the presence or absence of specific events3.

3 All the models have been developed via Python 3.11.

The final aim of this model is to interpret the covariates in order to understand which events have a significant impact on the state.

To develop the regression models, we tested a range of machine learning techniques. The dataset includes the events that occur in a state and the corresponding total duration of the state. The covariates are represented by the presence or absence of the events, while the outcome variable is the duration of the state. We performed analyses with decision trees, random forests, and gradient boosting algorithms to train different models, and we use k-fold cross-validation to evaluate the performance of each model. Once a model predicts the outcome accurately, we then interpret the covariates to understand which events are more important in terms of total duration of the state.

Notice that simple feature interpretation directly from the predictions of the model is not enough to clarify how the covariates affect the outcome. Indeed, even if we examine a specific scenario where the number of legal processes is low, the number of covariates (the events in the state) is still high; hence it is too complex to qualitatively understand the interaction between the events. Spotting the dependencies between the events is fundamental to evaluate which feature has a significant impact on the outcome of the prediction. For this reason, we use Shapley values and permutation importance methods to interpret the covariates in the model [10]. Shapley values are a game-theoretic approach to assigning importance scores to the features in a model. They help us understand how each event contributes to the duration of the state. By calculating the Shapley values of each event, we can identify the events that have the most significant impact on the total time of the state.

Permutation importance, on the other hand, measures the importance of each feature by randomly permuting the values of that feature and calculating the resulting decrease in the performance of the model. By comparing the decrease in performance of the model when each feature is permuted, we can determine the features that have the most significant impact on the outcome.

For example, Fig. 2 shows preliminary results for a specific case; the events are sorted by importance and the coloured dots of the image are associated with each prediction of the legal processes. Blue dots represent the processes where the event is absent; violet and red dots (darker dots) represent the processes where the event is present once and more than once, respectively. The dots on the right side of the x-axis represent the predictions where the event positively influences the outcome; vice versa for the left side of the x-axis.

Figure 2: Graphical representation of the Shapley values of a Random Forest model for the state "UT" (Waiting for the first hearing)

We can observe that in general red dots are on the right side of the x-axis, meaning that the presence of an event positively influences the total duration of the state; this was also expected a priori. Moreover, we can examine which events are more important and take into consideration only the flexible and optimisable ones. Interesting events in the state of Waiting for the first hearing are "XV" and "MI". The first one represents the event of a lawyer who asks for permission to analyse the file of the process; the second one describes a postponement of the court hearing requested by one of the parties. Both are important in terms of Shapley values and both are events that can be considered for optimisation.

As a result, using machine learning models and techniques such as Shapley values and permutation importance, we gain a deeper understanding of how the events contribute to the duration of trials. By creating a regression model, we can predict the time of a state based on the presence or absence of specific events.

4.3. Predictive techniques for process duration

In addition to analysing single states, our research work focuses on creating prediction models for the duration of a process, either while it is ongoing or even before it begins, based on historical data and information extracted from variant analysis. These models can be used to predict both the states that will be traversed during the process and the overall duration of the case. For this purpose, we can use timeline models, such as Markov chains [11] or RNN models (Recurrent Neural Networks) [12]. These models can learn the transition probabilities between states and predict the most likely sequence of future states, considering the process variants observed in historical data. To predict the overall duration of a process, it is possible to use regression models, such as linear regression, tree models such as Random Forest [13] and XGBoost [14], or deep learning models such as artificial neural networks [15]. These models can be trained on numerical and categorical variables, such as process type, process variant, start year and month, and other relevant information, to estimate the duration of the process based on the specific characteristics of the case. For prediction of duration of cases, we focus on Markov Chain Models, which have the advantage of being more easily interpretable than other approaches. Prediction of durations of states is discussed in Sect. 4.3.2.

4.3.1. The Markov Chain Model for juridical process analysis and prediction

The Markov chain is a statistical model used to analyse the transition probability of a system of discrete states over time. The Markov model is based on a transition matrix, which describes the probability of transition from one state to another. For example, suppose we have a judicial process with three discrete states: the state of filing of a case, the state of the first hearing, and the state of the judgement.

In this case, for instance, the entries of the transition matrix may indicate that the probability of moving from the state of filing a case to the state of the first hearing is 60%, while the probability of moving from the state of the first hearing to the judgement is 20%.

In general, the implementation of a Markov model for predicting the path of a legal process involves several stages. The first step is to define the discrete states of the system. In the case analysed, the system is characterised by a total of 48 possible states. Next, data related to past judicial processes needs to be collected to estimate the probabilities of transition between states. Finally, once the transition matrix has been defined, it can be used to predict the future path of the judicial process.
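The stages listed for the simple Markov model (define the states, estimate transition probabilities from past cases, use the matrix to predict a path) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the state names and the toy sequences are hypothetical, and the greedy step-by-step prediction is one simple way to read a path off the matrix, not necessarily the procedure used in the case study.

```python
from collections import Counter, defaultdict

# Hypothetical state sequences, one per closed case (state names are illustrative).
SEQUENCES = [
    ["filing", "first_hearing", "judgement"],
    ["filing", "first_hearing", "first_hearing", "judgement"],
    ["filing", "judgement"],
    ["filing", "first_hearing", "judgement"],
]


def estimate_transitions(sequences):
    """Estimate P(next state | current state) as relative transition frequencies."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


def greedy_path(transitions, start, max_steps=50):
    """Follow the most probable outgoing transition at each step (greedy choice)."""
    path = [start]
    while path[-1] in transitions and len(path) <= max_steps:
        options = transitions[path[-1]]
        nxt = max(options, key=options.get)
        if nxt == path[-1]:  # most probable move is a self-loop: stop here
            break
        path.append(nxt)
    return path


P = estimate_transitions(SEQUENCES)
predicted = greedy_path(P, "filing")
```

Each row of `P` sums to one, so it plays the role of one row of the transition matrix; states with no observed outgoing transition (here `judgement`) act as end states. The greedy walk does not in general return the globally most probable complete path, which would require scoring whole sequences.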

The use of a Hidden Markov Model (HMM) [16] can be an interesting alternative to the simple Markov model for predicting the path of a judicial process.

In an HMM, an additional element is introduced with respect to the Markov model, namely observation. In other words, each state can be associated with an observation that depends on the characteristics of the judicial process in that state. For example, for the judicial process, the observation could be the duration of the trial up to that point. The main advantage of using an HMM over the simple Markov model is that, thanks to observations, it is also possible to capture the uncertainty associated with transitions between states. In other words, the probability of transition from one state to another also depends on the observation associated with the current state, which makes the model more accurate.
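To make the role of observations concrete, the following minimal sketch implements the standard forward (filtering) recursion of an HMM in plain Python. It is an assumption-laden illustration rather than the hmmlearn-based model of the case study: the two states, the discretisation of elapsed duration into "short" and "long", and all probabilities are hypothetical.

```python
# Hypothetical two-state model; every probability below is made up for illustration.
START = {"waiting_hearing": 0.9, "decision": 0.1}
TRANS = {
    "waiting_hearing": {"waiting_hearing": 0.7, "decision": 0.3},
    "decision": {"decision": 1.0},
}
# Emission probabilities: P(observed duration bucket | state).
EMIT = {
    "waiting_hearing": {"short": 0.8, "long": 0.2},
    "decision": {"short": 0.3, "long": 0.7},
}


def forward_filter(start, trans, emit, observations):
    """Return P(current state | all observations so far) via the forward recursion."""
    states = list(start)
    # Initialisation: prior probability times likelihood of the first observation.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Propagate through the transition model, then weight by the emission likelihood.
        alpha = {
            j: emit[j][obs] * sum(alpha[i] * trans[i].get(j, 0.0) for i in states)
            for j in states
        }
    total = sum(alpha.values())
    return {s: a / total for s, a in alpha.items()}


posterior = forward_filter(START, TRANS, EMIT, ["short", "long", "long"])
most_probable_state = max(posterior, key=posterior.get)
```

With these toy numbers, two consecutive "long" observations shift most of the probability mass to the `decision` state, which shows how observed durations can sharpen the estimate of where a case currently stands.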

In Fig. 3, a graphical representation of the transition matrix in the case study is provided. The nodes of the graph represent the 48 states identified in the logs. The colour intensity of each node depends on the number of times the state is traversed, while the opacity of the arcs depends on the probability of moving from the starting node to the destination node.

4.3.2. Approaches to predicting duration of legal process states

With regard to predicting the duration of individual process states and the overall duration of the process, there are several approaches that can be used independently or in combination with the Markov or HMM model. A common approach is to use historical data to estimate the probability distribution of the duration of each process state, for example by using statistical analysis techniques such as regression or survival analysis. These probability distributions can then be used within the Markov or HMM model to estimate the overall duration of the process. Another approach is to use machine learning techniques to predict the duration of individual process states. One of the main features of this approach is its flexibility. Machine learning models can be tailored to the specific needs of the judicial process and can capture non-linear relationships between variables that influence its duration. In addition, machine learning models can also be used when data is missing or incomplete.

All implemented models are developed using the Python programming language. The Scikit-learn machine learning library is utilised for the development of all deep learning models, the simple Markov chain models are developed from scratch, and the HMM model is developed using the hmmlearn library. A first implementation has been carried out using the XGBoost model.

The dataset used consists of more than 15,000 defined processes and has been divided randomly into a training set and a validation set according to an 80%/20% ratio. Input variables include the year and month of registration of the trial, the section and judge to whom it was assigned, the role, juridical matter and juridical object.

More variables are available and will be added in the future. The current model generates predictions with a mean absolute error of 3.85 months. The analysis of the residuals shows some heteroskedasticity, probably due to the omission of independent variables, scalar effects and nonlinear relationships not properly considered, and it will be addressed in future developments.

4.3.3. Benefits of predictive and simulation models in juridical process management

Predictive and simulation models of legal processes can have a significant impact on the organisation of a court, contributing to the efficient management of resources and achieving the government’s goals of reducing timeframes.

Firstly, an accurate predictive model of the duration of legal processes and the states that will be encountered can help distribute the workload among different sections and judges. By knowing the expected duration and complexity of various cases, court officials can assign cases equitably, avoiding overloading and ensuring that each judge has a manageable workload.

Secondly, predictive models can help identify bottlenecks and inefficiencies in the judicial system. By analysing process variations and their duration, court officials can identify the process stages that require more time and resources, and implement strategies to improve efficiency in these critical areas. Additionally, process simulation models can be used to evaluate the impact of possible procedural or regulatory changes. By using