Content Analysis of Court Decisions: A GPT-4 Based Sentence-by-Sentence Data Generation and Association Rules Mining

Olha Kovalchuk

o.kovalchuk@wunu.edu.ua 3

Ruslan Shevchuk

rshevchuk@ubb.edu.pl 2 3

Mariia Masonkova

Anhelina Banakh

a.banakh@st.wunu.edu.ua 0 3

0 Erasmus Universiteit Rotterdam, 50 Burgemeester Oudlaan, Rotterdam, 3062 PA, Netherlands
1 Kherson State Maritime Academy, 20 Ushakova Avenue, 73009, Kherson, Ukraine
2 University of Bielsko-Biala, 2 Willowa, Bielsko-Biala, 43-309, Poland
3 West Ukrainian National University, 11 Lvivska str., Ternopil, 46009, Ukraine


The key aspect of the content analysis of court decisions is the identification of interesting relationships between the circumstances of the case and the outcome. This article proposes an innovative approach that combines the use of a GPT-4 language model from the OpenAI API to generate the necessary facts from unstructured text documents of court decisions and association rules mining to identify patterns in the sets of criteria considered by the court when sentencing in similar cases. The analysis is based on a collection of 10,000 texts of sentences in criminal cases from the Unified Register of Court Decisions of Ukraine. Frequent item sets (support ≥ 0.982) and strong association rules (confidence = 0.987) were identified. It was found that persons sentenced to imprisonment, in most cases, committed crimes in complicity and/or had previous convictions and/or committed repeated crimes. It was revealed that offenders regarding whom the court made soft decisions in the form of conditional convictions or early releases have a higher risk of committing recidivist crimes, in particular in complicity, and pose a higher danger to society. The obtained results can improve the understanding of the main factors associated with court sentencing decisions regarding imprisonment and provide reliable information support for legal decision-making.

Keywords: court decisions, information support, artificial intelligence, machine learning, GPT-4, association rule, natural language generation

1. Introduction

Judicial systems accumulate massive amounts of new information every year. Most legal information consists of collections of unstructured text documents, which complicates analysis and causes data redundancy. To automate routine court activities (for example, drafting standard legal documents) and to increase the efficiency of the judicial system and the quality of court decisions, courts are increasingly using AI (artificial intelligence)-based applications, data mining methods, and machine learning algorithms. Leading countries are implementing cutting-edge ICT tools to automate the review of applications and case management before and during trial, to support analytics and the tracking of trends in legal proceedings, to identify different decisions made in similar cases, to speed up the processing of large numbers of cases, to eliminate conflicts and gaps in legislation, and to strengthen the protection of citizens' rights, freedoms, and interests as well as the unity and consistency of judicial practice. AI-based systems can be used to analyze large collections of legal documents (claims, court decisions, regulations, sentences, rulings, additional decisions, legislation, etc.), which can significantly simplify and accelerate the search for relevant information. Judicial precedents and legal texts can serve as the basis for training AI models. This will provide a foundation for reliable legal conclusions and for predicting case outcomes. The results obtained can provide substantial information support for judges' decision-making. Natural language processing models can be used to analyze court decisions, rulings, verdicts, and transcripts to identify key facts and arguments relevant to a particular case. AI-based chatbots are effective for basic legal counseling and for providing legal information support to citizens.

Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).

The judicial system must operate according to the rule of law, which implies a high degree of consistency between court decisions made in similar cases. When making decisions, courts must analyze precedents in a particular field of law and anticipate likely outcomes in analogous cases. Establishing the degree of connection between the circumstances of a case and the court decision has been a complex, non-trivial task in recent decades, and its solution could make a significant contribution to optimizing judicial policy. The problem is complicated by the existence of large collections of unstructured documents that record different forms of decisions (verdict, decree, ruling, court order) made in different forms of legal proceedings (administrative, commercial, criminal, civil, etc.). The peculiarities of national legislation and the structure of judicial systems in different countries also matter. Progressive countries are transforming their judicial systems, and a key element of this transformation is the digitalization of courts. Ukraine has joined this initiative and is implementing innovative information and communication technologies (ICT) to automate the activities of courts. Within the framework of the Unified Judicial Information and Telecommunication System of Ukraine, the Unified State Register of Court Decisions (URCD) operates [1]. This is a unified automated system designed to accumulate, store, account for, search, protect, and provide court decisions in electronic form. The URCD makes it possible to track documents in a particular case and to search for judicial practice. It is the largest database in Ukraine, containing over 115,000,000 court decisions and supporting documents, and the number is constantly growing.

The Register is currently operating in test mode and has several functional limitations. In particular, document files cannot be extracted or exported. Access is provided only to the content of unstructured texts, in which several attributes (such as the qualification of the criminal proceedings, the characteristics of the accused person, and the recurrence of crime) may be described only implicitly, although they are mandatory criteria that the court takes into account when passing sentences. When determining the severity of the crime committed by the accused, the court takes into account the qualification of the criminal offense (misdemeanor, minor crime, serious crime, especially serious crime) and the characteristics and circumstances of its commission. When deciding on the punishment, the court considers the nature and severity of the crime, the personality characteristics of the accused, the mitigating and aggravating circumstances, the sanctions of the relevant articles of the Criminal Code of Ukraine [2], and the risk of a repeated criminal offense and the risk of danger to society according to the pre-trial report. It then chooses a punishment that is necessary and sufficient to correct the accused and prevent new criminal offenses.

Judges' assistants perform content analysis of court decisions manually. Probation officers spend a lot of time assessing the risk of recidivism and the risk of danger to society posed by the accused. With the assistance of USAID, a pilot project for the Supreme Court was developed that uses a GPT chatbot to recognize the texts of court decisions and compare them with relevant case law from the URCD text database. However, primary-level courts still require innovative approaches to automate the search for and analysis of relevant information in the texts of court decisions. This work aims to apply the sentence-by-sentence natural language generation technique of the GPT-4 model, accessed through the OpenAI API, to generate the necessary knowledge from unstructured text documents in legal proceedings, and to develop an association rules model that identifies interesting patterns and relationships in the set of criteria that are important for passing sentences in similar cases.

2. Related works

The application of automated text processing technologies in the field of justice has recently attracted the attention of both researchers and lawyers. The volumes of accumulated data are constantly increasing, and their analysis and compilation require new technologies to substantiate and predict court decisions. AI, ML, and data mining models are among the innovative solutions that can provide relevant tools for assessing consistency between the circumstances of the case and the court decisions made [3, 4, 5]. I. Chalkidis et al. used neural models to automatically predict the outcome of a court case based on documents describing the facts of the case. The authors analyzed the decisions of the European Court of Human Rights [6]. D. Alghazzawi et al. applied a long short-term memory network for effective prediction of court decisions based on historical datasets of court cases [7]. R.A. Shaikh et al. proposed a model for predicting the outcomes of murder cases in the Delhi District Court. Machine learning classification algorithms were used to predict the "acquittal" or "conviction" of the accused based on an analysis of the legal factors important for decisions in murder cases [8]. M. Medvedeva et al. explored the possibility of using natural language processing tools to analyze court trial texts in order to automatically predict future court decisions. The researchers used a collection of texts of decisions made by the European Court of Human Rights [9]. L. Ma et al. considered the prediction of legal decisions an important task for legal AI. The authors used a complex real courtroom dataset (plaintiff claims and court debate data) to predict court decisions through multi-task learning. The facts of the case are automatically recognized from the court debate dialogues beforehand. The proposed ML model can more accurately characterize the relationships between claims, facts, and debates [10]. C. Rocha and J. A. Carvalho studied the application of AI for informational support of judges' work and the main threats that the use of this technology in legal proceedings poses to the values of justice. The authors identified the following possible areas of application of artificial intelligence in automating the activities of courts: systems for predicting the risk posed by the accused, assisted document generation systems, similar-case push systems, speech-to-text applications, litigation risk assessment systems, emotion recognition systems, question-answering robots, and filtering systems [11]. K. Terzidou studied the possibilities and risks of using AI technologies for European court staff, law enforcement officers, and other participants in legal proceedings [12]. G. Rodríguez and J. David analyzed the advantages and disadvantages of using large language models (LLMs) for making judicial decisions. The researchers argued that using LLMs to draft judgment texts or make decisions during trials is problematic for judges and their clerks, and that existing LLMs are not reliable sources of information [13]. D.N. Yagamurthy et al. applied AI-based natural language generation to transform structured data into human-understandable narratives [14]. The issue of predicting and justifying court decisions based on the analysis of text documents relevant to legal proceedings is complex and requires new solutions. It is also necessary to take into account regional differences in the criteria considered by the court when passing a sentence in a criminal case. In addition, it has been shown experimentally that ML models are more accurate when they are trained on different datasets and constantly updated [15]. The application of ontological approaches to knowledge is shown in [26]. Previous studies using ML models were conducted on small experimental document collections and yielded several unexpected results. It is therefore relevant to search for new effective approaches to the analysis of texts of court decisions and relevant documents in proceedings, and to develop models for assessing consistency between the circumstances and facts of the case and the court decisions made.

3. Methodology

The study applied a comprehensive methodological approach that combined literature analysis, critical analysis, comparison, case study, and the proposal of up-to-date IT solutions to improve the content analysis of court decisions. A systematic review of the scientific literature allowed for an in-depth study of the application of automated text document processing technologies in judicial proceedings. The comparative analysis made it possible to evaluate the results of previous research on the use of innovative IT solutions for the analysis of court decision texts. The case study method was applied to analyze the content of specific legal documents. Synthesis methods, association rule modeling, and the experimental method were used to develop an innovative approach to extracting entities, facts, and circumstances in criminal proceedings and identifying information relevant to judicial decision-making. The generalization method allowed for consolidating the obtained results, formulating conclusions and recommendations, and determining further directions for improving the proposed approach. This combination of scientific methods provided a systematic and thorough basis for developing a solution based on artificial intelligence and association rules to improve the quality and efficiency of content analysis of court decisions.

3.1. Proposed Approach

Our research proposes an innovative approach to analyzing a large collection of texts of court decisions entered into the URCD: natural language generation (NLG) with GPT-4 is combined with association rule mining to identify non-obvious patterns among the criteria taken into account by the court when passing a sentence in a criminal case.

Our data set is generated by natural language generation using GPT-4. The flow chart for the proposed methodology is presented in Figure 1. First, the data is preprocessed, and then strong association rules are produced from the dataset.

3.2. Association Rule Mining

Association rules represent a fundamental concept in data mining [16, 17], focusing on uncovering patterns within data. Associations emerge when multiple events exhibit connections, unveiling hidden relationships within seemingly disparate datasets. These relationships are encapsulated in if-then rules, and rules that surpass a specified threshold are deemed significant. Such rules enable actions based on identified patterns and aid decision-making.

The task of association rule mining is formulated as follows. Let $I = \{i_1, i_2, \ldots, i_n\}$ denote a set of $n$ attributes (items), and let $T = \{t_1, t_2, \ldots, t_m\}$ denote a set of $m$ transactions (the database). Each transaction in $T$ (comprising multiple simultaneous events) is a subset of $I$. A rule is defined as:

$$X \Rightarrow Y, \quad \text{where } X, Y \subseteq I. \qquad (1)$$

Each rule comprises two distinct item sets: X (antecedent) and Y (consequent).

To identify interesting rules from the myriad of possibilities, restrictions are imposed based on various significance and interest metrics. Notably, the most renowned constraints include minimum thresholds of support and confidence.

Let $X$ denote an item set, $X \Rightarrow Y$ an association rule, and $T$ the set of transactions.

Support gauges how frequently an item set occurs in the database; for a rule, it is the proportion of transactions containing both the antecedent and the consequent. The support of $X$ relative to $T$ is the proportion of transactions $t$ in $T$ that contain $X$:

$$\mathrm{supp}(X) = \frac{|\{t \in T : X \subseteq t\}|}{|T|},$$

whereas confidence quantifies how often the rule holds, indicating its accuracy. It is the ratio of the number of transactions containing both the antecedent and the consequent to the number of transactions containing the antecedent. The confidence of the rule $X \Rightarrow Y$ relative to the set of transactions $T$ is:

$$\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}.$$

When support and confidence meet certain thresholds, it suggests a high probability that any forthcoming transaction featuring the antecedent will also entail the consequent.

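Lift, also known as interest or improvement, is the ratio of the consequent's frequency in transactions containing the antecedent to the consequent's overall frequency. The lift of a rule is determined by the formula:

$$\mathrm{lift}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)\,\mathrm{supp}(Y)} = \frac{\mathrm{conf}(X \Rightarrow Y)}{\mathrm{supp}(Y)}.$$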

This ratio compares the observed confidence with the expected confidence if X and Y were independent. A lift value greater than 1 indicates a direct relationship, equal to 1 denotes no relationship, and less than 1 signifies an inverse relationship. Lift serves to further refine the set of associations by establishing a significant threshold; associations below this threshold are disregarded.

Conviction measures the implication strength of a rule, defined as:

$$\mathrm{conv}(X \Rightarrow Y) = \frac{1 - \mathrm{supp}(Y)}{1 - \mathrm{conf}(X \Rightarrow Y)}.$$

Conviction can be interpreted as the ratio of the expected frequency of X occurring without Y (indicating incorrect predictions) if X and Y were independent, divided by the observed frequency of incorrect predictions.
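As a minimal sketch of the four measures defined above, the following Python functions compute support, confidence, lift, and conviction on a toy transaction list. The transactions and item names are hypothetical and do not come from the URCD data; exact fractions are used so that the values can be checked by hand.

```python
from fractions import Fraction

# Toy transactions (hypothetical): the set of items recorded for each case.
transactions = [
    {"complicity", "previous convictions", "imprisonment"},
    {"complicity", "imprisonment"},
    {"previous convictions", "imprisonment"},
    {"complicity", "previous convictions"},
    {"fine"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(x, y, transactions):
    """supp(X u Y) / supp(X); assumes X occurs in at least one transaction."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """Observed confidence relative to the consequent's overall support."""
    return confidence(x, y, transactions) / support(y, transactions)


def conviction(x, y, transactions):
    """(1 - supp(Y)) / (1 - conf(X => Y)); infinite when the rule never fails."""
    conf = confidence(x, y, transactions)
    if conf == 1:
        return float("inf")
    return (1 - support(y, transactions)) / (1 - conf)


x, y = {"complicity"}, {"imprisonment"}
print(confidence(x, y, transactions))  # 2/3
print(lift(x, y, transactions))        # 10/9
print(conviction(x, y, transactions))  # 6/5
```

In this toy example the lift of 10/9 is slightly above 1, so complicity and imprisonment co-occur a little more often than independence would predict.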

The algorithm for discovering association rules typically involves two distinct steps:
1. Applying a minimum support threshold to find all frequent item sets in the database.
2. Applying a minimum confidence constraint to these frequent item sets to form rules.
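The two steps can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not the FP-Growth implementation used later in the paper: it enumerates candidate item sets directly, which is only feasible for a handful of items. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: enumerate all item sets whose support is >= min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            supp = sum(1 for t in transactions if cset <= t) / n
            if supp >= min_support:
                frequent[cset] = supp
                found = True
        if not found:
            break  # no frequent set of this size, so none of a larger size
    return frequent


def association_rules(frequent, min_confidence):
    """Step 2: split each frequent item set into antecedent and consequent."""
    rules = []
    for itemset, supp in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                x = frozenset(antecedent)
                # Every subset of a frequent item set is itself frequent.
                conf = supp / frequent[x]
                if conf >= min_confidence:
                    rules.append((x, itemset - x, supp, conf))
    return rules


toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
freq = frequent_itemsets(toy, min_support=0.5)
rules = association_rules(freq, min_confidence=0.9)
for x, y, supp, conf in rules:
    print(sorted(x), "-->", sorted(y), f"(support: {supp}, confidence: {conf})")
```

With these thresholds all seven non-empty item sets are frequent, and three rules pass the confidence constraint: b --> a, c --> a, and b, c --> a.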

Association rule mining is a non-trivial task. The number of possible item sets grows exponentially with the number of items, which makes the discovery of frequent item sets computationally complex. However, like many data mining techniques, association rules can transform massive amounts of data into a small, comprehensible set of statistical patterns. The discovered rules reflect overall trends rather than individual preferences: they capture connections among sets of items within each transaction and thereby reveal valuable insights in large transactional datasets.

3.3. Data selection and description

To identify non-obvious, interesting patterns and relationships between the criteria taken into account by the court when passing sentences in similar cases, we analyzed 10,000 sentences of conviction in criminal proceedings entered in the Unified State Register of Court Decisions [1].

To obtain relevant information (attribute values) from the unstructured texts of the sentences, preprocessing was performed. Its purpose is to prepare the original text collection (sentences of conviction in criminal proceedings) for use as input to the association rule mining process.

We used GPT-4 to extract from the texts of the sentences the following criteria taken into account by the court when passing a sentence in a criminal case: qualification of the committed crime (offense, minor crime, felony crime, or particularly serious crime); presence of accomplices (the offense was committed alone or in complicity); criminal reoffending (first offense or repeated offense); previous convictions (no or yes); and type and term of punishment (term of imprisonment, fine, remedial works, etc.). GPT-4 is an OpenAI text generation model based on generative pre-trained transformers. As a large language model, GPT-4 generates text outputs in response to provided prompts or inputs [18]. This model was chosen because it handles Ukrainian-language texts well, whereas conventional text mining tools lack Ukrainian dictionaries.

In this article, we introduce an approach for selecting relevant information from the texts of sentences of conviction in criminal proceedings by natural language generation (NLG). This is a technique for generating natural responses word by word based on the previous context [19]. The process involves including the source text documents in the query itself. The stages of text creation by natural language generation are shown in Figure 2.

Figure 2 shows the process starting with the original input text, representing the initial data, and going through a prompting stage where the input text is used to create a prompt. The prompt then generates the output text, identifying relevant information to form associative rules.
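The prompting stage can be illustrated with a small Python sketch. The prompt wording, the JSON reply format, and the field names below are assumptions made for illustration; the source does not give the authors' actual prompt. The call that sends the prompt to the model is deliberately left out, so the sketch only shows how the source text is embedded in the query and how a reply could be turned into attribute values.

```python
import json

# Hypothetical attribute names; the paper lists the criteria but not a schema.
FIELDS = [
    "qualification",
    "complicity",
    "repeatability",
    "previous_convictions",
    "punishment",
]


def build_prompt(sentence_text: str) -> str:
    """Embed the source document in the query itself."""
    return (
        "Read the court sentence below and return a JSON object with the keys "
        + ", ".join(FIELDS)
        + ". Use null when the text does not state a value.\n\n"
        + "Sentence text:\n"
        + sentence_text
    )


def parse_reply(reply: str) -> dict:
    """Keep only the expected keys; keys missing from the reply become None."""
    data = json.loads(reply)
    return {field: data.get(field) for field in FIELDS}


prompt = build_prompt("(text of one court sentence)")
record = parse_reply('{"complicity": 1, "previous_convictions": 0, "note": "x"}')
print(record)
```

Restricting the parsed reply to a fixed list of keys keeps every record in the same shape, which simplifies the later conversion into transactions.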

3.4. RapidMiner Tool

To identify non-obvious significant patterns between a large number of diverse criteria taken into account by the court when passing a sentence in a criminal case, we applied the visual workflow designer RapidMiner Studio which includes tools for predictive analytics, data science, and machine learning [20]. Figure 3 shows the process operators that implement associative rule mining algorithms.

The constructed process includes the following operators [20]:
- Retrieve Data loads the initial example set into the process.
- Aggregate transforms the initial example set according to the selected aggregation function (concatenation).
- Rename renames the attribute to which the aggregation function has been applied.
- Set Role defines the attribute that will be used as a unique identifier for each record of the initial example set.
- FP-Growth identifies frequently occurring item sets in the initial data set.
- Create Association Rules generates a set of association rules.
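To make the data flow of the first operators concrete, here is a rough Python analogue of the aggregation step. It assumes, hypothetically, that the example set is stored in long format with one row per case and item; the source does not describe the table layout, and this is not RapidMiner code.

```python
from collections import defaultdict

# Hypothetical long-format example set: one row per (case id, item).
rows = [
    (1, "complicity"),
    (1, "previous convictions"),
    (1, "term of punishment"),
    (2, "repeatability"),
    (2, "term of punishment"),
]


def aggregate_concat(rows, sep="|"):
    """Group rows by case id and concatenate their items into one string."""
    grouped = defaultdict(list)
    for case_id, item in rows:
        grouped[case_id].append(item)
    return {case_id: sep.join(items) for case_id, items in grouped.items()}


def to_transactions(aggregated, sep="|"):
    """Split each concatenated string into the item set used for mining."""
    return {case_id: set(text.split(sep)) for case_id, text in aggregated.items()}


aggregated = aggregate_concat(rows)
case_items = to_transactions(aggregated)
print(aggregated[1])
print(case_items[2])
```

The case id plays the part of the unique identifier, and each resulting item set corresponds to one transaction passed to the frequent item set search.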

In Table 1, the parameters that were applied for the creation of the data mining model are presented.

Confidence is a measure of how often the created association rule is true. High confidence indicates a strong association rule.

Laplace is a confidence estimate that applies a correction for items with zero support.

Gain is a measure of the strength of an association rule. Higher gain indicates a stronger association rule.

Piatetsky-Shapiro (p-s) is a rule-interest measure that takes into account the base frequencies of a pair of attribute values. A p-s value above a set limit indicates an interesting rule.

Lift is the ratio of the observed support to the support expected if a pair of attribute values were independent. Values greater than 1 indicate that the pair of attribute values is dependent.

Conviction is the ratio of the expected frequency with which one attribute value of a pair occurs without the other, if the two were independent, to the observed frequency with which it occurs without the other. Higher values indicate stronger rules.
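For reference, the Piatetsky-Shapiro measure is commonly written as the difference between the observed joint support and the joint support expected under independence (it is also known as leverage). The notation follows Section 3.2. This is the textbook form, given here as an assumption about what the tool computes; the source does not state the exact formula used.

```latex
\[
\mathrm{ps}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y) - \mathrm{supp}(X)\,\mathrm{supp}(Y)
\]
```

In this form the measure equals 0 exactly when lift equals 1, is positive when lift is greater than 1, and is negative when lift is less than 1.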

4. Results and Discussion

The analysis of large volumes of court decisions allows for identifying discrepancies in the interpretation and application of legislation by different courts or even by a single court in similar cases. Such analysis assists higher judicial instances in ensuring a uniform understanding and application of laws by correcting the identified contradictions. Studying the reasoning parts of decisions makes it possible to identify areas where legislation is incomplete or ambiguous, leading to different interpretations. The analysis of court decisions is also used to track changes in judicial practice over time and the evolution of courts' approaches to interpreting legal norms. Analytical tools can be applied to assess the quality of judges' work and identify those who often make mistakes or issue contradictory decisions. The proposed innovative approach to analyzing documents from the Unified State Register of Court Decisions based on the use of modern IT solutions and advanced methods, including large language models such as GPT-4, can contribute to ensuring the unity of judicial practice, increasing the efficiency and transparency of judicial decision-making. In this particular case, GPT-4 was used to identify key facts and circumstances relevant to decision-making in criminal proceedings.

Table 2 presents examples of input original data and new output generated data.

To identify association rules between the criminal history of convicted persons and repeated offenses, we used a data set built from the information extracted with the AI language model. It contained the following attributes:
- qualification of the committed crime: 1 - offense, 2 - minor crime, 3 - felony crime, 4 - particularly serious crime;
- presence of accomplices in crime: 0 - the offense was committed alone, 1 - the offense was committed in complicity;
- criminal reoffending: 0 - first offense, 1 - repeatability;
- previous convictions: 0 - no, 1 - yes;
- term of imprisonment (0 in the case of a sentence that excludes deprivation of liberty).

The results of association rule mining algorithms are the frequent item sets and the association rules.
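One way to turn a coded record into a transaction is sketched below. The dictionary keys and the mapping are assumptions for illustration: a binary attribute becomes an item when its code is 1, a positive term becomes the item "term of punishment", and the qualification code is mapped to its label. The source does not specify how the coded attributes were converted into items.

```python
# Labels for the qualification codes listed above.
QUALIFICATION = {
    1: "offense",
    2: "minor crime",
    3: "felony crime",
    4: "particularly serious crime",
}


def record_to_items(record: dict) -> set:
    """Map one coded record (hypothetical keys) to the items present in it."""
    items = {QUALIFICATION[record["qualification"]]}
    if record["complicity"] == 1:
        items.add("complicity")
    if record["repeatability"] == 1:
        items.add("repeatability")
    if record["previous_convictions"] == 1:
        items.add("previous convictions")
    if record["term"] > 0:
        items.add("term of punishment")
    return items


example = {
    "qualification": 2,
    "complicity": 1,
    "repeatability": 0,
    "previous_convictions": 1,
    "term": 3,
}
print(sorted(record_to_items(example)))
```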

Table 3 presents the frequent item sets (support ≥ 0.982).

Table 3. Frequent item sets (the recoverable cells combine the items previous convictions, repeatability, and term of punishment). Source: compiled by the authors

The created association rule mining model made it possible to identify the following non-obvious patterns observed when passing sentences (judicial decisions) in criminal proceedings:
1. Persons committing crimes in complicity, in most cases, had previous convictions and/or had committed crimes in the past (support = 0.996).
2. Persons sentenced to imprisonment, in most cases, committed crimes in complicity and/or had previous convictions and/or committed a repeated crime (support = 0.982).

This means that soft court decisions for persons who committed minor crimes for the first time create an illusion of impunity for offenders. Conditional convictions and early releases are perceived by convicts not as a chance for correction but as another opportunity to commit a new crime and not serve the full term of punishment. Penitentiary institutions do not yet make offenders virtuous people but only isolate them for the sake of public safety. Persons who have passed "criminal institutions", in most cases, become members of criminal groups.

In total, 486 association rules were detected. The following 15 association rules are strong (confidence = 0.987):
1. [complicity] --> [previous convictions, term of punishment] (confidence: 0.987)
2. [previous convictions] --> [complicity, term of punishment] (confidence: 0.987)
3. [complicity, previous convictions] --> [term of punishment] (confidence: 0.987)
4. [complicity] --> [repeatability, term of punishment] (confidence: 0.987)
5. [complicity, repeatability] --> [complicity, term of punishment] (confidence: 0.987)
6. [previous convictions] --> [repeatability, term of punishment] (confidence: 0.987)
7. [repeatability] --> [previous convictions, term of punishment] (confidence: 0.987)
8. [previous convictions, repeatability] --> [term of punishment] (confidence: 0.987)
9. [complicity] --> [previous convictions, repeatability, term of punishment] (confidence: 0.987)
10. [complicity, previous convictions] --> [repeatability, term of punishment] (confidence: 0.987)
11. [previous convictions] --> [repeatability, term of punishment] (confidence: 0.987)
12. [repeatability] --> [complicity, previous convictions, term of punishment] (confidence: 0.987)
13. [complicity, repeatability] --> [previous convictions, term of punishment] (confidence: 0.987)
14. [previous convictions, repeatability] --> [complicity, term of punishment] (confidence: 0.987)
15. [complicity, previous convictions, repeatability] --> [term of punishment] (confidence: 0.987)

Associations No. 3, 6, 9, and 15 are not association rules, since lift = 1, which means that the antecedent and the consequent are independent. The other identified rules are strong, with high support (0.982) and high confidence (0.987) (Table 4).

Table 4 (rows recoverable from the source)
No. | Antecedent | Consequent | Support
12 | repeatability | complicity, previous convictions, term of punishment | 0.982
13 | complicity, repeatability | previous convictions, term of punishment | 0.982
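The lift check described above can be sketched as a simple filter. The rule tuples and lift values below are illustrative placeholders, not the values from Table 4, and the tolerance is an assumed parameter for comparing floating-point numbers.

```python
import math

# Hypothetical rules as (antecedent, consequent, lift); values are illustrative.
candidate_rules = [
    (("complicity",), ("term of punishment",), 1.005),
    (("previous convictions",), ("term of punishment",), 1.0),
    (("repeatability",), ("complicity",), 0.97),
]


def informative(rules, tol=1e-9):
    """Drop rules whose lift equals 1 within a tolerance (independent sides)."""
    return [r for r in rules if not math.isclose(r[2], 1.0, abs_tol=tol)]


kept = informative(candidate_rules)
for antecedent, consequent, value in kept:
    print(antecedent, "-->", consequent, "lift:", value)
```

A rule with lift below 1 is kept by this filter because it still carries information: it indicates an inverse relationship rather than independence.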

The network diagrams of the rules produced for the term of punishment visualize the identified strong association rules. Thus, punishment in the form of imprisonment is associated with committing a crime in complicity, the presence of previous convictions of the accused, and the repeated commission of a crime (Figure 4).

Criminal offenses qualified as particularly serious crimes and felony crimes do not appear in the identified strong rules. This result can be explained by the fact that particularly serious crimes and felony crimes make up an insignificant share of the cases, and that the court did not impose imprisonment on these offenders.

The developed data mining associative rule model can explain the identified associative rules. For example, felony crimes are not associated with complicity, repeatability, previous convictions, and term of punishment (Figure 5). In particular, most felony crimes are committed by defendants who did not have previous convictions, committed a criminal offense for the first time, and without accomplices. In most cases, sentences not related to imprisonment were passed (community service, fines, etc.). It can be assumed that as a result of the leniency of previously passed minor offense sentences, they felt the humanity of the judicial system and in the hope of impunity continued their criminal activities.

The results confirm the estimates obtained in previous articles [4, 5, 21]. Previous offenses left unpunished and unfinished terms of punishment are the main factors that shape convicted persons' propensity for criminal recidivism. The identified patterns can be used to calculate the risk that the accused will commit a repeat criminal offense and the risk of danger the accused poses to society. The knowledge gained can provide the judiciary with information relevant to passing a sentence in criminal proceedings, for example regarding the appropriateness of setting a probationary period, the expediency of conditional early release, or the choice of a preventive measure before the sentence comes into legal force.

This document is part of interdisciplinary research on the application of data mining, machine learning, and artificial intelligence to develop a unified court decision support system. In previous studies, a factor model was proposed to identify consolidated factors formed based on data on previous offenses of the accused. A machine-learning algorithm was presented to determine the personal characteristics of convicts that influence the propensity for criminal recidivism [21]. A binary logistic model was constructed to predict the probability of criminal recidivism by convicts [22].

Developing optimal approaches to selecting methods for predicting the outcomes of legal proceedings is a non-trivial problem, and solving it can simplify understanding of the decision-making process. When passing a sentence, the court takes into account many facts of the case, for example the qualification of the proceedings, the legal factors specific to the case, the types of evidence, the characteristics of the accused, the presence of previous convictions, and whether the offense is a repeat offense. Details of the criteria (facts) concerning a particular case are stored in court decisions. However, extracting these facts from legal texts is a laborious, complex, and time-consuming process. Therefore, most studies of this type are conducted on small datasets and concern only particular regions and certain types of proceedings. R.A. Shaikh et al. identified factors that have a significant impact on the outcomes of murder cases. The study was based on 86 cases from the Delhi District Court. Conventional ML classification algorithms were used to predict the binary outcome, "acquittal" or "conviction" of the accused, and leave-one-out cross-validation was performed to obtain the results. The factors important for decision-making were extracted by manually reading and analyzing court decisions, which is a complex and long process [8]. H. Aissa et al. used ML to predict the outcomes of accident cases based on 514 court decisions from the Errachidia Court in Morocco. By manually reading the decisions, the authors extracted features based on the most representative characteristics previously identified as affecting accident findings [23]. Features of the development of systems based on content analysis are given in [24]. J.F.M. Soro and C. Serrano-Cinca analyzed factors explaining the court's decision to grant child custody. The authors developed a neural network model to predict the court's verdict based on 1884 court decisions. The research group read and analyzed the content of each court verdict and identified the necessary facts, legal principles, and other information relevant to the court decision. Although the criteria taken into account by the court in making a decision were agreed in advance, numerous discrepancies arose in identifying factual elements and legal principles. To ensure the quality of the process of extracting the necessary facts from the texts of sentences, a leading researcher was additionally involved [25]. In all these studies, obtaining the criteria (facts, circumstances) taken into account by the court from the texts of sentences was a laborious, expensive manual process.

Our current research aims to identify valuable patterns in the set of criteria that are important for passing court sentences and strong association rules between the facts of criminal proceedings and the sentences passed by the courts. The research data set consists of 10,000 sentences passed in criminal proceedings by courts in Ukraine. An innovative approach is proposed that combines data mining tools with the GPT-4 language model's sentence-by-sentence generation technique to generate facts from the unstructured textual documents of court decisions that make up the initial data set. Unlike similar studies by other authors, we use GPT-4-based sentence-by-sentence data generation as the input for association rules mining. Such automatic content analysis and data generation can significantly save the effort and time of the court, the legal profession, and prosecution staff, and can improve the quality of the data set by reducing human error.
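The data generation step described above can be illustrated with a minimal sketch. It assumes that each court sentence text is processed one at a time and mapped to a fixed set of yes/no criteria. The criteria names, the `keyword_stub` function, and the keyword lists are hypothetical and are not taken from the study; in particular, `keyword_stub` is only a stand-in for a language-model call, so that the sketch runs without any external service.

```python
from typing import Callable, Dict, List

# Hypothetical criteria for illustration; the study's actual list is not reproduced here.
CRITERIA = ["complicity", "previous_conviction", "repeat_offense", "imprisonment"]

# Hypothetical keyword per criterion, used only by the stand-in below.
_KEYWORDS = {
    "complicity": "accomplice",
    "previous_conviction": "previously convicted",
    "repeat_offense": "repeatedly",
    "imprisonment": "imprisonment",
}


def keyword_stub(text: str, criterion: str) -> bool:
    """Stand-in for a language-model answer (assumption): naive keyword match."""
    return _KEYWORDS[criterion] in text.lower()


def extract_facts(
    texts: List[str],
    answer: Callable[[str, str], bool] = keyword_stub,
) -> List[Dict[str, bool]]:
    """Process court sentence texts one at a time into rows of boolean facts."""
    return [{c: answer(text, c) for c in CRITERIA} for text in texts]


def to_transactions(rows: List[Dict[str, bool]]) -> List[frozenset]:
    """Keep only the criteria that hold, giving one item set per court sentence."""
    return [frozenset(c for c, present in row.items() if present) for row in rows]
```

Passing a different `answer` callable would allow the keyword stand-in to be swapped for a real model call without changing the rest of the pipeline; the resulting item sets are the usual input format for association rules mining.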

5. Conclusions

Content analysis of court decision texts is important for identifying non-obvious, interesting connections and interdependencies between the circumstances of the case and the results of the trial. This can improve the consistency of judicial decisions and facilitate the analysis of outcomes in similar cases. However, this is a complex, non-trivial task that requires the development of new approaches and the selection of the best solution methods. Such studies have a regional aspect and a clear subject focus. When extracting the necessary knowledge from a collection of texts of court sentences, it is necessary to take into account the peculiarities of national legislation and specific comparison criteria, for example, the form of legal proceedings (administrative, commercial, criminal, civil, etc.) and the subject of similarity of cases (cases related to murder, crimes against minors, custody cases, etc.). In most previous studies on this issue, the criteria taken into account by the court in passing sentences in similar cases were identified manually. This limits the size of the datasets created for further analysis and reduces the reliability of the results.

This article proposes an innovative approach that combines the use of the GPT-4 language model for generating facts from court decision texts and methods of association rules mining to identify patterns between the criteria considered when rendering verdicts. Based on the analysis of 10,000 texts of criminal verdicts in Ukraine, frequent item sets of criteria (support ≥ 0.982) associated with judicial decision-making and strong association rules (confidence = 0.987) between case facts and outcomes have been identified. It has been established that individuals who committed crimes in complicity, had previous convictions, or committed repeat offenses generally receive real prison sentences. Individuals to whom lenient measures were applied (conditional conviction or early release) more often commit repeat crimes, indicating their higher public danger. The revealed knowledge can be used to assess risks and provide informational support for judicial decision-making, increasing the validity of the decisions made. The proposed model can be useful for probation officers in assessing the risk of repeat criminal offenses and the danger posed by the accused to society. The obtained information can support the transparency and comparability of decisions and be valuable for the judiciary, advocacy, prosecution, and other participants in the judicial process.
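To make the support and confidence thresholds reported above easier to interpret, the following minimal sketch computes both measures from their standard definitions: the support of an item set is the share of cases that contain it, and the confidence of a rule A → B is support(A ∪ B) / support(A). The toy item sets, the item names, and the thresholds used below are invented for illustration and do not come from the study's data; the exhaustive pairwise search is a simplification and is not the mining algorithm used by the authors.

```python
from itertools import combinations


def support(transactions, itemset):
    """Share of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent ∪ consequent) / support(antecedent), or 0.0 if undefined."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0


def strong_rules(transactions, min_support, min_confidence):
    """Exhaustively test single-item rules a -> b against both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(transactions, {a, b})
        if pair_support < min_support:
            continue
        for ante, cons in ((a, b), (b, a)):
            conf = confidence(transactions, {ante}, {cons})
            if conf >= min_confidence:
                rules.append((ante, cons, pair_support, conf))
    return rules


# Invented toy data: one item set per court sentence.
TOY = [
    frozenset({"complicity", "previous_conviction", "imprisonment"}),
    frozenset({"complicity", "imprisonment"}),
    frozenset({"previous_conviction", "imprisonment"}),
    frozenset({"probation"}),
]

for ante, cons, s, c in strong_rules(TOY, min_support=0.5, min_confidence=0.9):
    print(f"{ante} -> {cons}: support={s:.2f}, confidence={c:.2f}")
```

On real data with many criteria, an Apriori-style search that prunes infrequent item sets would normally replace the pairwise loop.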

The proposed approach allows for automating the analysis of large arrays of court decision texts and generating data for further application of data mining methods. Effective prediction of court decisions in similar cases can facilitate understanding of judicial decision-making, provide reliable support to decision-makers, and promote the rule of law. The subject of our further research will be the search for the best data science, ML, and AI methods to identify hidden factors associated with the formation of criminal groups based on content analysis of court decision texts in relevant cases.

Acknowledgments

The authors express their sincere gratitude to the Armed Forces of Ukraine for providing security, which made it possible to conduct our research.

[1] Unified State Register of Court Decisions of Ukraine. 2024. URL: https://reyestr.court.gov.ua/.

[2] Criminal Code of Ukraine. 2024. URL: https://zakon.rada.gov.ua/laws/show/2341-14#Text.

[3] Li, Zhao, Nai, Tao. Charge prediction modeling with interpretation enhancement driven by double-layer criminal system. World Wide Web 25 (2022) 381-400.

[4] K.M. Berezka, O.Ya. Kovalchuk, S.V. Banakh, S.V. Zlyvko, R. Hrechaniuk. A binary logistic regression model for support decision making in criminal justice. Folia Oeconomica Stetinensia 22(1) (2022) 1-17. doi: 10.2478/foli-2022-0001.

[5] Kovalchuk, Banakh, Masonkova, Berezka, Mokhun, Fedchyshyn. Text Mining for the Analysis of Legal Texts. Proceedings of the 12th International Conference on Advanced Computer Information Technologies (2022) 502-505.

[6] Chalkidis, I. Androutsopoulos, Aletras. Neural Legal Judgment Prediction in English. arXiv (2019). doi: 10.48550/arXiv.1906.02059.