Judith Michael, Victoria Torres (eds.): ER Forum, Demo and Posters 2020

Conceptual Model-driven Legal Insights for Stakeholder Decision Making

Sagar Sunkle, Krati Saxena, and Vinay Kulkarni
Tata Consultancy Services Research, Pune 400011, India
sagar.sunkle|krati.saxena|vinay.vkulkarni@tcs.com

Abstract. The applicant(s) and the defendant(s), i.e., the parties involved in legal cases, often lack knowledge about the scope of the dispute and how it affects their own and the courts’ decision making. Legal professionals, too, would like to get a consensus view of past cases and interpret the facts of a case for their clients. Existing work in the area proposes summarization and legal entity extraction from similar past incidents, with the interpretation left to the respective parties. In contrast, we present an approach that correlates the facts of the cases, the verdicts, and the reasons behind those verdicts in the form of easily consumable insights. We propose two kinds of conceptual models: first, a metamodel of legal case insights common across legal domains and, second, a set of concept models of the specific legal domain under consideration. We use these models to inform case categorization and user profiling. Then, using a combination of natural language processing and machine learning techniques, we extract profile-specific insights across the related past cases that are available. We demonstrate the utility of our approach using examples from two disparate legal domains, parental alienation cases and divorce cases, with promising results, including how such insights can aid the user's decision-making.

Keywords: Legal Insights · Decision-Making · Conceptual Modeling · Legal Domain · Clustering

1 Introduction

The parties involved in court cases depend on legal professionals for knowledge about their cases. Usually, lawyers provide insights on cases based on their understanding and legal knowledge.
Searching a database of previous cases and understanding how those cases relate to the case at hand is a tedious job. Existing systems provide information from previous cases in the form of summaries or legal entities, the interpretation of which requires proper legal understanding [2,3,5,6,9,13]. Non-legal users cannot use these systems because of the legalese and a general lack of awareness of statistics about specific cases.

Ontology-based systems in the legal domain demonstrate promising research towards creating legal ontologies [1,4,7,11]. Ontologies on their own, however, do not contribute to sense-making for an end-user. With these outputs at hand, users still need to check the case file if particulars are needed, which is a time- and effort-intensive process.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Fig. 1: Modeling and Generation of Insights for Legal Cases

We propose a (semi-)automated system to obtain insights from past legal cases. The involved parties can use this system to become better aware of previous similar cases on their own. Such a system has the potential to help them make decisions that maximize their benefit in the legal process. Legal professionals, in turn, can use this system to profile their clients or to get general insights on what course to follow in an ongoing or a new case based on their client’s circumstances.

We present a metamodel of legal cases geared towards insight categorization and identification. The metamodel guides the formation of a set of concept models of a specific legal domain. We use these concept models to create a categorization of concepts and a set of user profiles. To retrieve insights specific to those profiles, we use text searching techniques to pinpoint the specific information that contributes to the insights. Our overall approach is illustrated in Figure 1.
Our specific contributions are twofold:

– With limited manual intervention, our system generates statistical insights from past cases in a manner that is easily understandable for both legal and non-legal stakeholders.
– The proposed metamodel supports a categorization scheme that is generic enough to apply to any legal domain while still being capable of aiding the decision-making of legal stakeholders.

We organize the paper as follows. We detail our approach in Section 2. In Section 3, we describe cases from two legal domains, namely parental alienation and divorce cases. Section 4 applies the system to case datasets of the two legal domains and discusses the results. We review related work and conclude the paper in Sections 5 and 6, respectively.

2 Modeling and Generating Legal Insights

In Figure 1, we refer to the steps involving the generation of a set of concept models and the categorization of metamodel components as information modeling. Text processing covers the various NLP and ML techniques that we use to process the text of past cases. The statistical insights system extracts and analyzes the spans of text that are relevant to the profile built by the user profiling system and produces insights from them. We describe each of these steps in detail next.

Information Modeling. All legal cases deal with conflicts between parties. The applicant/appellant/plaintiff presents a complaint or appeal against the defendant/respondent. Our observations suggest that legal cases mainly comprise five components: the parties involved, the previous verdicts on (an ongoing) case(s), the facts related to the parties, the appeals made by the parties, and the court verdict. These components lead to the simple conceptual metamodel shown in Figure 2.

Fig. 2: Metamodel for Legal Case Insights (relation labels: involves, LedTo, comprise, consider, resultIn, basedOn)

We observe that different countries/states/cities may record cases with different sections, and that those sections may overlap in their indicative content. Our approach requires that the user map the specific set of sections in the cases under consideration to the five components listed above.

We create a concept model for each of the sections present in the case data using the metamodel in Figure 2 as the guide. For the aided creation of concept models, which is based on our approach presented in [12], we collate the text for each section from all the past case files available. As described in [12], the human-in-the-loop creation starts with a seed concept; the expert then uses the provided user interface to choose, in an ongoing manner, the next set of concepts in the model from the suggestions offered. It is in this step that the expert, cognizant of the metamodel, chooses concepts in sync with it. The expert selects a concept and also adds relevant mention(s). A mention is any reference to the concept in the text. At the end of the categorization exercise, we have a categorization dictionary. We show example concept models for both case studies in Figures 3 and 4 and categorizations in Figure 5 in later sections.

User Profiling. As indicated earlier, we bind the categorizations to both the construction of a user profile and the generation of insights. For each category type, such as parties or facts, we create simple wh-questions, the options for which are the category mentions in the categorization dictionary. For instance, the parties category leads to the question Who are the parties involved in the case? and the options presented are all the mentions of the parties category. We provide examples of profiling questions and options based on the category in Figure 5 in Section 3.

Text Processing. To generate insights from past cases, we need to process the text of the cases.
Usually, legal text contains long and complicated sentences with complex syntactic structure. We apply standard text normalization and segment the text into sentences¹. The domain expert creates a pattern dictionary from the sentences, which is then used to parse the sentences to generate the statistics. The pattern dictionary contains the text patterns that are indicative of the category mentions. We aid the determination of these patterns in the sentences using clustering. We vectorize the sentences as TF-IDF vectors. Then, we create embedded text from the vectorized text by dimension reduction using principal component analysis². Finally, we apply K-Means clustering³ to the embedded text with k=10. The expert identifies the patterns in the clustered sentences and creates the pattern dictionary. We show example text patterns in Figure 6 in Section 3.

Insights Computation. We store the text in a Python dataframe. For each user profiling question, the user chooses the options from the category mentions. We show the options selected for both case studies in Figure 7 in Section 4. We parse the text in the dataframe by matching patterns from the pattern dictionary for the selected options of the category mentions. We compute the statistics as counts and percentages of each category in the text spans. In Section 4, we show the results for instances of the two case studies in Figure 8, wherein a party is the user and has chosen specific options for the set of questions that the system presents.

In the next section, we begin by describing the two case studies under consideration.

¹ spaCy sentence boundary identification https://spacy.io/usage/spacy-101#features
² Principal component analysis in sklearn https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
³ K-Means clustering https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

3 Case Studies

In the following, we briefly describe the two case studies, namely parental alienation (PA) and divorce cases from Dutch civil courts. We choose these legal domains because both parental alienation and divorce cases considerably affect the social and emotional well-being of the involved parties. A system of legal insights can help the parties understand the characteristics of such cases, especially parties such as an alienated parent or a spouse in a bad marriage.

Fig. 3: Concept Model for PA Cases: Colored nodes are indicative of the metamodel, white nodes represent PA-specific nodes

For want of space, we show a single curtailed concept model for PA and divorce cases in Figures 3 and 4, respectively. We make available selected artefacts for both cases, including the section-wise concept models of PA cases⁴. We first translate the Dutch legal text into English using the Google Translate API⁵. We identify the sections present in the files and extract and collate the text for each section. Using this text, the domain expert creates concept models for each section.

Parental Alienation Cases. Parental alienation (PA) is a situation where the child gets enmeshed with a preferred parent and rejects a relationship with the other parent without legitimate justification [10].
⁴ Selected artefacts available at https://github.com/sjrddjtmhkm/cmlcisdm data
⁵ Google Translate API https://translate.google.com/

The concept models of the various sections inform us that the parties involved in PA cases are the father, mother, child, institutions (like youth care centres), foster parents, relatives, and court councils. PA cases revolve around the custody of children entrusted to parents, foster care or institutions, the arrangement of contact or visitation between the parties and the children, the residence of the children, and the placement of children under supervision. The identified categories are shown in Figure 3.

Divorce Cases. Divorce, or dissolution of marriage, is the process of terminating a marital union. Divorce cases include parties such as the man, woman, child, institutions, councils, bank, curator, and notary [8]. They also include marriage/divorce disputes, the division of marital property, payments such as child support, alimony, living expenses and other disputed costs and income, agreements and settlements between the parties, and the parental plan for the children. If the spouses have children, divorce cases may include PA topics such as the custody of children entrusted to parents, foster care or institutions, the arrangement of contact or visitation between the parties and the children, the residence of the children, and the placement of children under supervision. The court presides over these cases, orders investigation of the current situation, announces decisions on the appeals, and orders the required action from the parties. The identified categories are also shown in Figure 4.
Fig. 4: Concept Model for Divorce Cases: Colored nodes are indicative of the metamodel, white nodes represent divorce-specific nodes

User Profiling Questions and Insights Generation. We show the schematic of question generation in Figure 5. As indicated earlier in Section 2, we implement a simple template-based question generation for these concepts. For the case category, the rule is to add What is in front of the concept. For the who type of questions, we add Who are, followed by the concept, followed by involved in the case?. For the rest, which are what type of questions, we add What are the, followed by the concept, followed by applicable in your case?. This processing gives us the profiling questions. We use the resultant list of questions and the categorization dictionary to obtain responses from the user and create their profile. In Figure 5, on the right, we show the categorization dictionary for both PA and divorce cases.

We generate the count-based results from the normalized and segmented text by rule-based parsing. The rule-based parsing matches the patterns from the pattern dictionary for the category mentions chosen by the user.

Fig. 5: Profiling Questions and Insights Generation Using Categorization

As discussed earlier in Section 2, the domain expert obtains the pattern dictionary using clustering and principal component analysis.
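The clustering aid described in Section 2 (TF-IDF vectors, dimension reduction with principal component analysis, then K-Means) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cluster_sentences`, the number of principal components, the random seed, and the toy sentences are all hypothetical, and the toy run uses k=3 only because the sample is tiny, whereas the paper reports k=10 on the full corpus.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_sentences(sentences, k=10, n_components=2, seed=0):
    """Group sentences so an expert can scan each group for recurring patterns.

    k=10 follows the paper; n_components and seed are assumptions.
    """
    # Vectorize the sentences as TF-IDF vectors.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    # PCA needs a dense matrix; this is fine for a sketch on a small corpus.
    embedded = PCA(n_components=n_components, random_state=seed).fit_transform(
        tfidf.toarray()
    )
    # Cluster the reduced vectors with K-Means.
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(embedded)
    clusters = {}
    for sentence, label in zip(sentences, labels):
        clusters.setdefault(int(label), []).append(sentence)
    return clusters


# Hypothetical sentences, loosely modeled on the two case domains.
sentences = [
    "the child resides with the mother",
    "the child resides with the father",
    "the court ratifies the contested decision",
    "the court annuls the contested decision",
    "the man pays child support to the woman",
    "the woman pays child support to the man",
]
clusters = cluster_sentences(sentences, k=3)
for label, members in sorted(clusters.items()):
    print(label, members)
```

The clusters only group similar sentences; deciding which phrases in a group become entries of the pattern dictionary remains the expert's manual step.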
Examples of identified patterns for a few cluster mentions are presented in Figure 6.

Fig. 6: Example Text Patterns Identified in PA and Divorce Case Files

4 Results and Validation

Our case datasets include 109 and 102 case files for PA⁶ and divorce⁷ cases, respectively. As described earlier in Section 2, we present a set of questions to the user(s), who select(s) the options applicable to their situation (as in Figure 7). Figure 8 shows the results obtained for each case study based on the options chosen in Figure 7.

Fig. 7: Options Selected by the User for the Profiling Questions

⁶ Sample file available at https://uitspraken.rechtspraak.nl/inziendocument?id=ECLI:NL:GHAMS:2019:44, more files available with other nomenclatures.
⁷ Divorce cases https://jure.nl/echtscheiding

4.1 Insights for Profile-specific Questions

Case category. The first question asks the user to input the case category. The options include all the legal domains for which data is available and processed, in this case, PA and divorce. The resultant statistics show the parties and their occurrence in the selected case category. They also show the most frequent groups of parties that appear together in the cases. In Figures 8(a) and 8(b), the top-left output shows the count of the parties that are the general participants in PA or divorce cases.

Parties contesting the case. The next question is Who are the parties involved in the case?. Based on the options chosen by the user, the output shows the statistics and the text spans of previous verdicts of the cases in which the parties were those chosen by the user. The top-middle output in Figures 8(a) and 8(b) shows the result for the inputs mother, father, institution and council in PA cases and man, woman and council in divorce cases. The statistics show the portion of each category in the previous verdicts in related cases and which parties are involved in which kinds of previous verdict categories.
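The profiling questions quoted in this subsection follow the template rules given in Section 3. A minimal sketch of those rules is shown below. The names `make_question` and `build_profile_form`, the set `WHO_CATEGORIES`, the sample categorization dictionary, and the article "the" in the case-category question are assumptions made for illustration; only the three templates themselves come from the text.

```python
# Categories phrased as "who" questions (an assumption: only parties).
WHO_CATEGORIES = {"parties"}


def make_question(category):
    """Turn a category name into a profiling question using the three templates."""
    label = category.replace("_", " ")
    if category == "case_category":
        # Rule 1: put "What is" in front of the concept ("the" added for readability).
        return f"What is the {label}?"
    if category in WHO_CATEGORIES:
        # Rule 2: "Who are" + concept + "involved in the case?"
        return f"Who are the {label} involved in the case?"
    # Rule 3: "What are the" + concept + "applicable in your case?"
    return f"What are the {label} applicable in your case?"


def build_profile_form(categorization):
    """Pair each generated question with its options (the category mentions)."""
    return [(make_question(cat), list(mentions)) for cat, mentions in categorization.items()]


# Hypothetical excerpt of a categorization dictionary for PA cases.
pa_categorization = {
    "case_category": ["parental alienation", "divorce"],
    "parties": ["mother", "father", "child", "institution", "council"],
    "previous_verdicts": ["residence of the child", "custody", "None"],
}

for question, options in build_profile_form(pa_categorization):
    print(question, options)
```

Keeping the options tied to the categorization dictionary means that adding a mention to a category automatically extends the choices offered to the user.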
Previous Verdicts. The third question is about choosing previous verdicts, if applicable. The user can select None for a new case. Otherwise, the user chooses the options applicable to his/her case. The top-right output in Figures 8(a) and 8(b) shows the output for the question about previous verdicts. Here, the user chooses residence of the child and arrangement between the party and the child in PA cases. For divorce cases, the choices were divorce dispute, parental plan, and costs and income. The output consists of text spans of facts and statistics on the facts for the cases whose previous verdicts were about the options chosen by the user.

Facts related to Parties. The fourth question is about the facts of the cases. The user chooses the facts applicable to their case. Here, the user chooses residence and arrangement in PA cases. For divorce cases, the user has chosen divorce, cost, and income. This set of choices outputs the text spans of appeals and requests made by the parties. The bottom-left outputs in Figure 8(a) and Figure 8(b) show the percentage of each category spanned by the appeals.

Appeal made by Parties. The final question asks the user to input the appeal applicable in their case. Here, the user selects the options supervision, custody, and arrangement in PA cases and property, costs, and previous verdict in divorce cases. Based on the input, the system shows statistics and text of court decisions/verdicts in the selected cases and the reasons for those verdicts, as shown in the bottom-middle and bottom-right outputs of Figures 8(a) and 8(b).

Fig. 8: Statistical Insights for (a) Parental Alienation and (b) Divorce Cases from Profile-specific Options Selected in Figure 7

4.2 Using Insights in Decision-making

The user can use the above set of results in interpreting similar previous cases and in decision-making as follows:

Case Category. A person who is searching for insights on previous similar cases gets to know which other parties are involved in a legal domain, some of which they may not have considered. For example, a parent about to contest a new case gets to know that institutions, guardians and the council are involved in PA cases. Similarly, a man or woman contesting a divorce case gets to know that a curator and a bank may be involved in the case. They can therefore prepare for situations in which, due to the court’s decision, parties get involved that the user had not previously considered.

Parties. These results show the statistics on previous verdicts. They can help a user by conveying the possibilities for the court’s action. If a case has more than one hearing, the previous verdicts show the categories in which the court’s decision is most often left pending for review. For example, in the top-middle output of Figure 8(a), the user can see that placement of a child with an institution, arrangement, and previous requests are the categories most often left pending for review. Similarly, in the top-middle output of Figure 8(b), divorce/marriage disputes and residence of the child are left pending for review.

Previous Verdicts. The statistics show the user which general facts are relevant in a particular domain. The user can decide how to present the facts to the court based on what was presented in previous cases. The parties in the statistics on facts also shed light on which facts are essential for each party. The statistics in the top-right output of Figure 8(a) for PA cases show that mother and child are involved in all the fact categories in this case.
It means that they play an essential role in the case. Other important facts shown relate to these parties, for example that the residence of the child is with the mother and that the mother is charged with the custody of the child in these cases. Similarly, in the top-right output of Figure 8(b) for divorce cases, most of the fact categories are about the man. This result shows that facts such as the costs and income of the man are crucial. The child’s residence is another important fact for the selected cases.

Facts related to the Parties. We can observe in the bottom-left output of Figure 8(a) for PA that appeals concerning previous verdicts announced by the court make up 57.1% of the total appeals. The appeal related to the placement of the child under supervision is another major area of appeal, with 20%. In the bottom-left output in Figure 8(b) for divorce cases, appeals about costs, previous verdict, property and residence span 31.2%, 25%, 12.5% and 12.5%, respectively, of all the appeals. This result tells the user which appeals make up the majority in the selected data. By looking at the text spans of these appeals, the user can understand exactly what appeals were made for these categories and how they can be presented for the current case.

All the statistics mentioned above are simple ways to comprehend what happens in the cases.

Appeals made by Parties. The final result, in the bottom-middle and bottom-right outputs of Figure 8(a) and Figure 8(b), informs the user about their standing in the case. Based on the options chosen in the previous questions, this result informs the user how many times the appeal was ratified, annulled or left pending for review. In Figure 8, the decision is annulled 50% of the time for both PA and divorce cases. For PA, 43.8% of the decisions are ratified, and the rest are left for review. For divorce, there is no pending decision; the remaining 50% of the cases are ratified. Reasons for the verdicts are also shown in the bottom-right outputs of Figure 8(a) and Figure 8(b).
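The shares quoted in this subsection are category counts expressed as percentages of all matched text spans. A minimal sketch of that computation is given below, assuming the pattern dictionary maps each category to a list of regular expressions. The patterns, the sentences, and the function name `category_shares` are hypothetical; the paper stores the text in a dataframe, while plain lists are used here to keep the sketch self-contained.

```python
import re
from collections import Counter

# Hypothetical pattern dictionary: category -> regular expressions
# that indicate a mention of that category.
PATTERNS = {
    "residence": [r"\bresid(?:e|es|ence)\b", r"\blives with\b"],
    "custody": [r"\bcustody\b"],
    "supervision": [r"\bsupervision\b"],
}


def category_shares(sentences, patterns, selected):
    """Count sentences matching each selected category and convert to percentages."""
    counts = Counter()
    for sentence in sentences:
        for category in selected:
            if any(re.search(p, sentence, flags=re.IGNORECASE) for p in patterns[category]):
                counts[category] += 1
    total = sum(counts.values())
    return {
        category: (counts[category], round(100 * counts[category] / total, 1) if total else 0.0)
        for category in selected
    }


# Hypothetical text spans from past cases.
sentences = [
    "The child resides with the mother.",
    "The mother is charged with custody of the child.",
    "The child lives with the father and is placed under supervision.",
    "The father requests custody.",
]
shares = category_shares(sentences, PATTERNS, ["residence", "custody", "supervision"])
print(shares)
# {'residence': (2, 40.0), 'custody': (2, 40.0), 'supervision': (1, 20.0)}
```

A sentence that matches several categories is counted once per category in this sketch, so the percentages are shares of category matches rather than of sentences; whether the original system counts this way is not stated in the text.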
Reasons can advise the user about the steps they can take to get their appeals ratified. For example, in Figure 8(a) for PA cases, the statistics show that previous judgments, communication between parties, and child development and well-being are some of the critical points in decision-making by the court, constituting 27.6%, 14.41% and 11.57%, respectively. In Figure 8(b) for divorce cases, the main reasons for the verdict include previous judgment, costs, property and income of the parties, spanning 38.1%, 21.03%, 14.02% and 5.54% of all the reasons. The user can therefore develop a case around these points to get a favourable result. The user can present evidence on these categories that proves their case or disproves the opponent’s case to maximize their benefit.

4.3 Limitations and Future Work

In the following, we briefly discuss the limitations of our approach and the way forward.

– The output shows statistics and also provides an option to view all the text spans for a chosen categorization. The text output may contain several text spans and may become cumbersome for the user to go through. In future, we plan to extract more fine-grained information from the text spans. For example, the residence category will return all the sentences related to residence. Still, the user may require specific information such as the exact location of the residence of the child. Extracting such fine-grained information is non-trivial. We can train structured prediction machine learning models, but such models require extensive annotated data.
– Users, especially non-legal users, may find pieces of information difficult to comprehend even though they are contextualized to their profile. In ongoing work, we plan to present recommendations that suggest a possible course of action to the user. As an example, consider possible decision-making suggestions from Section 4.2 based on case category and parties.
In the examples we provided, we suggest that the user (partaking in a divorce case) become aware of the involvement of alternate parties and (in the case of PA cases) of the fact that such cases are often left pending for review. We can prepare an individual recommendation for the first example as follows: your case may involve bank and curator as alternate parties; it is recommended to enquire and plan about the inclusion of such parties. For the second example, the recommendation would be: your case involves a child with an institution. Such cases get pending for a review; it is recommended to include this in defence or plan for the duration. We can treat parts of these recommendations as sections of a template in which certain blanks are filled with the contextualized information. By combining all of the recommendations for a particular case, it is possible to create a narrative applicable to the case that considers the case categories, the parties, previous verdicts, and facts and appeals.

5 Related Work

Ontologies in legal domain. Ontologies and modelling in the legal domain are a widely researched topic. Several papers describe processes to create legal ontologies [11] and capture legal knowledge in several ways. Most work on legal ontologies focuses on designing part of the ontology and models and on how they can be used for tasks such as document classification, search and retrieval. In contrast, our approach provides a way to create a conceptual model based on a legal case metamodel and further use it to generate text spans and statistics based on the user profile.

Text Summarization. Considerable research exists on legal text summarization [9]. Farzindar et al. [6] created the LetSum system, which uses a linguistic approach to extract summaries in the form of tables from Canadian legal cases. Using this, Chieze et al.
[5] developed DecisionExpress, which can extract information related to judges, tribunals, the subject of the information, conclusions of the judgment and a brief description of a case in both English and French for Canadian legal cases.

Most legal summarization works produce summaries that could be considered a coarser version of the insights that our system generates. Besides creating summaries of entire documents, these works do not distinguish aspects corresponding to our metamodel. Finally, unlike our approach, these works do not enable customizing the summaries for a given user profile.

Information Extraction. Most legal information extraction work focuses on techniques for extraction rather than on a conceptual scheme for arranging the extracted information in a manner useful for decision-making. For instance, several works focus on document segmentation and legal entity extraction [3], legal text indexing and argument extraction [13], and legal event extraction [2].

We distinguish our approach from the works cited above in terms of the conceptual framework that our approach offers. Besides, the focus of our approach is on generating insights that are easily understandable even for non-legal users, including the parties involved in the legal cases.

6 Conclusion

Past legal cases can provide insights such as which parties generally get involved, what the previous verdicts were for specific facts, and which reasons lead to given decisions. We presented a conceptual model-driven approach that enables both non-legal parties and legal professionals to obtain such insights by responding to a set of profiling questions. The users can consider these insights for decision-making. Our results on two case datasets, on parental alienation and divorce cases, demonstrate the utility of our approach. We continue to work on extracting and presenting more fine-grained legal insights.

References

1. Alexander, B.: LKIF Core: Principled ontology development for the legal domain. Law, ontologies and the semantic web: channelling the legal information flood 188, 21 (2009)
2. de Araujo, D.A., Rigo, S.J., Barbosa, J.L.V.: Ontology-based information extraction for juridical events with case studies in Brazilian legal realm. Artificial Intelligence and Law 25(4), 379–396 (2017)
3. Bommarito II, M.J., Katz, D.M., Detterman, E.M.: LexNLP: Natural language processing and information extraction for legal and regulatory texts. arXiv preprint arXiv:1806.03688 (2018)
4. Breuker, J., Elhag, A., Petkov, E., Winkels, R.: Ontologies for legal information serving and knowledge management. In: Legal Knowledge and Information Systems, Jurix 2002: The Fifteenth Annual Conference. pp. 1–10 (2002)
5. Chieze, E., Farzindar, A., Lapalme, G.: An automatic system for summarization and information extraction of legal information. In: Semantic processing of legal texts, pp. 216–234. Springer (2010)
6. Farzindar, A., Lapalme, G.: LetSum, an automatic legal text summarizing system. Legal knowledge and information systems, JURIX pp. 11–18 (2004)
7. Fawei, B., Pan, J.Z., Kollingbaum, M., Wyner, A.Z.: A semi-automated ontology construction for legal question answering. New Generation Computing 37(4), 453–478 (2019)
8. Hetherington, E.M.: Divorce: a child’s perspective. American psychologist 34(10), 851 (1979)
9. Kanapala, A., Pal, S., Pamula, R.: Text summarization from legal documents: a survey. Artificial Intelligence Review 51(3), 371–402 (2019)
10. Kelly, J.B., Johnston, J.R.: The alienated child: A reformulation of parental alienation syndrome. Family court review 39(3), 249–266 (2001)
11. Soares, A.A., Martins, P.V., da Silva, A.R.: LegalLanguage: A domain-specific language for legal contexts. In: Enterprise Engineering Working Conference. pp. 33–51. Springer (2019)
12. Sunkle, S., Kholkar, D., Kulkarni, V.: Comparison and synergy between fact-orientation and relation extraction for domain model generation in regulatory compliance. In: Conceptual Modeling - 35th International Conference, ER 2016, Gifu, Japan, November 14-17, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9974, pp. 381–395 (2016). https://doi.org/10.1007/978-3-319-46397-1_29
13. Wyner, A., Mochales-Palau, R., Moens, M.F., Milward, D.: Approaches to text mining arguments from legal cases. In: Semantic processing of legal texts, pp. 60–79. Springer (2010)