Stephan Ralescu and Anca Ralescu MAICS 2017 pp. 29–32

Is there a place for Machine Learning in Law?

Stephan Ralescu
CETANA, LLC
ralescu@gmail.com

Anca Ralescu
Senior Member IEEE
EECS Department, ML 0030
University of Cincinnati
Cincinnati, OH 45221, USA
Anca.Ralescu@uc.edu

Abstract

Research in artificial intelligence and law goes back approximately 40 years. It remains largely based on formal logic, including non-monotonic logic, case-based reasoning, and logic programming. However, some researchers in, and practitioners of, law have argued in favor of quantitative approaches (e.g. probability) to account for uncertainties in legal arguments. Other researchers have pointed out some of the shortcomings of current artificial intelligence and law research, e.g. the inability to take context into account. At the same time, machine learning has made huge inroads in many different fields and applications, and therefore the question is whether machine learning has anything to offer to the theory and, equally important, the practice of law. As a position paper, this is a preliminary study towards the exploration of a synergistic integration of current artificial intelligence approaches in law with machine learning approaches. It puts forward the idea that formal, logic-based approaches, currently very popular in Artificial Intelligence & Law research, could benefit from an extension with a machine learning component, and discusses some ways in which machine learning could be integrated into these approaches.

Introduction

When it comes to machine learning and law, there are two quite unrelated directions of study. On the one hand, researchers are interested in legal issues raised by research in machine learning. For example, the symposium Machine Learning and the Law, held in conjunction with NIPS-2016 [1], had as its goal to "explore the key themes of privacy, liability, transparency and fairness specifically as they relate to the legal treatment and regulation of algorithms and data." On the other hand, the second direction is that of the actual use of machine learning in law research and practice.

[1] Annual meeting of the Neural Information Processing Systems conference.

Stimulated on the one hand by progress in, as well as by shortcomings of, artificial intelligence approaches in law (as perceived by various researchers), and on the other hand by the tremendous recent machine learning successes in many different directions (not including law), this paper suggests that there is scope for using machine learning in law and, moreover, that a formal treatment may be used towards this end.

The paper is inspired by work on logic-based formalization of legal reasoning (Prakken and Sartor 1996), (Prakken and Sartor 1997), (Prakken and Sartor 1998), (Prakken and Sartor 2002), (Sartor 2002), as well as by ideas from (Tillers 2011), (Tillers 1993), (Franklin 2012) making the case for continuous mathematics tools (probability, mathematical evidence, fuzzy sets and logic), and by the promise [2] that machine learning holds for law. An example of a predictive system can be found in (Campbell et al. 2016), for the restricted area of patent law.

[2] According to Google's Rob Craft, "We are currently at year zero of the machine learning revolution." (Singh 2016)

As in many areas of research, ranging from science, engineering, and medicine to the social sciences, including the legal field, artificial intelligence has brought about possibilities which excited some and intrigued others. Pioneering work done by Edwina Rissland and her students and collaborators (Rissland and Skalak 1991), (Skalak and Rissland 1992), and by Hafner and Berman (Hafner 1978), (Berman and Hafner 1993) has gone a long way towards understanding the promise and challenges that face the formalization of legal reasoning, with the goal of developing a computer system.

Artificial intelligence and law

Formal logic is the approach of choice for artificial intelligence and law, as evidenced by a wealth of articles, including those already mentioned and others (Bench-Capon 1997), (Prakken and Sartor 1996), (Prakken and Sartor 1997), (Prakken and Sartor 1998), published in a series of artificial intelligence journals, including the specialized journal Artificial Intelligence and Law [3]. A critical review of the logic-based approach can be found in (Prakken and Sartor 2002).

[3] https://link.springer.com/journal/10506

Moreover, analyzing the current results, Franklin (Franklin 2012) lists challenges for the formalization of legal reasoning that are not yet met by current artificial intelligence approaches in law. These challenges include:

1. "The open-textured or fuzzy nature of language (and of legal concepts)"
2. "Degrees of similarity and analogy"
3. "The representation of context"
4. "The symbol-grounding problem"
5. "The representation of causation, conditionals and counterfactuals"
6. "The balancing of reasons"
7. "Probabilistic (or default or non-monotonic) reasoning (including problems of priors, the weight of evidence and reference classes)"
8. "Issues of the discrete versus the continuous"
9. "Understanding"

Franklin discusses the use of fuzzy set based approaches to capture the nature of some concepts, or the similarity of a case to precedents. As an example, he considers the concept 'vehicle' in the ordinance "No vehicles are allowed in the park", which obviously would refer first and foremost to cars, less to motorcycles and bicycles, and even less to roller skates. A fuzzy set of vehicles, defined by a membership function \( \mu_{vehicle} : U \to [0, 1] \), where \( U \) denotes a universe of discourse of 'things', would assign different degrees to cars, motorcycles, bicycles, and roller skates, for example, respectively

\[
\mu_{vehicle}(v) =
\begin{cases}
1 & \text{if } v \text{ is a car} \\
0.8 & \text{if } v \text{ is a motorcycle} \\
0.5 & \text{if } v \text{ is a bicycle} \\
0.1 & \text{if } v \text{ is "roller skates"}
\end{cases}
\]

The ordinance has as its goal the prevention of accidents in the park, and as a consequence, the definition of the fuzzy set is meant to reflect the common sense knowledge that cars can cause serious accidents, motorcycles less, bicycles even less, and so on. The actual assignment of membership degrees is seen in (Franklin 2012) as one of the difficulties of adopting a fuzzy set based approach. However, researchers in fuzzy systems know that while this issue is not trivial, fuzzy set based approaches have a rich collection of choices to address it, including learning the membership function. Furthermore, it should be noted that (1) in many applications, the relative magnitudes of the membership degrees matter more than their actual magnitude, and (2) where the absolute magnitudes matter, they could and should be subject to a (machine) learning approach.

The issue of similarity is of utmost importance in legal reasoning, and to illustrate the difficulties in similarity evaluation (Franklin 2012) refers to a celebrated case, Popov v Hayashi, centered on the issue of possession [4]. The two precedents considered for the case both involved hunting: in the first, hunting a fox with hounds did not confer rights of possession; in the second, the whale harpooned by one individual and found by another on the beach was found, based on the customs of whalers, to belong to the man who harpooned it, not to the one who found it. The decision in Popov v Hayashi was that Popov and Hayashi had equal interests in the ball; to reach such a decision, issues such as context and continuity play an important role, and an intelligent (artificial intelligence) legal system must be able to deal with such issues. According to Franklin (Franklin 2012), none of the current artificial intelligence in law approaches, based on similarity with the two precedents, could have actually reached this decision. Achieving it would require quantitative approaches including probability, fuzzy sets, and evidential reasoning, which may go a long way to complement logic based approaches towards artificial intelligence based law systems.

[4] http://www.miblaw.com/lawschool/popov-v-hayashi-2002-wl-31833731-cal-super-ct-2002/: Popov v. Hayashi 2002 WL 31833731 (Cal. Super. Ct. 2002). Case Name: Popov v. Hayashi. Plaintiff: Popov. Defendant: Hayashi. Citation: 2002 WL 31833731 (Cal. Super. Ct. 2002). Issue: Whether the defendant is liable for conversion when he picked up the home run ball that was dropped by the plaintiff. Key Facts: Barry Bonds 73rd Homerun. The plaintiff caught the ball in the upper portion of his glove but was tackled and thrown to the ground by the crowd. The ball fell out and the defendant picked it up and put it in his pocket. The plaintiff sued for conversion. Holding: The plaintiff and defendant had equitable claims and could not prove their case either way. Reasoning: Although the plaintiff proved intent to possess the ball, he could not establish that he would have fully possessed the ball had he not been tackled by the crowd. If he could have established this, his pre-possessory interest would have constituted a qualified right to possession which can support a cause of action for conversion. Judgment: The ball was sold for $450,000 and the proceeds were divided equally.

Machine learning in rules with legal values

In (Sartor 2002) several (legal) theory constructors are given in terms of rules and the (legal) values promoted by them, in order to formalize the legal argument. First, factors, i.e. abstract features of a case which may influence the outcome of the case, are considered. Following (Berman and Hafner 1993), values underlying a case are introduced. For example, \( \pi\mathit{Liv} \) stands for the fact "\( \pi \) was pursuing his livelihood" (\( \pi \) denotes the plaintiff), and \( \delta\mathit{Nposs} \) stands for "\( \delta \) (\( \delta \) denotes the defendant) was not in possession". A legal value \( V \) is an objective pursued by the legal argument. Examples of values include Less Litigation (\( \mathit{LLit} \)), More productivity (\( \mathit{MProd} \)), and More security of possession (\( \mathit{MSec} \)). A case may be formalized as a collection of rules such as

\[
\begin{aligned}
&\lceil \pi\mathit{Liv} \Rightarrow \Pi \rceil \text{ promotes } \mathit{MProd} \\
&\lceil \pi\mathit{land} \Rightarrow \Pi \rceil \text{ promotes } \mathit{MSec} \\
&\lceil \pi\mathit{Nposs} \Rightarrow \Pi \rceil \text{ promotes } \mathit{LLit} \\
&\lceil \delta\mathit{Liv} \Rightarrow \Delta \rceil \text{ promotes } \mathit{MProd}
\end{aligned}
\tag{1}
\]

where \( \pi\mathit{Liv} \Rightarrow \Pi \) means "\( \pi \) was pursuing his livelihood is a reason why \( \pi \) should have a legal remedy against \( \delta \)".

To formalize, following (Sartor 2002), let \( \{V_i, i = 1, \ldots, n\} \) be a collection of legal values, where a minimal approach to ordering is adopted, such that the theory may specify

\[
V_i < V_j; \quad i \neq j
\]

Moreover, it is assumed that

\[
V_i < V_i \cup \bigcup_{j \neq i} V_j \tag{2}
\]

Replacing \( \cup \) in (2) by the maximum \( \vee \), and using \( \wedge \) for minimum, it follows that

\[
\begin{aligned}
V_i &< \max\Big(V_i, \bigvee_{j \neq i} V_j\Big) \\
V_i &> \min\Big(V_i, \bigwedge_{j \neq i} V_j\Big)
\end{aligned}
\tag{3}
\]

Equality of values must also be specified as part of the theory. Then, enlarging upon (Sartor 2002), given the rules

\[
\begin{aligned}
&\lceil \alpha_1 \Rightarrow \Delta \rceil \text{ promotes } V_1 \\
&\lceil \alpha_2 \Rightarrow \Delta \rceil \text{ promotes } V_2 \\
&\qquad \ldots \\
&\lceil \alpha_n \Rightarrow \Delta \rceil \text{ promotes } V_n
\end{aligned}
\tag{4}
\]

one can construct the following:

\[
\lceil \alpha_1 \,\&\, \alpha_2 \,\&\, \ldots \,\&\, \alpha_n \Rightarrow \Delta \rceil \text{ promotes } [\min(V_1, V_2, \ldots, V_n), \max(V_1, V_2, \ldots, V_n)]
\]

Thus, considered together, rules (4) promote at least the smallest value, at most the largest value, and possibly values in between, i.e., those which lie in \( [\min(V_1, V_2, \ldots, V_n), \max(V_1, V_2, \ldots, V_n)] \). All promoted values can be expressed as convex combinations of \( V_1, V_2, \ldots, V_n \), that is,

\[
\lceil \alpha_1 \,\&\, \ldots \,\&\, \alpha_n \Rightarrow \Delta \rceil \text{ promotes } w_1 V_1 + \cdots + w_n V_n \tag{5}
\]

where \( w_i \geq 0 \), \( i = 1, \ldots, n \), and \( w_1 + \cdots + w_n = 1 \). For different values of \( w_i \), \( i = 1, \ldots, n \), (5) can generate any subset of the set of values \( \{V_i, i = 1, \ldots, n\} \). Interpreted as a probability, \( w_i = \mathrm{Prob}(\Delta \text{ to promote } V_i) \) can be obtained through a machine learning algorithm based on the history of (similar) cases.

The mechanism outlined above has the effect of producing a continuum of legal values (even though, to begin with, these form a discrete set), which in turn may lead to a continuum of possible decisions.

Inference and machine learning - legal theory and practice

This section touches upon the issue of inference in legal reasoning. It takes its cue from (Tillers 1993) and references therein, according to which "the governing assumption of this body of law has been that all or practically all facts are uncertain and that proof of facts is always or almost always a matter of probabilities". The necessity of mathematical models of uncertainty (currently missing) in legal reasoning is furthermore discussed in (Tillers 2011) and (Franklin 2012), among others.

Since complex arguments about inferences from evidence rest on almost innumerable subjective judgments, (Tillers 2011) proposes several purposes for mathematical and formal analysis of inconclusive arguments about uncertain factual questions in legal proceedings, as follows:

1. "To predict how judges and jurors will resolve factual issues in litigation.
2. To devise methods that can replace existing methods of argument and deliberation in legal settings about factual issues.
3. To devise methods that mimic conventional methods of argument about factual issues in legal settings.
4. To devise methods that support or facilitate existing, or ordinary, argument and deliberation about factual issues in legal settings by legal actors (such as judges, lawyers and jurors) who are generally illiterate in mathematical and formal analysis and argument."
5. To devise methods that capture some but not all 'ingredients of argument' in legal settings about factual questions.

It can be claimed that achieving these purposes (predict, replace, mimic, support) falls into the machine learning realm, requiring machine learning algorithms of possibly different levels of sophistication.

Machine learning in the practice of law – the low hanging fruit

From the point of view of a typical approach to machine learning, data (usually, a lot) is needed to construct a machine learning algorithm - a classifier, or a clustering algorithm. Usually, such data is thought of as history on which to base future predictions. The need to take history into account is discussed in the conclusion section of (Sartor 2002), which suggests a history-subtheory. That would add a 'sense of history' to a case, predicting a judge's handling of a case based on that judge's history of opinions and their context. All of these could be attacked by machine learning methods. Issues concerning the representation of an argument or of an opinion, and measures of similarity, must be considered. Law, like other social sciences, seldom uses a quantitative language; rather, it is text-based. This means that solving the issues mentioned above is not trivial.

Using machine learning to analyze judges' personalities and ruling tendencies could help tailor pleadings to their personalities. Machine learning could also help analyze attorney personalities and use those to decide who writes what in a law firm, and to evaluate a firm's previous work and identify strengths, weaknesses and faults.

This added dimension to legal theory and practice straddles several disciplines, including psychometry, representation of uncertainty (e.g., fuzzy logic to represent meanings of utterances, and similarity measures), and probability (point, interval valued or imprecise probabilities), all to be used in machine learning to build predictive algorithms of behavior.

A recent example of behavior prediction, with far reaching consequences, comes from the 2016 USA presidential elections: Cambridge Analytica [5] used machine learning to specifically target independents and other voters disenchanted with the status quo, with messages that appealed to their personalities. A similar system that does the same for judges and courts could help build a litigation strategy, tailor language, and develop legal reasoning to the personality of the particular court.

[5] https://cambridgeanalytica.org/

Conclusion

We have discussed some preliminary ideas on the challenges/issues that law research faces, which could be approached from a machine learning point of view. This paper only hinted at these issues and possible solutions using machine learning. Much is to be done, including a very thorough understanding of quantitative ideas in legal theory put forward by researchers in the legal profession.

Acknowledgments

The authors are grateful for the reviewers' comments, some very enthusiastic, some very negative, on the first draft of this paper. All served to stimulate the authors' thinking on how to further approach the use of machine learning in law, and are likely to inform their future study of this field.

References

Bench-Capon, T. 1997. Argument in artificial intelligence and law. Artificial Intelligence and Law 5(4):249–261.

Berman, D. H., and Hafner, C. D. 1993. Representing teleological structure in case-based legal reasoning: the missing link. In Proceedings of the 4th International Conference on Artificial Intelligence and Law, 50–59. ACM.

Campbell, W.; Li, L.; Dagli, C.; Greenfield, K.; Wolf, E.; and Campbell, J. 2016. Predicting and analyzing factors in patent litigation. NIPS2016, ML and the Law Workshop.

Franklin, J. 2012. Discussion paper: How much of commonsense and legal reasoning is formalizable: A review of conceptual obstacles. Law, Probability and Risk 11:225.

Hafner, C. D. 1978. An information retrieval system based on a computer model of legal knowledge.

Prakken, H., and Sartor, G. 1996. A dialectical model of assessing conflicting arguments in legal reasoning. In Logical Models of Legal Argumentation. Springer. 175–211.

Prakken, H., and Sartor, G. 1997. Argument-based extended logic programming with defeasible priorities. Journal of Applied Non-Classical Logics 7(1-2):25–75.

Prakken, H., and Sartor, G. 1998. Modelling reasoning with precedents in a formal dialogue game. In Judicial Applications of Artificial Intelligence. Springer. 127–183.

Prakken, H., and Sartor, G. 2002. The role of logic in computational models of legal argument: a critical survey. In Computational Logic: Logic Programming and Beyond. Springer. 342–381.

Rissland, E. L., and Skalak, D. B. 1991. Cabaret: rule interpretation in a hybrid architecture. International Journal of Man-Machine Studies 34(6):839–887.

Sartor, G. 2002. Teleological arguments and theory-based dialectics. Artificial Intelligence and Law 10(1-3):95–112.

Singh, J. 2016. The tech-legal aspects of machine learning: Considerations for moving forward. NIPS2016, ML and the Law Workshop.

Skalak, D. B., and Rissland, E. L. 1992. Arguments and cases: An inevitable intertwining. Artificial Intelligence and Law 1(1):3–44.

Tillers, P. 1993. Intellectual history, probability, and the law of evidence.

Tillers, P. 2011. Trial by mathematics - reconsidered. Law, Probability and Risk 10(3):167–173.
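
As an illustration of the convex combination in (5), where the weights are interpreted as probabilities obtained from the history of similar cases, the following minimal Python sketch may be useful. It is only one possible reading, under stated assumptions: the paper does not fix a learning algorithm, so smoothed relative frequencies stand in for it here; the numeric scores attached to the values LLit, MSec and MProd, the case history, and the function names `estimate_weights` and `convex_combination` are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def estimate_weights(history, values, smoothing=1.0):
    """Estimate w_i as the smoothed relative frequency with which value V_i
    was promoted in past (similar) cases.

    Assumption: Laplace-smoothed counting is a stand-in for whatever
    machine learning algorithm would actually be used.
    """
    unknown = set(history) - set(values)
    if unknown:
        raise ValueError(f"history mentions unknown values: {sorted(unknown)}")
    counts = Counter(history)
    total = len(history) + smoothing * len(values)
    return {v: (counts[v] + smoothing) / total for v in values}


def convex_combination(weights, scores):
    """Return sum_i w_i * V_i, checking w_i >= 0 and sum_i w_i = 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[v] * scores[v] for v in weights)


# Hypothetical numeric scores for the legal values (not from the paper).
scores = {"LLit": 0.2, "MSec": 0.5, "MProd": 0.9}

# Hypothetical history: the value promoted in each of six similar past cases.
history = ["MProd", "MProd", "MSec", "LLit", "MProd", "MSec"]

weights = estimate_weights(history, list(scores))
promoted = convex_combination(weights, scores)

print(weights)
print(promoted)
```

Because the weights are non-negative and sum to one, the promoted value always lies between the smallest and the largest of the individual scores, which is the interval described before (5). Replacing the frequency estimate with a trained classifier over case features would leave the combination step unchanged.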