=Paper=
{{Paper
|id=Vol-2632/MIREL-19_paper_5
|storemode=property
|title=Machine Learning and the General Repercussion on Brazilian Supreme Court: Applying the Victor Robot to Legal Texts
|pdfUrl=https://ceur-ws.org/Vol-2632/MIREL-19_paper_5.pdf
|volume=Vol-2632
|dblpUrl=https://dblp.org/rec/conf/jurix/HartmannB19
}}
==Machine Learning and the General Repercussion on Brazilian Supreme Court: Applying the Victor Robot to Legal Texts==
Fabiano Hartmann<sup>1</sup>, Debora Bonat<sup>2</sup>

<sup>1</sup> PhD, Law School, University of Brasília, Brasília, Brazil – ppgd.unb.br https://unb.academia.edu/FabianoHartmann

<sup>2</sup> PhD, Law School, University of Brasília, Brasília, Brazil – ppgd.unb.br https://deborabonat.academia.edu

Abstract. This paper presents the development of an instrumental solution to a need that arose from an artificial intelligence project, later called the Victor robot project. The Victor robot demands a methodological combination of reasoning from software engineering, computer science and law. Because the project is unprecedented, all researchers must build knowledge intensively while working across different thought processes and languages, and with very specific legal texts in a huge volume of data. In its second part, the paper presents some sui generis features of general repercussion as a constitutional filter and as a possible field of ontological development for machine learning, which are important for understanding its potential application in Supreme Court activity. Finally, the paper presents some steps of the project, which is still in progress but is already considered the largest artificial intelligence project in the Brazilian judiciary, which has 100 million cases in stock.

Keywords: Victor project, machine learning, methodology, general repercussion, text classification, decision support.

===1 Introduction===

Footnote 1: {Copyright © 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).}

The first goal of this paper is to present the development of an instrumental solution to a need that arose from the work plan of an artificial intelligence project between the University of Brasília and the Brazilian Supreme Court (STF), titled "Research and development project on machine learning about judicial data on general repercussion in the Supreme Court" and later called the Victor Robot Project. The Victor robot project demands knowledge and researchers from software engineering, computer science and law. Because the project is unprecedented, all researchers must build knowledge intensively while working across different thought processes, languages and very specific methodologies. Hence the need for an integrative methodology that allows advanced artificial intelligence instruments to be implemented in the legal area. For this development, values and guidelines must be very clear; they are addressed here by the deductive method, starting from findings on Artificial Intelligence (AI), law and agile methods.

The work plan foresees the engagement of three areas of knowledge (software engineering, computer science and law) in developing a unique, innovative solution in an environment with thousands of lawsuits and millions of legal texts, all of it entirely unstructured data.
This scenario poses a series of challenges for areas that are traditionally structured on diametrically different rationalities but that can, in a convergent and synergistic way, develop the steps (stages, phases) needed to achieve a common central goal: the development of a system based on Machine Learning (ML) algorithms and, possibly, deep neural networks, to be used in a specific phase of the judicial process called "general repercussion", a Brazilian adaptation (Brazil being a typical civil law country) of an approach found in the common law tradition. The use of the system will allow improvement in the handling of the volume of judicial proceedings in Brazil.

To start the planned research work, a methodology was customized; it is more familiar to the software development area and still incipient in legal studies. The first objective of this paper is to present and justify these choices. As two of the main critical factors of the project were a short calendar and a strict budget, together with the innovative character of the research and its dynamic need for correction and adaptation, the agile Scrum methodology presented itself as extremely interesting. Nevertheless, the legal character of the research demanded a few adjustments to the methodology framework.

In line with the adjustments needed to justify a specialized research methodology, some conceptual cuts had to be fixed in order to bound the research field properly. To this end, the paper presents the concept of general repercussion and its impact on the stock of cases, as well as the strategies for applying machine learning to identify and classify general repercussion themes.

This line of thinking is valid not only for lawyers but for all legal practitioners. Activities in legal administration logistics that could be performed in a fraction of the time with a high level of accuracy may benefit, allowing human talent to be concentrated in strategic areas.
The benefits of research on AI and law are still underestimated, since the possible combinations of human knowledge and AI tools cannot yet be delimited with any precision. It is possible, however, to define in advance some topics that should necessarily guide AI research so as to maximize social benefits with accuracy and speed. This research belongs to that field.

===2 A methodology to legal text classification and decision support solution===

====2.1 Rating decision support solution with agile method====

As two of the main critical factors of the project were a short calendar and a strict budget, together with the innovative character of the research and its dynamic need for correction and adaptation, the agile methodology presented itself as extremely interesting. Nevertheless, the legal character of the research demanded a few adjustments to the methodology framework<sup>2</sup>.

A crucial issue for accuracy is to identify, or decide, the level of symmetry between computer activity and what really happens in human reasoning. With this perception, the story of the seven blind wise men makes sense again: specificities, contingencies and biases, although possibly existing or true (in the sense of the Indian story), do not match the holistic, global or collective concept. This perception is crucial to define any work method between AI and law<sup>3</sup>.

Footnote 2: Therefore, it is not the purpose of this paper to present 1) the work plan of the "Machine Learning of the general repercussions in the Brazilian Supreme Court" project or 2) the state of the art in artificial intelligence (AI) and law. These topics are addressed here only to the extent needed to support the methodological definition. They are part of the broader research and will be released when opportune.

Footnote 3: An in-depth historical view of the development process and AI perspectives is given in the paper "A review of artificial intelligence", by E.S. Brunette, R.C. Flemmer and C.L. Flemmer.
School of Engineering and Advanced Technology of Massey University, New Zealand. There, beyond the historical perspective, the authors record Mikawa's (2004) work on the variables of the human mind: "consciousness, preconsciousness and unconsciousness. In this model most data processing is done in the non-conscious states. He therefore proposed a system where the level of information processing changed, based on visual information being received. In his model, external information processing is conducted when the robot is awake. However, when the robot is in sleep mode, external information processing is reduced and more internal information processing is conducted." (BRUNETTE, 2009, p. 386) [All content following the pages of the paper was uploaded by Claire Flemmer on 25 March 2014.]

The synergy between artificial intelligence and law was identified by Professor Edwina L. Rissland, of the Department of Computer Science at the University of Massachusetts, in a well-known 2003 article. There, Rissland [16, 2003] situated AI and Law as a distinct research field within AI. These studies broadened the agenda beyond typical legal topics (such as dogmatics, hermeneutics, legal argumentation, theories of justice or the best decision) towards insights into, and the internal logic of, legal practice more broadly. Within the general AI research outlook, one can identify nuances and limitations of existing techniques when applied to law, as well as elements that catalyze the development of new, sustainable approaches. Rissland's work is strengthened by the observed consolidation of standard theories of legal argumentation, with refined designs of models to analyze and evaluate legal reasoning.

Extreme Programming (XP)<sup>4</sup>, a process characteristic of software engineering created by Kent Beck, is a set of development practices guided by values.
These practices and values seek to address the most common difficulties in software development, namely missed deadlines and overspending. Because XP deals with tightly delimited values and equally simple practices, applying them together shows considerable synergistic potential. Better decisions, quick answers, efficient communication and wise investment of effort are noble goals for any research, especially research with the characteristics pointed out above<sup>5</sup>.

There is emphasis on the interactions and personal aspects of the teams involved. The process itself, as well as its tools, is one of development and innovation. The developed work is important, as is customer collaboration, which was probably a defining factor in the choice of methodology (with the guidelines of law set out above, and the characteristics of its legal language, working as the customer). This metaphorical use of law as a customer allows the process to be tolerant of the unexpected (typical of innovation research) and to give quick answers to problems, enabling changes, tests and adjustments<sup>6</sup>: a typical Victor robot routine!

The agile method has the characteristics of a cycle (SDLC)<sup>7</sup>, which were very well grouped by [13, p. 198-199]:

Table 1. Characteristics of the agile method
{| class="wikitable"
! Number !! Characteristic
|-
| 1 || Conduct a survey and assess the feasibility of the information systems development project.
|-
| 2 || Study and analyze the information systems that are running.
|-
| 3 || Determine the requests of the information system users.
|-
| 4 || Select the best solution or problem solving.
|-
| 5 || Determine the hardware and software.
|-
| 6 || Design a new information system.
|-
| 7 || Build a new information system.
|-
| 8 || Communicate and implement the new information system.
|-
| 9 || Maintain and repair / improve the new information system if necessary.
|}

What should guide the application of the agile methodology is the verification of its potential benefits compared with a traditional work methodology. This paper agrees with the statement. There is a series of studies in this sense, and the main reasons for its use are empirically verified<sup>8</sup>.

The choice of the agile method demands rigor; otherwise there is a risk that the benefits do not outweigh the difficulties and imprecisions. [18] synthesizes this need in three criteria: 1) the establishment of individual practice metrics, even when they are qualitative and not quantitative; 2) documentation and communication; and 3) when applying the theory, combining the nature of the agile method with the work environment. This convergence is mandatory.

Footnote 4: It is not the intention of the present paper to develop a dissertation on the subject.

Footnote 5: Tripp, Saltz and Turk recently restated the cautions for the use of the agile methodology, which guided the customization for the present project: "We believe that agile in software development is an “instance” of agility in projects. Agile can be used in many other project contexts beyond software development, […] Some foundational questions include: • What guidance can we provide to create and sustain better agile and lean behaviors and more successful outcomes? • How can we incorporate other functions, such as architecture and production support, into agile and lean frameworks? • How can organizations and cultures restructure to support these philosophies? • What are the measurable outcomes of using agile techniques? • What additional metrics might a team use to measure team performance? • What are the measurable differences in outcomes when using traditional vs agile techniques? • What are ways that we can create a repository of knowledge, experiences, cases and empirical data that could be used by research and industry to leverage and expand our understanding of and practical skills in agile techniques?" [18, p. 5470]

Footnote 6: "There are several team and environmental characteristics that drive the extent to which agile methods and practices can achieve their full potential. Hence, when conceptualizing the potential impact of agile methods, researchers must consider and document the characteristics of the environment that enable agile practices to be successfully implemented." [18, p. 5466]

Footnote 7: SDLC (Software Development Life Cycle) "is the stages of work performed by system analysts and programmers in building an information system." [13, p. 198-199]

Footnote 8: There are several studies that "Describe and measure the team and environmental characteristics of the project, 2) Measure the use of multiple agile practices, either qualitatively or quantitatively, and 3) Illustrate theoretically how and when the unique nature of agile methods influences outcomes." [18, p. 5466]

===3 General repercussion elements for Machine Learning application===

====3.1 General repercussion as a filter====

The Brazilian institute of general repercussion was established to operate as an appellate filter, preventing the Court from hearing extraordinary appeals whose constitutional question is irrelevant or of sole and exclusive interest to the parties. A cause is understood to demonstrate relevance when it has real and undoubted importance, standing out against other causes that involve the same object. Thus, relevant issues are those of great value or interest, which is why they call for guidance from the Federal Supreme Court.

With regard to the requirement of transcendence, the cause cannot be limited to matters of individual interest. That is why scholars and case law reinforce the need for transcendence: the analysis must go beyond individual interests and reveal a true collective interest in the examination of the matter.
In this way, general repercussion allows, in cases founded on the same controversy, one or more appeals that adequately and fully represent the leading cases to be selected for analysis by the Supreme Court, according to the Code of Civil Procedure; the effects of the decision on a single appeal will bind the many others on the same matter, which will bring greater effectiveness and celerity to jurisdictional activity.

This scenario goes back to the adoption of the system of precedents by the Brazilian procedural system. The flow of administrative tasks, both for directing the judicial process itself and for developing services in support of judicial activity, has consumed a great deal of effort, time, resources and other valuable elements of the public service. It is not new that these flows have become ever more precise and produce many metrics of efficiency and effectiveness.

Therefore, there is a good substrate for applying approved and implemented Artificial Intelligence (AI) and for measuring its gains. Taking the idea of a flowchart as the representation of a sequence of events that maps some types of decision, the tasks often involve treating, reading and understanding data, classifying it and comparing it with pre-existing parameters, or indicating a possible new standard. Terminology may vary according to the representation in the flow, but the activities usually pass through a recognition of the data (optical or by another sensory means), a structuring of this data (a form of storage and organization), an optimization of the information relevant to some form of classification, and a decision on the path to follow in the flow or, possibly, a change of flow.
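The sequence just described (recognition, structuring, classification, decision on the path) can be sketched as a minimal pipeline. This is only an illustration under assumed names: the `Document` record, the `RULES` keyword table and every function below are hypothetical and are not taken from the Victor project, and a simple keyword count stands in for a trained classifier.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    raw: str                                   # stand-in for text produced by a recognition step
    tokens: list = field(default_factory=list)
    label: str = ""
    route: str = ""

# Hypothetical keyword rules; a real system would use a trained model instead.
RULES = {
    "tax": ["tax", "icms"],
    "pension": ["pension", "retirement"],
}

def recognize(raw: str) -> str:
    """Stage 1: normalise the recognised text (collapse whitespace, lower-case)."""
    return " ".join(raw.split()).lower()

def structure(text: str) -> list:
    """Stage 2: store the text in an organised form (here, a list of tokens)."""
    stripped = (t.strip(".,;:()") for t in text.split())
    return [t for t in stripped if t]

def classify(tokens: list) -> str:
    """Stage 3: compare with pre-existing parameters; 'unknown' flags a possible new pattern."""
    scores = {label: sum(tokens.count(k) for k in kws) for label, kws in RULES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def decide_route(label: str) -> str:
    """Stage 4: choose the path to follow in the flow."""
    return "human_review" if label == "unknown" else f"queue:{label}"

def run_flow(raw: str) -> Document:
    doc = Document(raw=raw)
    doc.tokens = structure(recognize(raw))
    doc.label = classify(doc.tokens)
    doc.route = decide_route(doc.label)
    return doc
```

Keeping each stage as a separate function mirrors the flowchart view: any stage can be replaced (for example, the keyword rules by a learned classifier) without touching the others.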
For the first step, AI can contribute, for example, to the development of "knowledge", that is, with optical character recognition or similar technology that recognizes text characters in images and transforms them into editable text. Even the recognition of handwritten images is possible. For structuring data, AI can contribute, for example, with organization algorithms that form structures oriented by their function, purpose and storage conditions (classic vectors, lists, stacks, trees, etc.). Both the form of organization and the methods used must follow a methodology appropriate to the characteristics of judicial data. Classification is one of the most frequent features offered by AI. For example, a multi-layered neural network architecture can perform, with very acceptable parameters of accuracy, verification and ethical validation, data treatment and structuring services so as to function as a real input data sorter.

===4 Text classification and decision support solution combined: the Victor robot===

====4.1 The beginning of Victor robot====

The Victor robot was designed to act in the flow of management of court proceedings, at the stage of evaluating and framing the selected theme of general repercussion. Due to the quality of the data at the beginning of this flow, a court staff member (without the help of ML) takes considerable time (close to 30 minutes) to locate and organize the relevant data to be read and interpreted in order to perform the classification. This statistic must be combined with another: approximately 400 new cases arrive at the beginning of this flow per business day. Together they reveal a serious management problem that absorbs close to 200 worker-hours per business day for organization and flow initiation (pre-activity).
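The workload figure follows directly from the two statistics above, and the arithmetic can be checked with a short sketch. The per-case time and daily volume come from the text; the 8-hour working day used to express the result as full-time staff is an assumption introduced here for illustration.

```python
MINUTES_PER_CASE = 30      # "close to 30 minutes" per case, from the text
NEW_CASES_PER_DAY = 400    # "approximately 400" new cases per business day, from the text
WORKDAY_HOURS = 8          # assumption: length of a working day, not stated in the paper

def daily_preactivity_hours(minutes_per_case: float, cases_per_day: float) -> float:
    """Worker-hours spent per business day locating and organising data."""
    return minutes_per_case * cases_per_day / 60

hours = daily_preactivity_hours(MINUTES_PER_CASE, NEW_CASES_PER_DAY)
staff_equivalent = hours / WORKDAY_HOURS   # full-time staff needed, under the assumed workday

print(f"{hours:.0f} worker-hours per business day")
print(f"about {staff_equivalent:.0f} full-time staff under the assumed {WORKDAY_HOURS}-hour day")
```

Under these inputs the pre-activity amounts to 200 worker-hours per business day, which matches the figure reported in the text.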
The quality of the initial text and image data is also highly variable, due to the large number of sources of cases addressed to the Supreme Court and the variation in electronic case systems and digitization capabilities.

Briefly, the project aims at the development and application of the newest Artificial Intelligence concepts and techniques, especially Machine Learning, to relevant needs in terms of processing, classification of procedural parts and classification of themes/classes in general repercussion management at the Supreme Court, and at supporting the decisions of the technical team. The objectives are to increase the speed of processing, increase the accuracy of the steps involved and optimize human resources so that they can perform activities more strategic to the Court.

The following stages of the Victor Project were planned:

# Preparation and structuring of the General Repercussions database for training of machine learning models.
# Evaluation of more efficient training algorithms and strategies for the context of General Repercussions, including deep artificial neural networks.
# Prototyping and training of the chosen algorithms, including their evaluation.
# Preparation of the communication architecture for real-time process classification, along with the interface for recording possible errors in model responses, including integration with the STF solution park.

Table 2.
Goals of R&D Victor
{| class="wikitable"
! Number !! Characteristic
|-
| 1 || Composition of research and development base
|-
| 2 || Preparation of General Repercussions (RG) database for analysis
|-
| 3 || Building an Optimal Ontology
|-
| 4 || Select rating methods
|-
| 5 || Optimize selected classifiers
|-
| 6 || Match classifiers for process ratings against RGs
|-
| 7 || End user interface
|-
| 8 || Evaluate inference methodology
|-
| 9 || Publication of Results
|-
| 10 || Create Bank of New General Repercussion Themes (TRGs)
|-
| 11 || Generate New TRG Classifier
|-
| 12 || Generate New PC (procedural parts) Classifier
|-
| 13 || Prototype training automation engine and use of new TRG classifiers
|-
| 14 || Detail STF Team TRG Classifier
|-
| 15 || Detail STF Team PC Classifier
|}

Human activity was mapped, the most recurrent themes of general repercussion were identified, and the data were divided for testing, modeling and comparison between the activity simulated by human legal experts and the machine. Thus, from a preprocessing model, a classifying system of 28 general repercussion classes is being developed, selected by association with a large number of similar court lawsuits. The various simulations indicate a high accuracy index.

====4.2 Some results of Victor robot====

All goals (Table 2) are under execution and improvement in an agile environment, and are in a test environment with some encouraging results:

Table 3. Stages of research steps (stages reported in the accountability reports of the Research Project at the University of Brasília)
{| class="wikitable"
! Number !! Step !! Stage
|-
| 1 || Preparation and structuring of the General Repercussions database for training of machine learning models. || Stage completed in full. The data from the lawsuits were collected from the STF database, preprocessed and entered into a structured database hosted on the research laboratory's servers.
|-
| 2 || Evaluation of algorithms and training strategies more efficient for the context of General Repercussions, including deep artificial neural networks. || Several hypotheses of uses of preprocessing methods and classifiers were investigated. In application to 28 themes of general repercussion, the best results were achieved using the XGBoost model, with an average accuracy (F1-Score) above 0.90.
|-
| 3 || Communication architecture preparation for real-time process classification, along with the interface for recording possible errors in model responses interactively, including integration with the STF solution park. || The RGs classification feature has been adapted to work within the STF (STF-Digital) technology park.
|}

Throughout the execution, there were changes in the way text is extracted from PDF files, which required reprocessing of the entire database. Once this extraction is completed, the machine learning remodeling process will begin.

===5 Conclusions===

The intention is thus to conduct intensive research in a limited time, optimizing the available resources, with instruments that allow problems to be identified and the necessary adjustments to be made. From the point of view of human relations, although the team members have different theoretical backgrounds, the intention is to keep the team united, stable and productive. It is therefore understood that the chosen methodology is justified, with its modifications to meet the research work plan.

General repercussion allows, in cases founded on the same controversy, one or more appeals that adequately and fully represent the leading cases to be selected for analysis by the Supreme Court, according to the Code of Civil Procedure; the effects of the decision on a single appeal will bind the many others on the same matter, which will bring greater effectiveness and celerity to jurisdictional activity.

The Victor Project, although very recent, has been presenting very interesting, strategic and relevant results in an attempt to reduce the processing time of court proceedings. As stated above, several hypotheses of uses of preprocessing methods and classifiers were investigated.
In application to 28 themes of general repercussion, the best results were achieved using the XGBoost model, with an average accuracy (F1-Score) above 0.90.

===References===

# Blight, Karin Johansson. Artificial Intelligence, AI biases and risks, and the need for AI-regulation and AI ethics: some examples. 2018. DOI: 10.13140/RG.2.2.23455.00160. https://www.researchgate.net/publication/326377798_Artificial_Intelligence_AI_biases_and_risks_and_the_need_for_AIregulation_and_AI_ethics_some_examples_17_Nov_2018. Accessed on 11/03/2019.
# Brundage, Miles, et al. Scaling Up Humanity: The Case for Conditional Optimism about Artificial Intelligence. In: EPRS, European Parliamentary Research Service. Should we fear artificial intelligence? European Parliament. 2018. http://www.europarl.europa.eu/RegData/etudes/IDAN/2018/614547/EPRS_IDA(2018)614547_EN.pdf. Accessed on 11/03/2019.
# Brunette, E.S.; Flemmer, R.C.; Flemmer, C.L. A review of artificial intelligence. School of Engineering and Advanced Technology of Massey University, New Zealand. 2009. DOI: 10.1109/ICARA.2000.4804025.
# Datta, Shoumen Palit Austin. The Elusive Quest for Intelligence in Artificial Intelligence. MIT Auto-ID Labs. Massachusetts Institute of Technology – MIT. https://dspace.mit.edu/bitstream/handle/1721.1/108000/Intelligence_AI.pdf?sequence= 11. Accessed on 07/02/2018.
# Davies, Jim; Francis Jr., Anthony G. The Role of Artificial Intelligence Research Methods in Cognitive Science. Institute of Cognitive Science, Carleton University. Ottawa. 2013. https://pdfs.semanticscholar.org/e159/ea04f23303091742e81a5ba25a4f62e40bc7.pdf. Accessed on 07/02/2018.
# Engle, Eric Allen. An Introduction to Artificial Intelligence and Legal Reasoning: Using xTalk to Model the Alien Tort Claims Act and Torture Victim Protection Act. Richmond Journal of Law & Technology. Volume XI, Issue 1. 2004. http://jolt.richmond.edu/jolt-archive/v11i1/article2.pdf. Accessed on 07/02/2018.
# Gray, Pamela. Artificial Legal Intelligence. Aldershot; Brookfield, USA. 1997.
# Jahanzaib, Shabbir; Anwer, Tarique. Artificial Intelligence and its Role Near Future. Journal of LaTeX Class Files, vol. 14, n. 8, August 2015. https://arxiv.org/pdf/1804.01396.pdf. Accessed on 25/02/2019.
# Khmelevsky, Youry; Li, Xitong; Madnick, Stuart. Software development using agile and scrum in distributed teams. Sloan School of Management. Massachusetts Institute of Technology – MIT. 2017. http://web.mit.edu/smadnick/www/wp/2017-02.pdf. Accessed on 07/02/2018.
# Kubovic, Ondrej; Kosinár, Peter; Jánosik, Juraj. Can Artificial Intelligence Power Future Malware? ESET White Paper. Enjoy Safer Technology. Available at https://www.welivesecurity.com/wp-content/uploads/2018/08/Can_AI_Power_Future_Malware.pdf. Accessed on 11/03/2019.
# Maalel, Ahmed; Hadj-Mabrouk, Habib. Contribution of Case Based Reasoning (CBR) in the Exploitation of Return of Experience. Application to Accident Scenarii in Railroad Transport. Cornell University Library. 2012. https://arxiv.org/pdf/1203.0656. Accessed on 07/02/2018.
# Maini, Vishal; Sabri, Samer. Machine Learning for Humans. Published August 19, 2017. Edited by Sachin Maini. https://everythingcomputerscience.com/books/Machine%20Learning%20for%20Humans.pdf. Accessed on 08/03/2019.
# Permana, Putu Adi Guna. Scrum Method Implementation in a Software Development Project Management. Bradford, UK: (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 6, No. 9, 2015, p. 198-204. ISSN: 2156-5570 (Online). DOI: 10.14569/issn.2156-5570.
# Prieto, Heloisa; Godfrey, John. O tempo tem histórias. In: https://otempotemhistorias.wordpress.com. Accessed on 07/02/2018.
# Richter, Michael M.; Aamodt, Agnar. Case-based reasoning foundations. The Knowledge Engineering Review, Vol. 20:3, 203–207. 2006, Cambridge University Press. DOI: 10.1017/S0269888906000695. Printed in the United Kingdom.
# Rissland, E.L.; Ashley, K.D.; Branting. Case-based reasoning and law. The Knowledge Engineering Review, Vol. 00:0, 1-4. 2005, Cambridge University Press. DOI: 10.1017/S0000000000000000000.
# Stern, Simon. Introduction: Artificial Intelligence, Technology and the Law. Pre-paper in the 8 University of Toronto Law Journal __ (2018). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3092887. Accessed on 07/02/2018.
# Tripp, John F.; Saltz, Jeffrey; Turk, Dan. Thoughts on Current and Future Research on Agile and Lean: Ensuring Relevance and Rigor. Proceedings of the 51st Hawaii International Conference on System Science. CC BY-NC-ND 4.0. 2018. p. 5465–5471. http://hdl.handle.net/10125/50570. ISBN: 978-0-9981331-1-9. Accessed on 07/02/2018.