Legal Artificial Intelligence - Have You Lost a Piece from Jigsaw Puzzle?

Changlong Sun,1 Yating Zhang,1 Qiong Zhang,1 Xiaozhong Liu2
1 Alibaba Group, Hangzhou, Zhejiang, China
2 Indiana University Bloomington, Bloomington, Indiana, USA
changlong.scl@taobao.com, ranran.zyt@alibaba-inc.com, liu237@indiana.edu, qz.zhang@alibaba-inc.com

Abstract

Legal artificial intelligence, as a special track in AI, is playing an increasingly important role in addressing different kinds of legal needs, and it offers vital potential to help clients, lawyers and judges access, understand, predict, and generate legal information in the context of legal domain knowledge. Legal AI, however, can be more challenging than other AI topics, and a comprehensive multi-view legal case representation, across different stages, can be essential for a number of downstream tasks, e.g., legal prediction, court debate mining, and legal QA/chatbot. In this paper, we explore the theoretical and methodological foundations, potentials and challenges to address this novel problem.

Introduction

Over the past few years, a number of domains, like text mining, computer vision, and autonomous driving, have reaped the benefits of embracing data-driven methods along with the emerging deep learning models. These approaches simplify systems while minimizing the potential for humans to introduce their own biases. More importantly, such enabling technologies have been commercialized to satisfy various kinds of needs from massive numbers of users. Legal domain AI, however, can be more challenging and bewildering than other text mining/NLP disciplines, and some studies have even expressed the concern that the exaggeration of AI in the legal area has backfired, and that machines should not step into this serious domain (Mills 2016). In this context, legal AI investigation can be critical, while such needs are both necessary and inevitable. For instance, according to a New York Times report,1 trial judges suffering from a "daunting workload" is becoming an increasingly critical issue, which challenges the efficiency of the legal justice ecosystem in different nations. According to reported statistics, the typical active federal district court judge closed around 250 cases in a year. Applying novel artificial legal intelligence techniques to facilitate the lawsuit process, so as to alleviate the overwhelming workload of judges, is therefore of great significance (OECD 2013).

In this study, we investigate the opportunities and challenges in legal AI from the case representation, learning model, and data availability perspectives. While a number of very recent studies are explored and strategically reviewed, a new learning framework, Joint Multi-Stage Case Representation Learning (JMCRL), is proposed to characterize the semantic, logic, and knowledge context of a legal case.

1 http://tiny.cc/tbo95y

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Recent Legal Deep-AI Efforts with Different Stages

Unlike most existing topics in machine learning, a legal case, or in other words a legal information need from a user, may go through different stages. To the best of our knowledge, few studies so far dynamically explore the case representation across different legal stages. Recently, however, deep learning has been successfully applied to a number of AI problems in each stage alone, which can be summarized as follows.

The case prior knowledge stage addresses the legal case/information-need context characterization problem, e.g., retrieving the relevant precedents, statutes and undisputed legal concepts from legal databases and characterizing the case context. The legal contextual information can provide essential information to represent the target case/need. For instance, (Li et al. 2018) locate the relevant statutes in the database, as the target case's context, by using a CNN plus a Correlation Matrix that can cope with ambiguity and variability problems.

In the pre-trial stage, the case indictment and evidence, from the plaintiff or defendant, can provide key information to predict the legal decisions. (Zhou et al. 2019), for instance, proposed a novel multi-task learning model to represent the case by using plaintiff (buyer), defendant (seller), and indictment (dispute) information in an e-commerce ecosystem. More importantly, the authors found that the legal knowledge (graph) can play an important role in case representation learning, e.g., an ablation test showed that a legal knowledge enhanced case representation can improve the model performance by 5%.

In the trial court stage, different parties, like the plaintiff, defendant, judge and lawyer, have the chance to change the sentence result and the associated case representations in a court debate context. While debate representation learning can be more challenging, (Duan et al. 2019) recently proposed a novel deep debate representation learning framework. As the most interesting finding, the authors showed that role information can be more important than legal knowledge and case global information for debate mining. Various types of information can all contribute to the learning tasks, e.g., debate summarization.

Figure 1: Example Dialog in Court Debate Dataset

Joint Multi-Stage Case Representation Learning (JMCRL)

In this paper, we propose a novel legal case representation learning framework that comprehensively integrates four different stages: the Case Prior Knowledge (learning) stage, the Pre-trial stage, the Trial Court stage, and the final Decision stage. Figure 1 depicts this model.

Legal case representation learning, compared with other types of learning tasks, can clearly be more challenging for the following reasons.

First, optimized legal data analytics/mining/prediction solutions may need to explore the understanding of their implications across different stages, and the case representation could change significantly when the same factor transfers from one stage to another. For instance, the 'contract assessment result' (of the target case) may change from the pre-trial stage to the trial court stage with the additional input from the plaintiff, and the prediction result could change correspondingly.

Second, multi-view learning should be used to encapsulate the heterogeneous information of the target case. Different factors, e.g., the information from different parties, can play different roles in the learning model, and a more sophisticated representation learning algorithm should be applied to address this challenge.

Third, different kinds of information should be projected into two kinds of representation spaces, the semantic space and the legal knowledge space. For instance, both (Duan et al. 2019) and (Zhou et al. 2019) found that the legal knowledge (graph) can be nontrivial for legal case representation, and that linguistic features, like word, sentence, utterance (in debate), and sequential information, should be projected into the legal knowledge space to enhance the representation accuracy.

Last but not least, we can further enhance the performance of the task(s) in the decision stage by leveraging multi-task learning. In a legal context, the final decision is highly likely to be associated with different sub-tasks, e.g., related articles prediction, penalty calculation, and reason generation. A multi-task framework can help the model better optimize the representation parameters while enabling communication among the tasks, which has been shown to be an effective means of enhancing legal AI tasks.

Unfortunately, a data barrier restricts JMCRL investigations and implementations. While case databases, court debate corpora, and legal knowledge graphs are increasingly available for legal AI research, no dataset interconnects them for cross-stage case representation learning. In the next step, we will work on this problem. Efforts will be made to create a novel dataset to enable future legal AI studies.

References

Duan, X.; Zhang, Y.; Yuan, L.; Zhou, X.; Liu, X.; Wang, T.; Wang, R.; Zhang, Q.; Sun, C.; and Wu, F. 2019. Legal summarization for multi-role debate dialogue via controversy focus mining and multi-task learning. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 1361–1370. ACM.

Li, C.; Ye, J.; Ge, J.; Kong, L.; Hu, H.; and Luo, B. 2018. A novel convolutional neural network for statutes recommendation. In Pacific Rim International Conference on Artificial Intelligence, 851–863. Springer.

Mills, M. 2016. Artificial intelligence in law: The state of play 2016. Thomson Reuters Legal Executive Institute.

OECD. 2013. What makes civil justice effective? OECD Economics Department Policy Notes (18).

Zhou, X.; Zhang, Y.; Liu, X.; Sun, C.; and Si, L. 2019. Legal intelligence for e-commerce: Multi-task learning by leveraging multiview dispute representation. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 315–324. ACM.
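
As a supplementary illustration of the cross-stage idea in the JMCRL discussion, the following is a minimal Python sketch, not the authors' implementation. It assumes that each stage contributes one feature vector (a "view"), that each view is linearly projected into a semantic space and a legal knowledge space, that the projections are simply added to a running case representation, and that each decision-stage sub-task is a linear head over the joint vector. All names (CaseRepresentation, project, score_tasks), the additive update, and the weight matrices are hypothetical; the paper does not specify any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]

# The four stages named in the paper; the identifiers are illustrative.
STAGES = ("prior_knowledge", "pre_trial", "trial_court", "decision")


def project(view: Vector, weights: Matrix) -> Vector:
    """Linear projection: one output value per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, view)) for row in weights]


@dataclass
class CaseRepresentation:
    """Running case representation kept in two spaces (assumed design)."""
    semantic: Vector
    knowledge: Vector
    stages_seen: List[str] = field(default_factory=list)

    def update(self, stage: str, view: Vector,
               w_sem: Matrix, w_know: Matrix) -> None:
        """Fold one stage's view into both spaces (additive update is an assumption)."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.semantic = [s + p for s, p in zip(self.semantic, project(view, w_sem))]
        self.knowledge = [k + p for k, p in zip(self.knowledge, project(view, w_know))]
        self.stages_seen.append(stage)

    def joint(self) -> Vector:
        """Concatenate the semantic and legal knowledge vectors."""
        return self.semantic + self.knowledge


def score_tasks(case: CaseRepresentation, heads: Dict[str, Vector]) -> Dict[str, float]:
    """Multi-task heads that share one joint representation (linear heads assumed)."""
    joint = case.joint()
    return {task: sum(w * x for w, x in zip(weights, joint))
            for task, weights in heads.items()}
```

In this sketch, calling update for the pre-trial stage and again for the trial court stage changes the shared representation, so every task head sees the new input; this mirrors the point that a prediction may change as a case moves between stages. A real system would replace the linear projections and the additive update with learned encoders.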