Legal Artificial Intelligence - Have You Lost a Piece from Jigsaw Puzzle?

Changlong Sun,¹ Yating Zhang,¹ Qiong Zhang,¹ Xiaozhong Liu²
¹ Alibaba Group, Hangzhou, Zhejiang, China
² Indiana University Bloomington, Bloomington, Indiana, USA
changlong.scl@taobao.com, ranran.zyt@alibaba-inc.com, liu237@indiana.edu, qz.zhang@alibaba-inc.com


Abstract

Legal artificial intelligence, as a special track in AI, is playing an increasingly important role in addressing different kinds of legal needs, and offers significant potential to help clients, lawyers, and judges access, understand, predict, and generate legal information in the context of legal domain knowledge. Legal AI, however, can be more challenging than other AI topics, and a comprehensive multi-view legal case representation, across different stages, can be essential for a number of downstream tasks, e.g., legal prediction, court debate mining, and legal QA/chatbot. In this paper, we explore the theoretical and methodological foundations, potentials, and challenges of addressing this novel problem.

Introduction

Over the past few years, a number of domains, like text mining, computer vision, and autonomous driving, have reaped the benefits of embracing data-driven methods along with emerging deep learning models. These approaches simplify systems while minimizing the potential for humans to introduce their own biases. More importantly, such enabling technologies have been commercialized to satisfy the various needs of massive numbers of users. Legal domain AI, however, can be more challenging and bewildering than other text mining/NLP disciplines, and some studies have even expressed concern that the exaggeration of AI in the legal area has backfired, and that machines should not step into this serious domain (Mills 2016). In this context, legal AI investigation can be critical, as such needs are both necessary and inevitable. For instance, according to a New York Times report, trial judges are suffering from a "daunting workload",¹ an increasingly critical issue that challenges the efficiency of the legal justice ecosystem in different nations. According to reported statistics, the typical active federal district court judge closed around 250 cases in a year. Applying novel artificial legal intelligence techniques to facilitate the lawsuit process, and so alleviate the overwhelming workload of judges, is therefore of great significance (OECD 2013).

In this study, we investigate the opportunities and challenges in legal AI from the case representation, learning model, and data availability perspectives. While a number of very recent studies are explored and strategically reviewed, a new learning framework, Joint Multi-Stage Case Representation Learning (JMCRL), is proposed to characterize the semantic, logic, and knowledge context of a legal case.

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
¹ http://tiny.cc/tbo95y

Recent Legal Deep-AI Efforts with Different Stages

Unlike most existing topics in machine learning, a legal case, that is, a legal information need from a user, may pass through different stages. To the best of our knowledge, few studies so far dynamically explore the case representation across different legal stages. Recently, however, deep learning has been successfully applied to a number of AI problems in each stage alone, which can be summarized as follows.

The case prior knowledge stage addresses the problem of characterizing the context of a legal case or information need, e.g., retrieving the relevant precedents, statutes, and undisputed legal concepts from legal databases to characterize the case context. This legal contextual information can provide essential information to represent the target case/need. For instance, (Li et al. 2018) locate the relevant statutes in a database, as the target case's context, by using a CNN plus a correlation matrix that can cope with ambiguity and variability problems.

In the pre-trial stage, the case indictment and evidence, from the plaintiff or defendant, can provide key information to predict the legal decisions. (Zhou et al. 2019), for instance, proposed a novel multi-task learning model to represent the case by using plaintiff (buyer), defendant (seller), and indictment (dispute) information in an e-commerce ecosystem. More importantly, the authors found that the legal knowledge (graph) can play an important role in case representation learning, e.g., an ablation test showed that a legal knowledge enhanced case representation can improve the model performance by 5%.

In the trial court stage, different parties, like the plaintiff, defendant, judge, and lawyer, have the chance to change the sentence result and the associated case representations in a court debate context. Although debate representation learning can be more challenging, (Duan et al. 2019) recently proposed a novel deep debate representation learning
                                        Figure 1: Example Dialog in Court Debate Dataset
framework. As the most interesting finding, the authors showed that role information can be more important than legal knowledge and case global information for debate mining. Various types of information can all contribute to the learning tasks, e.g., debate summarization.

Joint Multi-Stage Case Representation Learning (JMCRL)

In the paper, we propose a novel legal case representation learning framework by comprehensively integrating four different stages: Case Prior Knowledge (learning) stage, Pre-trial stage, Trial Court stage, and final Decision stage. Figure 1 depicts this model.

Compared with other types of learning tasks, legal case representation learning can be more challenging, for the following reasons.

First, optimized legal data analytics, mining, and prediction solutions may need to account for their implications across different stages, and the case representation could change significantly when the same factor is transferred from one stage to another. For instance, the 'contract assessment result' (of the target case) may change from the pre-trial stage to the trial court stage with additional input from the plaintiff, and the prediction result could change correspondingly.

Second, multi-view learning should be used to encapsulate the heterogeneous information of the target case. Different factors, e.g., the information from different parties, can play different roles in the learning model, and a more sophisticated representation learning algorithm should be applied to address this challenge.

Third, different kinds of information should be projected into two kinds of representation spaces: the semantic space and the legal knowledge space. For instance, both (Duan et al. 2019) and (Zhou et al. 2019) found that legal knowledge (graph) can be nontrivial for legal case representation, and linguistic features, like word, sentence, utterance (in debate), and sequential information, should be projected into the legal knowledge space to enhance the representation accuracy.

Last but not least, we can further enhance the performance of the task(s) in the decision stage by leveraging multi-task learning. In a legal context, the final decision is very likely to involve several sub-tasks, e.g., related article prediction, penalty calculation, and reason generation. A multi-task framework can help the model better optimize the representation parameters while enabling communication among the tasks, which has been shown to be an effective means of enhancing legal AI tasks.

Unfortunately, data barriers restrict JMCRL investigation and implementation. While case databases, court debate corpora, and legal knowledge graphs are increasingly available for legal AI research, no dataset interconnects them for cross-stage case representation learning. As a next step, we will work on this problem and make efforts to create a novel dataset to enable future legal AI studies.

References

Duan, X.; Zhang, Y.; Yuan, L.; Zhou, X.; Liu, X.; Wang, T.; Wang, R.; Zhang, Q.; Sun, C.; and Wu, F. 2019. Legal summarization for multi-role debate dialogue via controversy focus mining and multi-task learning. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 1361–1370. ACM.

Li, C.; Ye, J.; Ge, J.; Kong, L.; Hu, H.; and Luo, B. 2018. A novel convolutional neural network for statutes recommendation. In Pacific Rim International Conference on Artificial Intelligence, 851–863. Springer.

Mills, M. 2016. Artificial intelligence in law: The state of play 2016. Thomson Reuters Legal Executive Institute.

OECD. 2013. What makes civil justice effective? OECD Economics Department Policy Notes (18).

Zhou, X.; Zhang, Y.; Liu, X.; Sun, C.; and Si, L. 2019. Legal intelligence for e-commerce: Multi-task learning by leveraging multiview dispute representation. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 315–324. ACM.