<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Legal Artificial Intelligence - Have You Lost a Piece from Jigsaw Puzzle?</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Alibaba Group</institution>
          ,
          <addr-line>Hangzhou, Zhejiang</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Changlong Sun</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Indiana University Bloomington</institution>
          ,
          <addr-line>Bloomington, Indiana</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Recent Legal Deep-AI Efforts with Different Stages</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Legal artificial intelligence, as a special track in AI, plays an increasingly important role in addressing different kinds of legal needs, and offers significant potential to help clients, lawyers, and judges access, understand, predict, and generate legal information in the context of legal domain knowledge. Legal AI, however, can be more challenging than other AI topics, and a comprehensive multi-view legal case representation, across different stages, can be essential for a number of downstream tasks, e.g., legal prediction, court debate mining, and legal QA/chatbot. In this paper, we explore the theoretical and methodological foundations, potential, and challenges of addressing this novel problem.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Over the past few years, a number of domains, such as text
mining, computer vision, and autonomous driving, have reaped the
benefits of embracing data-driven methods along with
emerging deep learning models. These approaches simplify
systems while minimizing the potential for humans to introduce
their own biases. More importantly, such enabling
technologies have been commercialized to satisfy the varied
needs of massive numbers of users. Legal AI, however, can
be more challenging and bewildering than other text
mining/NLP disciplines, and some studies have even expressed the
concern that exaggerated claims about AI in the legal area have backfired
and that machines should not step into this serious domain
        <xref ref-type="bibr" rid="ref3">(Mills 2016)</xref>
        . In this context, investigating legal AI remains critical,
since such needs are both necessary and inevitable. For
instance, according to a New York Times report, trial judges are
suffering from a “daunting workload”, an
increasingly critical issue that challenges the efficiency of the
legal justice ecosystem in different nations. According to
reported statistics, the typical active federal district court
judge closed around 250 cases in a year. Applying
novel legal AI techniques to facilitate the
lawsuit process, and thereby alleviate the overwhelming workload
of judges, is therefore of great significance
        <xref ref-type="bibr" rid="ref4">(OECD 2013)</xref>
        .
      </p>
      <p>In this study, we investigate the opportunities and
challenges in legal AI from the perspectives of case representation, learning models,
and data availability. While a number of very
recent studies are strategically reviewed, a new
learning framework, Joint Multi-Stage Case Representation
Learning (JMCRL), is proposed to characterize the
semantic, logical, and knowledge context of a legal case.
Unlike most existing topics in machine learning, a legal
case, or, put differently, a user's legal information need, may
pass through different stages. To the best of our knowledge,
few studies so far have dynamically explored case representation
across different legal stages. Recently, however, deep
learning has been successfully applied to a number
of AI problems within each stage alone, which can be
summarized as follows.</p>
      <p>
        The case prior knowledge stage addresses the problem of
characterizing the context of a legal case or information need,
e.g., retrieving the relevant precedents, statutes, and undisputed legal
concepts from legal databases. This legal contextual information
is essential for representing the target case or need.
For instance,
        <xref ref-type="bibr" rid="ref2">(Li et al. 2018)</xref>
        locate the relevant statutes in a
database, as the target case’s context, by using a CNN plus a
correlation matrix that can cope with ambiguity and variability
problems.
      </p>
      <p>
        In the pre-trial stage, the case indictment and the evidence
from the plaintiff or defendant can provide key information for
predicting legal decisions.
        <xref ref-type="bibr" rid="ref5">(Zhou et al. 2019)</xref>
        , for instance,
proposed a novel multi-task learning model that represents the
case by using plaintiff (buyer), defendant (seller), and
indictment (dispute) information in an e-commerce
ecosystem. More importantly, the authors found that legal knowledge
(graph) can play an important role in case representation
learning, e.g., an ablation test showed that legal-knowledge-enhanced
case representation can improve model
performance by 5%.
      </p>
      <p>
        In the trial court stage, different parties, such as the plaintiff,
defendant, judge, and lawyer, have the chance to change the
sentencing result and the associated case representations in a
court debate context. Although debate representation
learning can be more challenging,
        <xref ref-type="bibr" rid="ref1">(Duan et al. 2019)</xref>
        recently proposed a novel deep debate representation learning
framework. As the most interesting finding, the authors showed
that role information can be more important than legal
knowledge and case global information for debate mining. Various
types of information can all contribute to the learning tasks,
e.g., debate summarization.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Joint Multi-Stage Case Representation Learning (JMCRL)</title>
      <p>In this paper, we propose a novel legal case
representation learning framework that comprehensively integrates
four different stages: the case prior knowledge (learning)
stage, the pre-trial stage, the trial court stage, and the final
decision stage. Figure 1 depicts this model.</p>
      <p>Compared with other types of learning tasks, legal case
representation learning can clearly be more challenging,
for the following reasons.</p>
      <p>First, optimized legal data analytics/mining/prediction
solutions may need to understand their
implications across different stages, and the case representation
could change significantly when the same factor is transferred
from one stage to another. For instance, the ‘contract
assessment result’ (of the target case) may change from the pre-trial
stage to the trial court stage with additional input from the
plaintiff, and the prediction result could change correspondingly.</p>
      <p>Second, multi-view learning should be used to
encapsulate the heterogeneous information of the target case.
Different factors, e.g., the information from different parties, can
play different roles in the learning model, and a more
sophisticated representation learning algorithm should be applied
to address this challenge.</p>
      <p>
        Third, different kinds of information should be projected
into two kinds of representation spaces: a semantic space and a
legal knowledge space. For instance, both
        <xref ref-type="bibr" rid="ref1">(Duan et al. 2019)</xref>
        and
        <xref ref-type="bibr" rid="ref5">(Zhou et al. 2019)</xref>
        found that legal knowledge (graph) can
be nontrivial for legal case representation, and that
linguistic features, such as words, sentences, utterances (in debate), and
sequential information, should be projected into the legal
knowledge space to enhance representation accuracy.
      </p>
      <p>Last but not least, we can further enhance the performance
of the task(s) in the decision stage by leveraging multi-task
learning. In a legal context, the final decision is highly
likely to involve different sub-tasks, e.g., related article
prediction, penalty calculation, and reason generation. A
multi-task framework can help the model better optimize the
representation parameters while enabling communication
among the tasks, which has been shown to be an effective
means of enhancing legal AI tasks.</p>
      <p>Unfortunately, a data barrier restricts JMCRL
investigation and implementation. While case databases, court
debate corpora, and legal knowledge graphs are increasingly
available for legal AI research, no dataset interconnects
them for cross-stage case representation learning. As a next
step, we will work on this problem and make efforts
to create a novel dataset to enable future legal AI
studies.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Duan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Yuan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Legal summarization for multi-role debate dialogue via controversy focus mining and multi-task learning</article-title>
          .
          In
          <source>Proceedings of the 28th ACM International Conference on Information and Knowledge Management</source>
          ,
          <fpage>1361</fpage>
          -
          <lpage>1370</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ye</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ge</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kong</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>A novel convolutional neural network for statutes recommendation</article-title>
          .
          In
          <source>Pacific Rim International Conference on Artificial Intelligence</source>
          ,
          <fpage>851</fpage>
          -
          <lpage>863</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Mills</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Artificial intelligence in law: The state of play 2016</article-title>
          .
          <source>Thomson Reuters Legal Executive Institute</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <collab>OECD</collab>
          .
          <year>2013</year>
          .
          <article-title>What makes civil justice effective?</article-title>
          <source>OECD Economics Department Policy Notes</source>
          (
          <volume>18</volume>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Si</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Legal intelligence for e-commerce: Multi-task learning by leveraging multiview dispute representation</article-title>
          .
          In
          <source>Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <fpage>315</fpage>
          -
          <lpage>324</lpage>
          . ACM.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>