Applying Argument Extraction to Improve Legal Information Retrieval

                                           Kevin D. Ashley
                                University of Pittsburgh School of Law
                                Pittsburgh, Pennsylvania, USA 15260
                                        ashley@pitt.edu


                     Abstract                           legal practice. A primary reason for this is the
                                                        well-known bottleneck in representing knowledge
     Argument extraction techniques can likely          from the legal texts (e.g., statutes, regulations, and
     improve legal information retrieval. Any           cases) that play such an important role in legal
     effort to achieve that goal should take            practice in a form so that the computational
     into account key features of legal reason-         implementations can reason with them.
     ing such as the importance of legal rules             Meanwhile, legal information retrieval systems
     and concepts, support and attack relations         have proven to be highly functional. They pro-
     among claims, and citation of authoritative        vide legal practitioners with convenient access
     sources. Annotation types reflecting these         to millions of legal texts without relying on ar-
     key features will help identify the roles of       gument models or schemes, relying instead on
     textual elements in retrieved legal cases in       Bayesian statistical inference based on term fre-
     order to better inform assessments of rele-        quency. Users of legal information systems can
     vance for users’ queries. As a result, legal       submit queries in the form of a natural language
     argument models and argument schemes               description of a desired fact pattern and retrieve
     will likely play a central part in the text        numerous relevant cases.
     annotation type system.
                                                           Useful as they are, however, legal information
                                                        retrieval systems do not provide all of the func-
1    Introduction
                                                        tionality that practitioners could employ. What
With improved prospects for automatically ex-           IR system users often want “is not merely IR,
tracting arguments from text, we are investigat-        but AR”, that is, “argument retrieval: not merely
ing whether and how argument extraction can im-         sentences with highlighted terms, but arguments
prove legal information retrieval (IR). An immedi-      and argument-related information. For example,
ate question in that regard is the role that argument   users want to know what legal or factual issues the
models and argument schemes will play in achiev-        court decided, what evidence it considered rele-
ing this goal.                                          vant, what outcomes it reached, and what reasons
   For some time, researchers in Artificial Intelli-    it gave.” (Ashley and Walker, 2013a).
gence and Law have developed argument models,              Recently, IBM announced its Debater project,
formal and dialectical process models to describe       an argument construction engine which, given a
arguments and their relations. They have also           corpus of unstructured text like Wikipedia, can au-
implemented these models in computer programs           tomatically construct a set of relevant pro/con ar-
that construct legal arguments. Some of these           guments phrased in natural language. Built upon
models employ argument schemes to provide se-           the foundation of IBM’s Jeopardy-game-winning
mantics and describe reasonable arguments. Each         Watson question answering system, the advent of
scheme corresponds to a typical domain-specific         Debater raises some interesting related questions.
inference sanctioned by the argument, a kind of         A central hypothesis of the Watson project was
prima facie reason for believing the argument’s         to answer questions based on shallow syntactic
conclusion. See (Prakken, 2005, p. 234).                knowledge and its implied semantics. This was
   By and large, however, these argument models         preferred to formally represented deep semantic
and schemes and their computational implementa-         knowledge, the acquisition of which is difficult
tions have not had much of a practical effect on        and expensive (Fan et al., 2012). If Debater is
applied to legal domains (see, e.g., (Beck, 2014)),        Factors, stereotypical fact patterns that
one wonders to what extent the same will be true        strengthen or weaken a side’s argument in a legal
of Debater. In particular, to what extent will ex-      claim, have been identified in text automatically.
plicit argumentation models and their schemes for       Using a HYPO-style CBR program and an IR
the legal domain be necessary or useful for the ef-     system relevance feedback module, the SPIRE
fort to extract legal arguments? And, can tech-         program retrieved legal cases from a text corpus
niques in Debater be adapted to improve legal IR?       and highlighted passages relevant to bankruptcy
                                                        law factors (Daniels and Rissland, 1997). The
2   Related Work                                        SMILE+IBP program learned to classify case
                                                        summaries in terms of applicable trade secret
The seminal work on extracting arguments and            law factors (Ashley and Brüninghaus, 2009),
argument-related information from legal case de-        analyzed automatically classified squibs of new
cisions is (Mochales and Moens, 2011). Opera-           cases, predicted outcomes, and explained the
tionally, the authors defined an argument as “a set     predictions. (Wyner and Peters, 2010) presents a
of propositions, all of which are premises except,      scheme for annotating 39 trade secret case texts
at most, one, which is a conclusion. Any argument       with GATE in terms of finer grained components
follows an argumentation scheme. . . .” Using ma-       (i.e., factoroids) of a selection of factors.
chine learning based on manually classified sen-           Using an argument model to assist in represent-
tences from the Araucaria corpus, including court       ing cases for conceptual legal information retrieval
reports, they achieved good performance on clas-        was explored in (Dick and Hirst, 1991). More re-
sifying sentences as propositions in arguments or       cently, other researchers have addressed automatic
not and classifying argumentative propositions as       semantic processing of case decision texts for le-
premises or conclusions. Given a limited set of         gal IR, achieving some success in automatically:
documents, their manually-constructed rule-based
argument grammar also generated argument tree             • assigning rhetorical roles to case sentences
structures (Mochales and Moens, 2011).                      based on 200 manually annotated Indian de-
   In identifying argumentative propositions,               cisions (Saravanan and Ravindran, 2010),
Mochales and Moens achieved accuracies of 73%
and 80% on two corpora, employing domain-                 • categorizing legal cases by abstract West-
general features (including, e.g., each word, pairs         law categories (e.g., bankruptcy, finance and
of words, pairs and triples of successive words,            banking) (Thompson, 2001) or general top-
parts of speech including adverbs, verbs, modal             ics (e.g., exceptional services pension, retire-
auxiliaries, punctuation, keywords indicating               ment) (Gonçalves and Quaresma, 2005),
argumentation, parse tree depth and number of
subclauses, and certain text statistics.) For classi-     • extracting treatment history (e.g., “affirmed”,
fying argumentative propositions as premises or             “reversed in part”) (Jackson et al., 2003),
conclusions, their features included the sentence’s
length and position in the document, tense and            • determining the role of a sentence in the legal
type of main verb, previous and successive                  case (e.g., as describing the applicable law or
sentences’ categories, a preprocessing classifi-            the facts) (Hachey and Grover, 2006),
cation as argumentative or not, and the type of
rhetorical patterns occurring in the sentence and         • extracting offenses raised and legal principles
surrounding sentences (i.e., Support, Against,              applied from criminal cases to generate sum-
Conclusion, Other or None). Additional features,            maries (Uyttendaele et al., 1998),
more particular to the legal domain included
whether the sentence referred to or defined a legal       • extracting case holdings (McCarty, 2007),
article, the presence of certain argumentative              and
patterns (e.g. “see”, “mutatis mutandis”, “having
reached this conclusion”, “by a majority”) and            • extracting argument schemes from the Arau-
whether the agent of the sentence is the plaintiff,         caria corpus such as argument from example
the defendant, the court or other (Mochales and             and argument from cause to effect (Feng and
Moens, 2011).                                               Hirst, 2011).
   We aim to develop and evaluate an integrated           (7) “constructed a demo speech with top claim
approach using both semantic and pragmatic (con-       predictions”, and
textual) information to retrieve arguments from le-       (8) was then “ready to deliver!”
gal texts in order to improve legal information re-       Figure 1 shows an argument diagram con-
trieval. We are working with an underlying ar-         structed manually from the video recording of De-
gumentation model and its schemes, the Default         bater’s oral output for the example topic.
Logic Framework (DLF), and a corpus of U.S.
Federal Claims Court cases (Walker et al., 2011;       3     Key Elements of Legal Argument
Walker et al., 2014; Ashley and Walker, 2013a).
Like (Mochales and Moens, 2011) and (Sergeant,         Debater’s argument regarding banning violent
2013), we plan to:                                     video games is meaningful but compare it to the
                                                       legal argument concerning a similar topic in Fig-
  1. Train an annotator to automatically identify      ure 2. The Court in Video Software Dealers As-
     propositions in unseen legal case texts,          soc. v. Schwarzenegger, 556 F. 3d 950 (9th
                                                       Cir. 2009), addressed the issue of whether Cali-
  2. Distinguish argumentative from non-               fornia (CA) Civil Code sections 1746-1746.5 (the
     argumentative propositions and classify them      “Act”), which restrict sale or rental of “violent
     as premises or conclusions,                       video games” to minors, were unconstitutional un-
                                                       der the 1st and 14th Amendments of the U.S. Con-
  3. Employ rule-based or machine learning mod-
                                                       stitution. The Court held the Act unconstitutional.
     els to construct argument trees from unseen
                                                       As a presumptively invalid content-based restric-
     cases based on a manually annotated training
                                                       tion on speech, the Act is subject to strict scrutiny
     corpus, but also to
                                                       and the State has not demonstrated a compelling
  4. Use argument trees to improve legal informa-      interest.
     tion retrieval reflecting the uses of proposi-        In particular, the Court held that CA had not
     tions in arguments.                               demonstrated a compelling government interest
                                                       that “the sale of violent video games to minors
   Before sketching our approach for the legal         should be banned.” Figure 2 shows excerpts from
domain, however, we note that IBM appears to           the portion of the opinion in which the Court jus-
have developed more domain independent tech-           tifies this conclusion. The nodes contain propo-
niques for identifying propositions in documents       sitions from that portion and the arcs reflect the
and classifying them as premises in its Debater        explicit or implied relations among those proposi-
system.1                                               tions based on a fair reading of the text.
   On any topic, the Debater’s task is to “detect          The callout boxes in Figure 2 highlight some
relevant claims” and return its “top predictions for   key features of legal argument illustrated in the
pro claims and con claims.” On inputting the topic,    Court’s argument:
“The sale of violent videogames to minors should
be banned,” for example, Debater:                          1. Legal rules and concepts govern a court’s de-
   (1) scanned 4 million Wikipedia articles,                  cision of an issue.
   (2) returned the 10 most relevant articles,
   (3) scanned the 3000 sentences in those 10 arti-        2. Standards of proof govern a court’s assess-
cles,                                                         ment of evidence.
   (4) detected those sentences that contained
                                                           3. Claims have support / attack relations.
“candidate claims”,
   (5) “identified borders of candidate claims”,           4. Authorities are cited (e.g., cases, statutes).
   (6) “assessed pro and con polarity of candidate
claims”,                                                   5. Attribution information signals or affects
   1
    See, e.g., http://finance.yahoo.com/blogs/                judgments about belief in an argument (e.g.,
the-exchange/ibm-unveils-a-computer-                          “the State relies”).
than-can-argue-181228620.html. A demo ap-
pears at the 45 minute mark: http://io9.com/ibms-
watson-can-now-debate-its-opponents-                       6. Candidate claims in a legal document have
1571837847.                                                   different plausibility.
   [Figure 1 content, reconstructed from the diagram text]

   Topic: The sale of violent videogames to minors should be banned.

   Pro: Exposure to violent videogames results in increased physiological arousal, aggression-related thoughts and feelings, as well as decreased pro-social behavior.
   Con: On the other hand, I would like to note the following claims that oppose the topic. Violence in videogames is not causally linked with aggressive tendencies.

   Pro: In addition these violent games or lyrics actually cause adolescents to commit acts of real life aggression.
   Con: In addition, most children who play violent videogames do not have problems.

   Pro: Finally, violent video games can increase children’s aggression.
   Con: Finally, video game play is part of an adolescent boy’s normal social setting.

     Figure 1: Argument Diagram of IBM Debater’s Output for Violent Video Games Topic (root node)


   Although the argument diagrams in Figures 1                              “Special Masters” concerning whether claimants’
and 2 address nearly the same topic and share sim-                          compensation claims comply with the require-
ilar propositions, the former obviously lacks these                         ments of a federal statute establishing the National
features that would be important in legal argument                          Vaccine Injury Compensation Program. Under the
(and, as argued later, important in using extracted                         Act, a claimant may obtain compensation if and
arguments to improve legal IR). Of course, on one                           only if the vaccine caused the injury.
level this is not surprising; the Debater argument                             In order to establish causation under the rule
is not and does not purport to be a legal argument.                         of Althen v. Secr. of Health and Human Ser-
   On the other hand, given the possibility of ap-                          vices, 418 F.3d 1274 (Fed.Cir. 2005), the peti-
plying Debater to legal applications and argumen-                           tioner must establish by a preponderance of the
tation, it would seem essential that it be able to                          evidence that: (1) a “medical theory causally con-
extract such key information. In that case, the                             nects” the type of vaccine with the type of injury,
question is the extent to which explicit argument                           (2) there was a “logical sequence of cause and ef-
models and argument schemes of legal reasoning                              fect” between the particular vaccination and the
would be useful in order to assist with the extrac-                         particular injury, and (3) a “proximate temporal
tion of the concepts, relationships, and informa-                           relationship” existed between the vaccination and
tion enumerated above and illustrated in Figure 2.                          the injury. Walker’s corpus comprises all deci-
                                                                            sions in a 2-year period applying the Althen test of
4      Default-Logic Framework                                              causation-in-fact (35 decision texts, 15-40 pages
Vern Walker’s Default Logic Framework (DLF)                                 per decision). In these cases, the Special Masters
is an argument model plus schemes for evidence-                             decide which evidence is relevant to which issues
based legal arguments concerning compliance                                 of fact, evaluate the plausibility of evidence in the
with legal rules. At the Research Laboratory for                            legal record, organize evidence and draw reason-
Law, Logic and Technology (LLT Lab) at Hofs-                                able inferences, and make findings of fact.
tra University, researchers have applied the DLF to                          The DLF model of a single case “integrates nu-
model legal decisions by Court of Federal Claims                            merous units of reasoning” each “consisting of one
   [Figure 2 callout labels: 1. rule and legal concepts; 2. standard of proof; 3. support / attack relations; 5. attribution info; 6. plausibility]

   Figure 2: Diagram Representing Realistic Legal Argument Involving Violent Video Games Topic


conclusion and one or more immediately support-        terarguments, (4) citation to the statute, 42 USC
ing reasons (premises)” and employing four types       300aa-11(c)(1)(C)(ii), and to the Althen and Shy-
of connectives (min (and), max (or), evidence fac-     face case authorities, (5) some attribution informa-
tors, and rebut) (Walker et al., 2014). For example,   tion that signals judgments about the Special Mas-
Figure 3 shows an argument diagram representing        ter’s belief in an argument (e.g., “Dr. Kinsbourne
the excerpt of the DLF model of the special            and Dr. Kohrman agree”), and (6) four factors that
master’s finding in the case of Cusati v. Secretary    increase plausibility of the claim of causation.
of Health and Human Services, No. 99-0492V
(Office of Special Masters, United States Court        5   Legal Argument and Legal IR
of Federal Claims, September 22, 2005) concern-
                                                       Legal decisions contain propositions and argu-
ing whether the first Althen condition for showing
                                                       ments about how to “prove” them. Prior cases provide
causation-in-fact is satisfied.
                                                       examples of how to make particular arguments in
   The main point is that the DLF model of a le-       support of similar hypotheses and of kinds of ar-
gal argument and its argument schemes represent        guments that have succeeded, or failed, in the past.
the above-enumerated key features of legal argu-       Consider a simple query discussed in (Ashley and
ment. As illustrated in the callout boxes of Figure    Walker, 2013a): Q1: “MMR vaccine can cause in-
3, the model indicates: (1) the 1st Althen rule and    tractable seizure disorder and death.”
causation-in-fact concept that govern the decision        An attorney/user in a new case where an injury
of the causation issue, (2) the preponderance of ev-   followed an MMR vaccination might employ this
idence standard of proof governing the court’s as-     query to search for cases where such propositions
sessment, (3) support relations among the proposi-     had been addressed. Relevant cases would add
tions, the Special Master having recorded no coun-     confidence that the propositions and accompany-
   [Figure 3 content, reconstructed from the diagram text]

   Callouts: 1. rule and legal concepts; 2. standard of proof; 3. support / attack relations (no attacks here); 4. citation of authorities; 5. attribution info; 6. plausibility.

   Nodes, roughly from the legal rule-oriented side (left) to the evidentiary factors-oriented side (right):

   OR [2 of 2] : OFF-TABLE INJURY: The "causation-in-fact" condition is satisfied (Althen, 418 F.3d at 1278, 1281).

   AND [1 of 2] : The injury of Eric Fernandez "was [or were] caused by" the MMR vaccine received in the vaccination on November 5, 1996 (42 USC 300aa-11(c)(1)(C)(ii)).

   The MMR vaccine was "not only a but-for cause" of an intractable seizure disorder and death, "but also a substantial factor in bringing about" an intractable seizure disorder and death (Shyface, 165 F.3d at 1352-53; Althen, 418 F.3d at 1278).

   AND [1 of 3] : (1) A “medical theory causally connect[s]” the vaccination on 11-5-96 and an intractable seizure disorder and death (Althen, 418 F.3d at 1278).

   [Q1] "MMR vaccine causes fever" and "fever causes seizures." "Ms. Cusati has provided more than preponderant evidence".

   FACTOR [1 of 4] : "MMR vaccine causes fever." Dr. Kinsbourne and Dr. Kohrman agree that MMR vaccine causes fever.

   FACTOR [2 of 4] : "[F]ever causes seizures." Dr. Kinsbourne and Dr. Kohrman agree that fever causes seizures.

   FACTOR [3 of 4] : "[A] child who suffers a complex febrile seizure has a greater chance of developing epilepsy.”

   FACTOR [4 of 4] : "[T]he medical literature ... do[es] not assist the special master in evaluating Ms. Cusati's 'legal cause' claim."


 Figure 3: Diagram of DLF Model of Special Master’s Finding in Cusati Case re 1st Althen Condition
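
The four DLF connectives named in the text (min, max, evidence factors, and rebut) can be illustrated with a small sketch. This is not the LLT Lab's implementation. It assumes a three-valued scale (false < undecided < true), and every name in it (Truth, dlf_min, dlf_max, dlf_rebut, dlf_evidence_factors) is hypothetical. The treatment of evidence factors as a recorded finding, rather than a computed value, is likewise an assumption made for illustration.

```python
from enum import IntEnum
from typing import Iterable, Sequence


class Truth(IntEnum):
    """Three-valued scale; the ordering is an assumption of this sketch."""
    FALSE = 0
    UNDECIDED = 1
    TRUE = 2


def dlf_min(premises: Iterable[Truth]) -> Truth:
    """MIN (and): the conclusion is only as strong as its weakest premise."""
    return min(premises)


def dlf_max(premises: Iterable[Truth]) -> Truth:
    """MAX (or): the conclusion is as strong as its strongest premise."""
    return max(premises)


def dlf_rebut(conclusion: Truth, defeater: Truth) -> Truth:
    """REBUT: a defeater found true makes the conclusion false;
    otherwise the conclusion keeps its value."""
    return Truth.FALSE if defeater == Truth.TRUE else conclusion


def dlf_evidence_factors(factors: Sequence[str], finding: Truth) -> Truth:
    """EVIDENCE FACTORS: the factors are listed as relevant, but no formula
    combines them here; the factfinder's finding is simply recorded."""
    if not factors:
        raise ValueError("an evidence-factor node needs at least one factor")
    return finding


# Factors paraphrased from Figure 3 (first Althen condition in Cusati).
cusati_factors = [
    "MMR vaccine causes fever",
    "fever causes seizures",
    "complex febrile seizure raises the chance of developing epilepsy",
    "medical literature does not assist in evaluating the legal cause claim",
]

# Recorded as TRUE, following the Special Master's finding shown in Figure 3.
althen_1 = dlf_evidence_factors(cusati_factors, Truth.TRUE)

# The second and third Althen conditions are outside the excerpt, so they
# are left UNDECIDED here as placeholders, not as findings about the case.
althen_2 = Truth.UNDECIDED
althen_3 = Truth.UNDECIDED

causation_in_fact = dlf_min([althen_1, althen_2, althen_3])
```

With the placeholders above, causation_in_fact stays UNDECIDED until the remaining two conditions are assigned values, which mirrors how a MIN node depends on all of its premises.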


ing arguments were reasonable and had been suc-                                                                                           client sustained seizures after receiving the MMR
cessful.                                                                                                                                  vaccine probably knows that he/she will have to
   Importantly, the cases retrieved will be more                                                                                          satisfy a requirement of causation. The attorney
relevant to the extent that the proposition is used in                                                                                    may not know, however, what legal standard de-
a similar argument. That is, they will be more rel-                                                                                       fines the relevant concept of causation or what
evant to the extent that the proposition plays roles                                                                                      legal authority may be cited as an authoritative
in the case arguments similar to the role in which                                                                                        source of the standard. In that situation, retrieved
the attorney intends to use it in an argument about                                                                                       cases will likely be more relevant to the extent that
the current case.                                                                                                                         they fill in the legal rule-oriented direction,
                                                                                                                                          relative to a proposition similar to the one marked
   An argument diagram like that of Figure 3 can
                                                                                                                                          “Q1”, with legal rules about the concept of causa-
illustrate the effect of the six key elements of le-
                                                                                                                                          tion and citations to their authoritative sources.
gal reasoning illustrated above on how relevant a
retrieved case is to a user’s query. The diagram                                                                                             If the attorney is unsure of the kinds of evidence
shows a legal argument in which the proposition                                                                                           that an advocate should employ in convincing a
corresponding to Q1 plays a role in the Cusati case                                                                                       Special Master to make the finding of fact on cau-
as an evidence-based finding of the Special Mas-                                                                                          sation or of the relevant standard of proof for as-
ter, namely, that “MMR vaccine causes fever” and                                                                                          sessing that evidence of causation, retrieved cases
“fever causes seizures.”                                                                                                                  will be more relevant to the extent that they fill in
   Such diagrams have a “legal rule-oriented” di-                                                                                         the evidentiary factors-oriented direction, relative
rection (i.e., to the left in Figure 3) and an “eviden-                                                                                   to a proposition similar to the one marked “Q1”,
tiary factors-oriented” direction (i.e., to the right                                                                                     with evidentiary factors and an identification of
in this diagram). For instance, an attorney whose                                                                                         the standard of proof.
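   As an illustration only, the two directions can be read as a simple re-ranking
criterion: a retrieved case scores higher the more of the desired elements it supplies
for a proposition similar to the query proposition. The following Python sketch is not
a method specified in this paper. The names RetrievedCase, ELEMENTS_BY_DIRECTION,
relevance, and rerank, the element labels, and the coverage score are all assumptions
introduced here, and the two example cases are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the two directions of an argument diagram.
RULE_ORIENTED = "rule-oriented"
EVIDENCE_ORIENTED = "evidentiary-factors-oriented"

# Assumed mapping from each direction to the kinds of elements that would
# "fill in" that direction relative to the query proposition.
ELEMENTS_BY_DIRECTION = {
    RULE_ORIENTED: {"legal_rule", "citation"},
    EVIDENCE_ORIENTED: {"evidentiary_factor", "standard_of_proof"},
}


@dataclass
class RetrievedCase:
    name: str
    # Element kinds that an upstream annotator (assumed to exist) found
    # attached to a proposition similar to the user's query proposition.
    elements: set = field(default_factory=set)


def relevance(case: RetrievedCase, direction: str) -> float:
    """Fraction of the wanted element kinds that the case supplies."""
    wanted = ELEMENTS_BY_DIRECTION[direction]
    return len(wanted & case.elements) / len(wanted)


def rerank(cases, direction):
    """Order cases from most to least relevant for the given direction."""
    return sorted(cases, key=lambda c: relevance(c, direction), reverse=True)


# Hypothetical example: an attorney who needs the legal standard of causation
# asks for the rule-oriented direction.
cases = [
    RetrievedCase("Case A", {"evidentiary_factor"}),
    RetrievedCase("Case B", {"legal_rule", "citation"}),
]
ranked = rerank(cases, RULE_ORIENTED)
print([c.name for c in ranked])  # ['Case B', 'Case A']
```

   A coverage fraction is only one possible scoring choice; a working system would
presumably combine it with the conventional retrieval score and with how closely the
matched proposition resembles the query proposition.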
   The attorney may be interested in better understanding how to improve the
plausibility of a proposition about causation as an evidence-based finding. Cases will
be more relevant to the extent that they contain evidentiary factors that support such
a finding. An attorney interested in attacking the plausibility of the evidence-based
finding might be especially interested in seeing cases involving examples of
evidentiary factors that attack such a finding.
   Finally, the cases will be more relevant to the extent that the proposition similar
to the one marked “Q1” concerning MMR vaccine’s causing injury is attributable to the
Special Master as opposed merely to some expert witness’s statement.

6   Specifying/Determining Propositions’ Argument Roles

The importance of a proposition’s argument role in matching retrieved cases to users’
queries raises two questions: (1) How does the user specify the target propositions and
the argumentative roles in which he/she is interested? (2) How does a program determine
the roles that propositions play in retrieved case arguments?
   An argument diagram like that of Figure 3 may play a role in enabling users to
specify the arguments and propositions in which they are interested. One can imagine a
user’s inputting a query by employing a more abstract version of such a diagram. For
instance, in the Query Input Diagram of Figure 4, the nodes are labeled with, or refer
to, argument roles. These roles include:

Legal Rule: sentences that state a legal rule in the abstract, without applying the
    rule to the particular case being litigated

Ruling/Holding: sentences that apply a legal rule to decide issues presented in the
    particular case being litigated

Evidence-Based Finding: sentences that report a trier-of-fact’s ultimate findings
    regarding facts material to the particular case being litigated

Evidence-Based Reasoning: sentences that report the trier-of-fact’s reasoning in
    assessing the relevant evidence and reaching findings regarding facts material to
    the particular case being litigated (e.g., evidentiary factors)

Evidence: sentences that describe any type of evidence legally produced in the
    particular case being litigated, as part of the proof intended to persuade the
    trier-of-fact of alleged facts material to the case (e.g., oral testimony of
    witnesses, including experts on technical matters; documents, public records,
    depositions; objects and photographs)

Citation: sentences that credit and refer to authoritative documents and sources
    (e.g., court decisions (cases), statutes, regulations, government documents,
    treaties, scholarly writing, evidentiary documents)

   In the “text”, “concept”, and “citation” slots of the appropriate nodes of the query
input diagram, Figure 4, users could specify the propositions, concepts, or citations
that they know or assume and check the targeted nodes in the directions (rule-oriented
or evidentiary-factors-oriented) or ranges that they hope to fill through searching for
cases whose texts satisfy the diagram’s argument-related constraints. In effect, the
diagram will guide the IR system in ranking the retrieved cases for relevance and in
highlighting their relevant parts.

Figure 4: Sample Query Input Diagram. [Diagram not reproduced. Its nodes are labeled
Primary Legal Rules, Secondary Legal Rules, Ruling/Holding, Evidence-Based Finding,
Evidence-Based Reasoning, and Evidence; each has a “concepts:” slot and some have a
“citations:” slot. Several nodes list the concept “causation”, one carries the text
“MMR vaccine can cause intractable seizure disorder and death.”, and four nodes are
checked.]

   The second question, concerning how a program will determine propositions’ argument
roles in case texts, corresponds to the third task that Mochales and Moens addressed
with a rule-based grammar applied to a small set of documents. While their rules
employed some features particular to legal argument (e.g., whether a sentence referred
to a legal article), one imagines that additional features would be needed, pertaining
to legal argument or to the regulated domain of interest. These features would become
the predicates of additional grammar rules or be annotated in training cases for
purposes of machine learning.
   The legal argument roles listed above are a first cut at a more comprehensive
enumeration of the types of legal argument features with which to annotate legal case
texts in an Unstructured Information Management Architecture (UIMA) annotation pipeline
for purposes of extracting argument information and improving legal IR.
   UIMA, an open-source Apache framework, has been deployed in several large-scale
government-sponsored and commercial text processing applications, most notably, IBM’s
Watson question answering system (Epstein et al., 2012). A UIMA
pipeline is an assemblage of integrated text annotators. The annotators are “a scalable
set of cooperating software programs, . . . , which assign semantics to some region of
text” (Ferrucci, 2012), and “analyze text and produce annotations or assertions about
the text” (Ferrucci et al., 2010, p. 74).
   A coordinated type system serves as the basis of communication among these
annotators; a type system embodies a formalization of the annotators’ analysis input
and output data (Epstein et al., 2012, p. 3). In (Ashley and Walker, 2013b) and (Ashley
and Walker, 2013a), the authors elaborate three additional bases for annotations, which,
with further refinement, may serve as a conceptual substrate for the annotation types
listed above:

  1. DLF annotations, as suggested in Figure 3, capture “(i) the applicable statutory
     and regulatory requirements as a tree of authoritative rule conditions (i.e., a
     “rule tree”) and (ii) the chains of reasoning in the legal decision that connect
     evidentiary assertions to the special master’s findings of fact on those rule
     conditions (Walker et al., 2011).”

  2. Annotations in terms of presuppositional information that “identifies entities
     (e.g., types of vaccines or injuries), events (e.g., date of vaccination or onset
     of symptoms) and relations among them used in vaccine decisions to state testimony
     about causation, assessments of probative value, and findings of fact” (Ashley and
     Walker, 2013a).

  3. Annotations of argument patterns based on: inference type (e.g., deductive or
     statistical), evidence type (e.g., legal precedent, policy, fact testimony), or
     type of weighing of source credibility to resolve evidentiary discrepancies (e.g.,
     in terms of expert vs. expert or of adequacy of explanation) (Walker et al., 2014).

   If we succeed in designing a system of coordinated legal annotation types and
operationalizing a UIMA annotation pipeline, we envision adding a module to a full-text
legal IR system. At retrieval time it would extract semantic / pragmatic legal
information from the top n cases returned by a traditional IR search and re-rank
returned cases to reflect the user’s diagrammatically specified argument need. The
module would also summarize highly ranked cases and highlight argument-related
information (Ashley and Walker, 2013a). Since the module processes the texts of cases
returned by the information retrieval system, no special knowledge representation of
the cases in the IR system database is required; the knowledge representation
bottleneck will have been circumvented.

7   Conclusion

According to Wittgenstein, meaning lies in the way knowledge is used. Legal argument
models and argument schemes can specify roles for legal propositions to play (and,
interestingly, Stephen Toulmin was a student of Wittgenstein). Thus, researchers can
enable machines to search for and use legal knowledge intelligently in order, among
other things, to improve legal information retrieval.
   Although IBM Debater may identify argument propositions (e.g., claims), legal
argument schemes could help it to address legal rules and concepts, standards of proof,
internal support and
attack relations, citation of statutory and case authorities, attribution, and
plausibility. Open questions include the extent to which legal expert knowledge will be
needed in order to operationalize argument schemes to extract arguments from legal case
texts.

Acknowledgments

My colleagues Vern Walker, Matthias Grabmair, and Eric Nyberg make this work possible.

References

K. Ashley and S. Brüninghaus. 2009. Automatically classifying case texts and predicting
  outcomes. Artificial Intelligence and Law, pages 125–165.

K. Ashley and V. Walker. 2013a. From information retrieval (IR) to argument retrieval
  (AR) for legal cases: Report on a baseline study. In K. Ashley, editor, JURIX, volume
  259 of Frontiers in Artificial Intelligence and Applications, pages 29–38. IOS Press.

K. Ashley and V. Walker. 2013b. Toward constructing evidence-based legal arguments
  using legal decision documents and machine learning. In Proc. 14th Int’l Conf. on
  Artificial Intelligence and Law, ICAIL ’13, pages 176–180, New York, NY, USA. ACM.

S. Beck. 2014. Emerging technology shapes future of law.
  http://www.americanlawyer.com/id=1202664266769/Emerging-Technology-Shapes-Future-of-Law.
  Accessed: 2014-09-20.

J. Daniels and E. Rissland. 1997. Finding legally relevant passages in case opinions.
  In ICAIL, pages 39–46.

J. Dick and G. Hirst. 1991. A case-based representation of legal text for conceptual
  retrieval. In Proceedings, Workshop on Language and Information Processing, American
  Society for Information Science, pages 93–102.

EA Epstein, MI Schor, BS Iyer, A. Lally, EW Brown, and J. Cwiklik. 2012. Making Watson
  fast. IBM J. Res. and Dev., 56(3.4):15–1.

J. Fan, A. Kalyanpur, DC Gondek, and DA Ferrucci. 2012. Automatic knowledge extraction
  from documents. IBM J. Res. and Dev., 56(3.4):5–1.

V. Feng and G. Hirst. 2011. Classifying arguments by scheme. In Dekang Lin, Yuji
  Matsumoto, and Rada Mihalcea, editors, ACL, pages 987–996. The Association for
  Computer Linguistics.

D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A. Kalyanpur, A. Lally,
  J. W. Murdock, E. Nyberg, J. Prager, N. Schlaefer, and C. Welty. 2010. Building
  Watson: An overview of the DeepQA project. AI Magazine, 31(3):59–79.

D. Ferrucci. 2012. Introduction to “This is Watson”. IBM J. Res. and Dev.,
  56(3.4):1–1.

T. Gonçalves and P. Quaresma. 2005. Is linguistic information relevant for the
  classification of legal texts? In Proc. 10th Int’l Conf. on AI and Law, ICAIL ’05,
  pages 168–176, NY, NY. ACM.

B. Hachey and C. Grover. 2006. Extractive summarisation of legal texts. Artificial
  Intelligence and Law, 14(4):305–345.

P. Jackson, K. Al-Kofahi, A. Tyrrell, and A. Vachher. 2003. Information extraction from
  case law and retrieval of prior cases. Artificial Intelligence, 150(1-2):239–290,
  November.

L.T. McCarty. 2007. Deep semantic interpretations of legal texts. In Proc. 11th Int’l
  Conf. on AI and Law, ICAIL ’07, pages 217–224, NY, NY. ACM.

R. Mochales and M.-F. Moens. 2011. Argumentation mining. Artificial Intelligence and
  Law, 19(1):1–22.

H. Prakken. 2005. AI & Law, logic and argument schemes. Argumentation, 19(3):303–320.

M. Saravanan and B. Ravindran. 2010. Identification of rhetorical roles for
  segmentation and summarization of a legal judgment. Artificial Intelligence and Law,
  18(1):45–76.

A. Sergeant. 2013. Automatic argumentation extraction. In P. Cimiano et al., editors,
  ESWC, volume 7882 of Lecture Notes in Computer Science, pages 656–660. Springer.

P. Thompson. 2001. Automatic categorization of case law. In Proc. 8th Int’l Conf. on AI
  and Law, ICAIL ’01, pages 70–77, NY, NY. ACM.

C. Uyttendaele, M.-F. Moens, and J. Dumortier. 1998. Salomon: Automatic abstracting of
  legal cases for effective access to court decisions. Artificial Intelligence and Law,
  6(1):59–79.

V. Walker, N. Carie, C. DeWitt, and E. Lesh. 2011. A framework for the extraction and
  modeling of fact-finding reasoning from legal decisions: Lessons from the
  vaccine/injury project corpus. Artificial Intelligence and Law, pages 291–331.

V. Walker, K. Vazirova, and C. Sanford. 2014. Annotating patterns of reasoning about
  medical theories of causation in vaccine cases: Toward a type system for arguments.
  In Proc. 1st Workshop on Argumentation Mining, ACL 2014.

A. Wyner and W. Peters. 2010. Lexical semantics and expert legal knowledge towards the
  identification of legal case factors. In Proc. 23d Conf. on Legal Knowledge and
  Information Systems: JURIX 2010, pages 127–136, Amsterdam. IOS Press.