Applying Argument Extraction to Improve Legal Information Retrieval

Kevin D. Ashley
University of Pittsburgh School of Law
Pittsburgh, Pennsylvania, USA 15260
ashley@pitt.edu

Abstract

Argument extraction techniques can likely improve legal information retrieval. Any effort to achieve that goal should take into account key features of legal reasoning such as the importance of legal rules and concepts, support and attack relations among claims, and citation of authoritative sources. Annotation types reflecting these key features will help identify the roles of textual elements in retrieved legal cases in order to better inform assessments of relevance for users' queries. As a result, legal argument models and argument schemes will likely play a central part in the text annotation type system.

1 Introduction

With improved prospects for automatically extracting arguments from text, we are investigating whether and how argument extraction can improve legal information retrieval (IR). An immediate question in that regard is the role that argument models and argument schemes will play in achieving this goal.

For some time, researchers in Artificial Intelligence and Law have developed argument models, formal and dialectical process models to describe arguments and their relations. They have also implemented these models in computer programs that construct legal arguments. Some of these models employ argument schemes to provide semantics and describe reasonable arguments. Each scheme corresponds to a typical domain-specific inference sanctioned by the argument, a kind of prima facie reason for believing the argument's conclusion. See (Prakken, 2005, p. 234).

By and large, however, these argument models and schemes and their computational implementations have not had much of a practical effect on legal practice. A primary reason for this is the well-known bottleneck in representing knowledge from the legal texts (e.g., statutes, regulations, and cases) that play such an important role in legal practice in a form that the computational implementations can reason with.

Meanwhile, legal information retrieval systems have proven to be highly functional. They provide legal practitioners with convenient access to millions of legal texts without relying on argument models or schemes, relying instead on Bayesian statistical inference based on term frequency. Users of legal information systems can submit queries in the form of a natural language description of a desired fact pattern and retrieve numerous relevant cases.

Useful as they are, however, legal information retrieval systems do not provide all of the functionality that practitioners could employ. What IR system users often want "is not merely IR, but AR", that is, "argument retrieval: not merely sentences with highlighted terms, but arguments and argument-related information. For example, users want to know what legal or factual issues the court decided, what evidence it considered relevant, what outcomes it reached, and what reasons it gave." (Ashley and Walker, 2013a).

Recently, IBM announced its Debater project, an argument construction engine which, given a corpus of unstructured text like Wikipedia, can automatically construct a set of relevant pro/con arguments phrased in natural language. Debater is built upon the foundation of IBM's Jeopardy-game-winning Watson question answering system, and its advent raises some interesting related questions.

A central hypothesis of the Watson project was that questions could be answered based on shallow syntactic knowledge and its implied semantics. This was preferred to formally represented deep semantic knowledge, the acquisition of which is difficult and expensive (Fan et al., 2012). If Debater is applied to legal domains (see, e.g., (Beck, 2014)), one wonders to what extent the same will be true of Debater. In particular, to what extent will explicit argumentation models and their schemes for the legal domain be necessary or useful for the effort to extract legal arguments? And can techniques in Debater be adapted to improve legal IR?

2 Related Work

The seminal work on extracting arguments and argument-related information from legal case decisions is (Mochales and Moens, 2011). Operationally, the authors defined an argument as "a set of propositions, all of which are premises except, at most, one, which is a conclusion. Any argument follows an argumentation scheme. . . ." Using machine learning based on manually classified sentences from the Araucaria corpus, including court reports, they achieved good performance on classifying sentences as propositions in arguments or not and classifying argumentative propositions as premises or conclusions. Given a limited set of documents, their manually-constructed rule-based argument grammar also generated argument tree structures (Mochales and Moens, 2011).

In identifying argumentative propositions, Mochales and Moens achieved accuracies of 73% and 80% on two corpora, employing domain-general features (including, e.g., each word, pairs of words, pairs and triples of successive words, parts of speech including adverbs, verbs, modal auxiliaries, punctuation, keywords indicating argumentation, parse tree depth and number of subclauses, and certain text statistics). For classifying argumentative propositions as premises or conclusions, their features included the sentence's length and position in the document, tense and type of main verb, previous and successive sentences' categories, a preprocessing classification as argumentative or not, and the type of rhetorical patterns occurring in the sentence and surrounding sentences (i.e., Support, Against, Conclusion, Other or None). Additional features more particular to the legal domain included whether the sentence referred to or defined a legal article, the presence of certain argumentative patterns (e.g., "see", "mutatis mutandis", "having reached this conclusion", "by a majority") and whether the agent of the sentence is the plaintiff, the defendant, the court or other (Mochales and Moens, 2011).

Factors, stereotypical fact patterns that strengthen or weaken a side's argument in a legal claim, have been identified in text automatically. Using a HYPO-style CBR program and an IR system relevance feedback module, the SPIRE program retrieved legal cases from a text corpus and highlighted passages relevant to bankruptcy law factors (Daniels and Rissland, 1997). The SMILE+IBP program learned to classify case summaries in terms of applicable trade secret law factors (Ashley and Brüninghaus, 2009), analyzed automatically classified squibs of new cases, predicted outcomes, and explained the predictions. (Wyner and Peters, 2010) presents a scheme for annotating 39 trade secret case texts with GATE in terms of finer grained components (i.e., factoroids) of a selection of factors.

Using an argument model to assist in representing cases for conceptual legal information retrieval was explored in (Dick and Hirst, 1991). More recently, other researchers have addressed automatic semantic processing of case decision texts for legal IR, achieving some success in automatically:

• assigning rhetorical roles to case sentences based on 200 manually annotated Indian decisions (Saravanan and Ravindran, 2010),

• categorizing legal cases by abstract Westlaw categories (e.g., bankruptcy, finance and banking) (Thompson, 2001) or general topics (e.g., exceptional services pension, retirement) (Gonçalves and Quaresma, 2005),

• extracting treatment history (e.g., "affirmed", "reversed in part") (Jackson et al., 2003),

• determining the role of a sentence in the legal case (e.g., as describing the applicable law or the facts) (Hachey and Grover, 2006),

• extracting offenses raised and legal principles applied from criminal cases to generate summaries (Uyttendaele et al., 1998),

• extracting case holdings (McCarty, 2007), and

• extracting argument schemes from the Araucaria corpus such as argument from example and argument from cause to effect (Feng and Hirst, 2011).

We aim to develop and evaluate an integrated approach using both semantic and pragmatic (contextual) information to retrieve arguments from legal texts in order to improve legal information retrieval. We are working with an underlying argumentation model and its schemes, the Default Logic Framework (DLF), and a corpus of U.S. Federal Claims Court cases (Walker et al., 2011; Walker et al., 2014; Ashley and Walker, 2013a). Like (Mochales and Moens, 2011) and (Sergeant, 2013), we plan to:

1. Train an annotator to automatically identify propositions in unseen legal case texts,

2. Distinguish argumentative from non-argumentative propositions and classify them as premises or conclusions,

3. Employ rule-based or machine learning models to construct argument trees from unseen cases based on a manually annotated training corpus, but also to

4. Use argument trees to improve legal information retrieval reflecting the uses of propositions in arguments.

Before sketching our approach for the legal domain, however, we note that IBM appears to have developed more domain-independent techniques for identifying propositions in documents and classifying them as premises in its Debater system.¹

¹ See, e.g., http://finance.yahoo.com/blogs/the-exchange/ibm-unveils-a-computer-than-can-argue-181228620.html. A demo appears at the 45 minute mark: http://io9.com/ibms-watson-can-now-debate-its-opponents-1571837847.

On any topic, the Debater's task is to "detect relevant claims" and return its "top predictions for pro claims and con claims." On inputting the topic, "The sale of violent videogames to minors should be banned," for example, Debater:

(1) scanned 4 million Wikipedia articles,
(2) returned the 10 most relevant articles,
(3) scanned the 3000 sentences in those 10 articles,
(4) detected those sentences that contained "candidate claims",
(5) "identified borders of candidate claims",
(6) "assessed pro and con polarity of candidate claims",
(7) "constructed a demo speech with top claim predictions", and
(8) was then "ready to deliver!"

Figure 1 shows an argument diagram constructed manually from the video recording of Debater's oral output for the example topic.

Topic (root node): The sale of violent videogames to minors should be banned.

Pro:
• Exposure to violent videogames results in increased physiological arousal, aggression-related thoughts and feelings, as well as decreased pro-social behavior.
• In addition these violent games or lyrics actually cause adolescents to commit acts of real life aggression.
• Finally, violent video games can increase children's aggression.

Con:
• On the other hand, I would like to note the following claims that oppose the topic. Violence in videogames is not causally linked with aggressive tendencies.
• In addition, most children who play violent videogames do not have problems
• Finally, video game play is part of an adolescent boy's normal social setting.

Figure 1: Argument Diagram of IBM Debater's Output for Violent Video Games Topic

3 Key Elements of Legal Argument

Debater's argument regarding banning violent video games is meaningful, but compare it to the legal argument concerning a similar topic in Figure 2. The Court in Video Software Dealers Assoc. v. Schwarzenegger, 556 F.3d 950 (9th Cir. 2009), addressed the issue of whether California (CA) Civil Code sections 1746-1746.5 (the "Act"), which restrict sale or rental of "violent video games" to minors, were unconstitutional under the 1st and 14th Amendments of the U.S. Constitution. The Court held the Act unconstitutional. As a presumptively invalid content-based restriction on speech, the Act is subject to strict scrutiny and the State has not demonstrated a compelling interest.

In particular, the Court held that CA had not demonstrated a compelling government interest that "the sale of violent video games to minors should be banned." Figure 2 shows excerpts from the portion of the opinion in which the Court justifies this conclusion. The nodes contain propositions from that portion and the arcs reflect the explicit or implied relations among those propositions based on a fair reading of the text.

The callout boxes in Figure 2 highlight some key features of legal argument illustrated in the Court's argument:

1. Legal rules and concepts govern a court's decision of an issue.

2. Standards of proof govern a court's assessment of evidence.

3. Claims have support / attack relations.

4. Authorities are cited (e.g., cases, statutes).

5. Attribution information signals or affects judgments about belief in an argument (e.g., "the State relies").

6. Candidate claims in a legal document have different plausibility.

[Figure 2 diagram not reproduced. Surviving callout labels: 1. rule and legal concepts; 2. standard of proof; 3. support / attack relations; 5. attribution info; 6. plausibility.]

Figure 2: Diagram Representing Realistic Legal Argument Involving Violent Video Games Topic

Although the argument diagrams in Figures 1 and 2 address nearly the same topic and share similar propositions, the former obviously lacks these features that would be important in legal argument (and, as argued later, important in using extracted arguments to improve legal IR). Of course, on one level this is not surprising; the Debater argument is not and does not purport to be a legal argument.

On the other hand, given the possibility of applying Debater to legal applications and argumentation, it would seem essential that it be able to extract such key information. In that case, the question is the extent to which explicit argument models and argument schemes of legal reasoning would be useful in order to assist with the extraction of the concepts, relationships, and information enumerated above and illustrated in Figure 2.

4 Default-Logic Framework

Vern Walker's Default Logic Framework (DLF) is an argument model plus schemes for evidence-based legal arguments concerning compliance with legal rules. At the Research Laboratory for Law, Logic and Technology (LLT Lab) at Hofstra University, researchers have applied the DLF to model legal decisions by Court of Federal Claims "Special Masters" concerning whether claimants' compensation claims comply with the requirements of a federal statute establishing the National Vaccine Injury Compensation Program. Under the Act, a claimant may obtain compensation if and only if the vaccine caused the injury.

In order to establish causation under the rule of Althen v. Secr. of Health and Human Services, 418 F.3d 1274 (Fed.Cir. 2005), the petitioner must establish by a preponderance of the evidence that: (1) a "medical theory causally connects" the type of vaccine with the type of injury, (2) there was a "logical sequence of cause and effect" between the particular vaccination and the particular injury, and (3) a "proximate temporal relationship" existed between the vaccination and the injury. Walker's corpus comprises all decisions in a 2-year period applying the Althen test of causation-in-fact (35 decision texts, 15-40 pages per decision). In these cases, the Special Masters decide which evidence is relevant to which issues of fact, evaluate the plausibility of evidence in the legal record, organize evidence and draw reasonable inferences, and make findings of fact.

The DLF model of a single case "integrates numerous units of reasoning" each "consisting of one conclusion and one or more immediately supporting reasons (premises)" and employing four types of connectives (min (and), max (or), evidence factors, and rebut) (Walker et al., 2014). For example, Figure 3 shows an argument diagram representing the excerpt of the DLF model of the special master's finding in the case of Cusati v. Secretary of Health and Human Services, No. 99-0492V (Office of Special Masters, United States Court of Federal Claims, September 22, 2005) concerning whether the first Althen condition for showing causation-in-fact is satisfied.

The main point is that the DLF model of a legal argument and its argument schemes represent the above-enumerated key features of legal argument. As illustrated in the callout boxes of Figure 3, the model indicates: (1) the 1st Althen rule and causation-in-fact concept that govern the decision of the causation issue, (2) the preponderance of evidence standard of proof governing the court's assessment, (3) support relations among the propositions, the Special Master having recorded no counterarguments, (4) citation to the statute (42 USC 300aa-11(c)(1)(C)(ii)) and to the Althen and Shyface case authorities, (5) some attribution information that signals judgments about the Special Master's belief in an argument (e.g., "Dr. Kinsbourne and Dr. Kohrman agree"), and (6) four factors that increase plausibility of the claim of causation.

[Figure 3 diagram not reproduced. Callout labels: 1. rule and legal concepts; 2. standard of proof; 3. support / attack relations (no attacks here); 4. citation of authorities; 5. attribution info; 6. plausibility. Node texts:]

• AND [1 of 2]: The injury of Eric Fernandez "was [or were] caused by" the MMR vaccine received in the vaccination on November 5, 1996 (42 USC 300aa-11(c)(1)(C)(ii)).
• OR [2 of 2]: OFF-TABLE INJURY: The "causation-in-fact" condition is satisfied (Althen, 418 F.3d at 1278, 1281).
• The MMR vaccine was "not only a but-for cause" of an intractable seizure disorder and death, "but also a substantial factor in bringing about" an intractable seizure disorder and death (Shyface, 165 F.3d at 1352-53; Althen, 418 F.3d at 1278).
• AND [1 of 3]: (1) A "medical theory causally connect[s]" the vaccination on 11-5-96 and an intractable seizure disorder and death (Althen, 418 F.3d at 1278).
• (marked Q1) "MMR vaccine causes fever" and "fever causes seizures." "Ms. Cusati has provided more than preponderant evidence".
• FACTOR [1 of 4]: "MMR vaccine causes fever." Dr. Kinsbourne and Dr. Kohrman agree that MMR vaccine causes fever.
• FACTOR [2 of 4]: "[F]ever causes seizures." Dr. Kinsbourne and Dr. Kohrman agree that fever causes seizures.
• FACTOR [3 of 4]: "[A] child who suffers a complex febrile seizure has a greater chance of developing epilepsy."
• FACTOR [4 of 4]: "[T]he medical literature ... do[es] not assist the special master in evaluating Ms. Cusati's 'legal cause' claim."

Figure 3: Diagram of DLF Model of Special Master's Finding in Cusati Case re 1st Althen Condition

5 Legal Argument and Legal IR

Legal decisions contain propositions and arguments about how to "prove" them. Prior cases provide examples of how to make particular arguments in support of similar hypotheses and of kinds of arguments that have succeeded, or failed, in the past. Consider a simple query discussed in (Ashley and Walker, 2013a): Q1: "MMR vaccine can cause intractable seizure disorder and death."

An attorney/user in a new case where an injury followed an MMR vaccination might employ this query to search for cases where such propositions had been addressed. Relevant cases would add confidence that the propositions and accompanying arguments were reasonable and had been successful.

Importantly, the cases retrieved will be more relevant to the extent that the proposition is used in a similar argument. That is, they will be more relevant to the extent that the proposition plays roles in the case arguments similar to the role in which the attorney intends to use it in an argument about the current case.

An argument diagram like that of Figure 3 can illustrate the effect of the six key elements of legal reasoning illustrated above on how relevant a retrieved case is to a user's query. The diagram shows a legal argument in which the proposition corresponding to Q1 plays a role in the Cusati case as an evidence-based finding of the Special Master, namely, that "MMR vaccine causes fever" and "fever causes seizures."

Such diagrams have a "legal rule-oriented" direction (i.e., to the left in Figure 3) and an "evidentiary factors-oriented" direction (i.e., to the right in this diagram). For instance, an attorney whose client sustained seizures after receiving the MMR vaccine probably knows that he/she will have to satisfy a requirement of causation. The attorney may not know, however, what legal standard defines the relevant concept of causation or what legal authority may be cited as an authoritative source of the standard. In that situation, retrieved cases will likely be more relevant to the extent that they fill in the legal rule-oriented direction, relative to a proposition similar to the one marked "Q1", with legal rules about the concept of causation and citations to their authoritative sources.

If the attorney is unsure of the kinds of evidence that an advocate should employ in convincing a Special Master to make the finding of fact on causation or of the relevant standard of proof for assessing that evidence of causation, retrieved cases will be more relevant to the extent that they fill in the evidentiary factors-oriented direction, relative to a proposition similar to the one marked "Q1", with evidentiary factors and an identification of the standard of proof.
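
The two directions described above can be illustrated with a small scoring sketch. This is only an illustration under stated assumptions, not the system's implementation: the `AnnotatedCase` record, its field names, and the simple additive score are hypothetical, and the sketch presumes that some upstream annotator has already labeled what each retrieved case says around a proposition similar to Q1.

```python
from dataclasses import dataclass, field


# Hypothetical record of what an annotator found around a proposition
# similar to Q1 in one retrieved case. All names here are assumptions.
@dataclass
class AnnotatedCase:
    name: str
    legal_rules: list = field(default_factory=list)          # rule-oriented direction
    citations: list = field(default_factory=list)            # authoritative sources
    evidentiary_factors: list = field(default_factory=list)  # factors-oriented direction
    standard_of_proof: str = ""                               # e.g. "preponderance of the evidence"


def direction_score(case, want_rules, want_factors):
    """Count how many of the requested slots the case fills.

    An illustrative additive score; the paper does not specify a formula.
    """
    score = 0
    if want_rules:
        score += bool(case.legal_rules) + bool(case.citations)
    if want_factors:
        score += bool(case.evidentiary_factors) + bool(case.standard_of_proof)
    return score


def rerank(cases, want_rules=False, want_factors=False):
    """Order cases by how well they fill the directions the user asked for."""
    return sorted(
        cases,
        key=lambda c: direction_score(c, want_rules, want_factors),
        reverse=True,
    )


cases = [
    AnnotatedCase("Case A", evidentiary_factors=["MMR vaccine causes fever"]),
    AnnotatedCase("Case B", legal_rules=["Althen condition 1"], citations=["418 F.3d 1274"]),
]

# An attorney who needs the governing rule and its source asks for the
# rule-oriented direction; Case B then outranks Case A.
print([c.name for c in rerank(cases, want_rules=True)])
```

An attorney who instead needs examples of supporting evidence would call `rerank(cases, want_factors=True)`, which favors cases that supply evidentiary factors and a standard of proof.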
The attorney may be interested in better understanding how to improve the plausibility of a proposition about causation as an evidence-based finding. Cases will be more relevant to the extent that they contain evidentiary factors that support such a finding. An attorney interested in attacking the plausibility of the evidence-based finding might be especially interested in seeing cases involving examples of evidentiary factors that attack such a finding.

Finally, the cases will be more relevant to the extent that the proposition similar to the one marked "Q1" concerning MMR vaccine's causing injury is attributable to the Special Master as opposed merely to some expert witness's statement.

6 Specifying/Determining Propositions' Argument Roles

The importance of a proposition's argument role in matching retrieved cases to users' queries raises two questions: (1) How does the user specify the target propositions of interest and their argumentative roles? (2) How does a program determine the roles that propositions play in retrieved case arguments?

An argument diagram like that of Figure 3 may play a role in enabling users to specify the arguments and propositions in which they are interested. One can imagine a user's inputting a query by employing a more abstract version of such a diagram. For instance, in the Query Input Diagram of Figure 4, the nodes are labeled with, or refer to, argument roles. These roles include:

Legal Rule: sentences that state a legal rule in the abstract, without applying the rule to the particular case being litigated

Ruling/Holding: sentences that apply a legal rule to decide issues presented in the particular case being litigated

Evidence-Based Finding: sentences that report a trier-of-fact's ultimate findings regarding facts material to the particular case being litigated

Evidence-Based Reasoning: sentences that report the trier-of-fact's reasoning in assessing the relevant evidence and reaching findings regarding facts material to the particular case being litigated (e.g., evidentiary factors)

Evidence: sentences that describe any type of evidence legally produced in the particular case being litigated, as part of the proof intended to persuade the trier-of-fact of alleged facts material to the case (e.g., oral testimony of witnesses, including experts on technical matters; documents, public records, depositions; objects and photographs)

Citation: sentences that credit and refer to authoritative documents and sources (e.g., court decisions (cases), statutes, regulations, government documents, treaties, scholarly writing, evidentiary documents)

In the "text", "concept", and "citation" slots of the appropriate nodes of the query input diagram, Figure 4, users could specify the propositions, concepts, or citations that they know or assume and check the targeted nodes in the directions (rule-oriented or evidentiary-factors-oriented) or ranges that they hope to fill through searching for cases whose texts satisfy the diagram's argument-related constraints. In effect, the diagram will guide the IR system in ranking the retrieved cases for relevance and in highlighting their relevant parts.

[Figure 4 diagram not reproduced. Node labels: Primary Legal Rules, Secondary Legal Rules, Ruling/Holding, Evidence-Based Finding, Evidence-Based Reasoning, Evidence. The sample query fills one "text" slot with "MMR vaccine can cause intractable seizure disorder and death." and several "concepts" slots with "causation"; the "citations" slots are empty, and four nodes carry check marks.]

Figure 4: Sample Query Input Diagram

The second question, how a program will determine propositions' argument roles in case texts, corresponds to the third task that Mochales and Moens addressed with a rule-based grammar applied to a small set of documents. While their rules employed some features particular to legal argument (e.g., whether a sentence referred to a legal article), one imagines that additional features would be needed, pertaining to legal argument or to the regulated domain of interest. These features would become the predicates of additional grammar rules or be annotated in training cases for purposes of machine learning.

The legal argument roles listed above are a first cut at a more comprehensive enumeration of the types of legal argument features with which to annotate legal case texts in an Unstructured Information Management Architecture (UIMA) annotation pipeline for purposes of extracting argument information and improving legal IR.

UIMA, an open-source Apache framework, has been deployed in several large-scale government-sponsored and commercial text processing applications, most notably, IBM's Watson question answering system (Epstein et al., 2012). A UIMA pipeline is an assemblage of integrated text annotators. The annotators are "a scalable set of cooperating software programs, . . . , which assign semantics to some region of text" (Ferrucci, 2012), and "analyze text and produce annotations or assertions about the text" (Ferrucci et al., 2010, p. 74).

A coordinated type system serves as the basis of communication among these annotators; a type system embodies a formalization of the annotators' analysis input and output data (Epstein et al., 2012, p. 3). In (Ashley and Walker, 2013b) and (Ashley and Walker, 2013a) the authors elaborate three additional bases for annotations, which, with further refinement, may serve as a conceptual substrate for the annotation types listed above:

1. DLF annotations, as suggested in Figure 3, capture "(i) the applicable statutory and regulatory requirements as a tree of authoritative rule conditions (i.e., a "rule tree") and (ii) the chains of reasoning in the legal decision that connect evidentiary assertions to the special master's findings of fact on those rule conditions (Walker et al., 2011)."

2. Annotations in terms of presuppositional information that "identifies entities (e.g., types of vaccines or injuries), events (e.g., date of vaccination or onset of symptoms) and relations among them used in vaccine decisions to state testimony about causation, assessments of probative value, and findings of fact." (Ashley and Walker, 2013a).

3. Annotations of argument patterns based on: inference type (e.g., deductive or statistical), evidence type (e.g., legal precedent, policy, fact testimony), or type of weighing of source credibility to resolve evidentiary discrepancies (e.g., in terms of expert vs. expert or of adequacy of explanation) (Walker et al., 2014).

If we succeed in designing a system of coordinated legal annotation types and operationalizing a UIMA annotation pipeline, we envision adding a module to a full-text legal IR system. At retrieval time it would extract semantic / pragmatic legal information from the top n cases returned by a traditional IR search and re-rank returned cases to reflect the user's diagrammatically specified argument need. The module would also summarize highly ranked cases and highlight argument-related information (Ashley and Walker, 2013a). Since the module processes the texts of cases returned by the information retrieval system, no special knowledge representation of the cases in the IR system database is required; the knowledge representation bottleneck will have been circumvented.

7 Conclusion

According to Wittgenstein, meaning lies in the way knowledge is used. Legal argument models and argument schemes can specify roles for legal propositions to play (and, interestingly, Stephen Toulmin was a student of Wittgenstein). Thus, researchers can enable machines to search for and use legal knowledge intelligently in order, among other things, to improve legal information retrieval.

Although IBM Debater may identify argument propositions (e.g., claims), legal argument schemes could help it to address legal rules and concepts, standards of proof, internal support and attack relations, citation of statutory and case authorities, attribution, and plausibility. Open questions include the extent to which legal expert knowledge will be needed in order to operationalize argument schemes to extract arguments from legal case texts.

Acknowledgments

My colleagues Vern Walker, Matthias Grabmair, and Eric Nyberg make this work possible.

References

K. Ashley and S. Brüninghaus. 2009. Automatically classifying case texts and predicting outcomes. Artificial Intelligence and Law, pages 125–165.

K. Ashley and V. Walker. 2013a. From information retrieval (IR) to argument retrieval (AR) for legal cases: Report on a baseline study. In K. Ashley, editor, JURIX, volume 259 of Frontiers in Artificial Intelligence and Applications, pages 29–38. IOS Press.

K. Ashley and V. Walker. 2013b. Toward constructing evidence-based legal arguments using legal decision documents and machine learning. In Proc. 14th Int'l Conf. on Artificial Intelligence and Law, ICAIL '13, pages 176–180, New York, NY, USA. ACM.

S. Beck. 2014. Emerging technology shapes future of law. http://www.americanlawyer.com/id=1202664266769/Emerging-Technology-Shapes-Future-of-Law. Accessed: 2014-09-20.

J. Daniels and E. Rissland. 1997. Finding legally relevant passages in case opinions. In ICAIL, pages 39–46.

J. Dick and G. Hirst. 1991. A case-based representation of legal text for conceptual retrieval. In Proceedings, Workshop on Language and Information Processing, American Society for Information Science, pages 93–102.

EA Epstein, MI Schor, BS Iyer, A. Lally, EW Brown, and J. Cwiklik. 2012. Making Watson fast. IBM J. Res. and Dev., 56(3.4):15–1.

J. Fan, A. Kalyanpur, DC Gondek, and DA Ferrucci. 2012. Automatic knowledge extraction from documents. IBM J. Res. and Dev., 56(3.4):5–1.

V. Feng and G. Hirst. 2011. Classifying arguments by scheme. In Dekang Lin, Yuji Matsumoto, and Rada Mihalcea, editors, ACL, pages 987–996. The Association for Computer Linguistics.

D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A. Kalyanpur, A. Lally, J. W. Murdock, E. Nyberg, J. Prager, N. Schlaefer, and C. Welty. 2010. Building Watson: An overview of the DeepQA project. AI Magazine, 31(3):59–79.

D. Ferrucci. 2012. Introduction to "This is Watson". IBM J. Res. and Dev., 56(3.4):1–1.

T. Gonçalves and P. Quaresma. 2005. Is linguistic information relevant for the classification of legal texts? In Proc. 10th Int'l Conf. on AI and Law, ICAIL '05, pages 168–176, NY, NY. ACM.

B. Hachey and C. Grover. 2006. Extractive summarisation of legal texts. Artificial Intelligence and Law, 14(4):305–345.

P. Jackson, K. Al-Kofahi, A. Tyrrell, and A. Vachher. 2003. Information extraction from case law and retrieval of prior cases. Artificial Intelligence, 150(1-2):239–290, November.

L.T. McCarty. 2007. Deep semantic interpretations of legal texts. In Proc. 11th Int'l Conf. on AI and Law, ICAIL '07, pages 217–224, NY, NY. ACM.

R. Mochales and M.-F. Moens. 2011. Argumentation mining. Artificial Intelligence and Law, 19(1):1–22.

H. Prakken. 2005. AI & Law, logic and argument schemes. Argumentation, 19(3):303–320.

M. Saravanan and B. Ravindran. 2010. Identification of rhetorical roles for segmentation and summarization of a legal judgment. Artificial Intelligence and Law, 18(1):45–76.

A. Sergeant. 2013. Automatic argumentation extraction. In P. Cimiano et al., editors, ESWC, volume 7882 of Lecture Notes in Computer Science, pages 656–660. Springer.

P. Thompson. 2001. Automatic categorization of case law. In Proc. 8th Int'l Conf. on AI and Law, ICAIL '01, pages 70–77, NY, NY. ACM.

C. Uyttendaele, M.-F. Moens, and J. Dumortier. 1998. Salomon: Automatic abstracting of legal cases for effective access to court decisions. Artificial Intelligence and Law, 6(1):59–79.

V. Walker, N. Carie, C. DeWitt, and E. Lesh. 2011. A framework for the extraction and modeling of fact-finding reasoning from legal decisions: Lessons from the vaccine/injury project corpus. Artificial Intelligence and Law, pages 291–331.

V. Walker, K. Vazirova, and C. Sanford. 2014. Annotating patterns of reasoning about medical theories of causation in vaccine cases: Toward a type system for arguments. In Proc. 1st Workshop on Argumentation Mining, ACL 2014.

A. Wyner and W. Peters. 2010. Lexical semantics and expert legal knowledge towards the identification of legal case factors. In Proc. 23d Conf. on Legal Knowledge and Information Systems: JURIX 2010, pages 127–136, Amsterdam. IOS Press.