<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predictive Features of Persuasive Legal Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Karl BRANTING</string-name>
          <email>lbranting@mitre.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elizabeth TIPPETT</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Charlotte ALEXANDER</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sam BAYER</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul MORAWSKI</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlos BALHANA</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Craig PFEIFER</string-name>
        </contrib>
        <aff id="aff2">
          <institution>The MITRE Corporation</institution>
          ,
          <addr-line>McLean, VA</addr-line>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>Georgia State University</institution>
          ,
          <addr-line>Atlanta, GA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Oregon</institution>
          ,
          <addr-line>Eugene, OR</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>This paper explores the degree to which variations in the citation patterns, style, and textual content of legal briefs on motions for summary judgment are predictive of rulings on those motions. In an empirical evaluation on a corpus of briefs in support of motions for summary judgment, the most predictive features were several novel graph metrics, including the characteristics of a brief's two-hop neighborhood in a bipartite document/precedent citation graph, and the vertex degree in an Implicit Citation Graph. These results indicate that strong and weak briefs differ systematically in citation patterns. Eight stylistic features were also identified as associated with success (a ruling consistent with the brief), and nine other features were found to have no correlation. Finally, prediction based on text features, such as n-gram models, was found to be only weakly predictive of success.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Recent research has shown that decisions set forth in published opinions can often be
predicted by machine-learning models trained on the texts of the statements of facts in
those opinions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], suggesting the feasibility of automating key portions of the
development of systems for decision support, judicial caseload management, and many
other applications of legal prediction. However, critics of this work have observed that
the statements of fact in published opinions are typically highly selective summaries
of the original case record, tailored for consistency with the decision. More realistic
scenarios would base legal predictions on case records in the form on which adjudicators
themselves actually base their judgments.
      </p>
      <p>One such scenario is legal prediction based on attorneys’ persuasive writings, such
as legal briefs. Attorneys present fact patterns within the context of relevant case law,
and then make legal arguments in favor of the outcome most favorable to their client.
Decision prediction based on attorneys’ briefs may therefore be a closer approximation
of the actual task faced by courts and other adjudicators than prediction based just on fact
statements.</p>
      <p>While it is unrealistic to expect a machine-learning model trained on briefs and
decisions to fully evaluate the merits of legal claims, there may nevertheless be many
characteristics of briefs that make them more or less persuasive to judges. Identifying features
that affect the persuasiveness of briefs could be useful for attorneys (by providing
feedback to improve litigation effectiveness), political scientists and jurisprudential scholars
(to increase understanding of legal processes), litigants (to improve understanding of
factors needed for success), and courts (to identify briefs for triage, benchmarking across
judges, etc.).</p>
      <p>This paper explores the degree to which variations in the citation patterns, style, and
textual content of legal briefs are predictive of subsequent rulings. To reduce confounding
factors, we focus on a single subject matter area—federal employment law—and a single
procedural setting—motions for summary judgment. A corpus of summary judgment
briefs and decisions is described in Section 2. Several novel applications of graph analysis
to corpora of briefs are set forth in Section 3, showing that strong and weak briefs differ
systematically in citation patterns. Section 4 shows that stylistic factors in briefs, while
less predictive than citation patterns, nevertheless have a measurable influence on rulings, and
Section 5 shows that naive approaches to ruling prediction based on the text of briefs are only
weakly predictive. The implications and suggestions for future research are set forth in
Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The Summary Judgment Corpus</title>
      <p>
        The Summary Judgment Corpus (Corpus) consists of a random sample of 864 federal
employment cases involving summary judgment motions in the years 2007-2018. The
cases were drawn from the PACER<sup>2</sup> system, via Bloomberg dockets, and are limited to
those having one of the following Nature of Suit codes: “Civil Rights – employment,”
“Labor – Fair Labor Standards Act,” and “Labor – Family and Medical Leave.” The
experiments described here were limited to 444 cases that include at least an initial brief
and an opposition brief (including reply and surreply briefs, if any) and for which the
motion for summary judgment was either granted in full or denied in full (thus
finessing the complexities of motions granted in part and denied in part). A team of law
students downloaded the briefs and opinions, reviewed each opinion, and coded the result
of the ruling. In 98% of the cases, the defendant/employer filed the motion for summary
judgment.<sup>3</sup> There is significant class skew in the decisions, with about 76% of motions
for summary judgment being granted,<sup>4</sup> so in our experiments we evaluate accuracy
using the Matthews Correlation Coefficient (MCC) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], since it is generally a more informative
measure of predictive performance on skewed data sets than alternative measures.
However, we include the more familiar F-measure as well. The class skew means that
classification based on the majority rule (i.e., if movant, predict “win”; if respondent, predict
“lose”) achieves an MCC of 0.481 and a frequency-weighted F-measure of 0.740.
      </p>
      <p><sup>2</sup> https://pacer.uscourts.gov/</p>
      <p><sup>3</sup> Plaintiffs (employees) make up a somewhat higher proportion of movants overall, but they are
overrepresented among cases in which the judge granted the motion in part and denied it in part. Due to the
procedural posture of motions for summary judgment, it is nearly impossible for the plaintiff in a case to win
an entire case on summary judgment.</p>
      <p><sup>4</sup> This figure is not representative of the success of summary judgment motions overall, as it excludes motions
that were granted in part and denied in part.</p>
      <p>We distinguish two perspectives on formalizing decisions as instances for machine
learning:</p>
      <p>Brief-oriented. In this perspective, each instance consists of a set of features
derived from the briefs filed by a single party, together with the ruling on the motion
that the briefs addressed. The ruling on the motion is labeled as a “win” or “lose”
based on whether the brief supported the party who filed the summary judgment
motion (the movant) or the party who opposed the motion (the respondent). For
example, if the court grants the motion for summary judgment, the movant’s brief
would be labeled a “win” and the respondent’s a “lose.” Conversely, if the court
denies a motion for summary judgment, the respondent’s brief in opposition to
the motion would be labeled a “win,” and the brief by the movant a “lose.”</p>
      <p>Case-oriented. In this perspective, each instance consists of the features derived
from both the initial brief and the opposition brief, and the label consists of
“granted” or “denied.” In this approach, it is generally necessary to tag each
feature to distinguish whether it came from the movant’s or the respondent’s brief.</p>
      <p>In the discussion below, the predicted “outcome” is either “win/lose” in brief-oriented
experiments or “grant/deny” in case-oriented experiments. All results were calculated using
10-fold cross-validation.</p>
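      <p>To make the MCC baseline for the majority rule concrete, the following Python sketch computes MCC from confusion-matrix counts in the brief-oriented setting. It is an illustration only, not the evaluation code used in the experiments: the function name mcc is our own, and the counts (100 cases, 80 granted) are hypothetical rather than taken from the Corpus.</p>
      <preformat preformat-type="code">
```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical data (not the Corpus): 100 cases, 80 granted, 20 denied.
granted, denied = 80, 20

# Majority rule: predict "win" for every movant brief and "lose" for
# every respondent brief.
tp = granted   # movant briefs predicted "win" that did win
fp = denied    # movant briefs predicted "win" that lost
tn = granted   # respondent briefs predicted "lose" that did lose
fn = denied    # respondent briefs predicted "lose" that won

baseline_mcc = mcc(tp, fp, tn, fn)
print(round(baseline_mcc, 3))
```
      </preformat>
      <p>Because each case in this sketch contributes one movant instance and one respondent instance, the two classes are balanced even though the grant/deny outcomes are skewed.</p>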
    </sec>
    <sec id="sec-3">
      <title>3. Prediction from Citations</title>
      <p>In persuasive legal writing, lawyers cite cases to support particular legal propositions in
their briefs. For example, a defendant/employer might assert that resigning is legally
distinct from being fired, and then cite a case in which a court distinguished resigning from
being fired.<sup>5</sup> This suggests that citations can be viewed as proxies for legal arguments
and that the relative effectiveness of alternative arguments can be estimated by measuring
the relative degree of association between the citations representing those arguments and
outcomes. We therefore performed a series of experiments involving outcome prediction
from citations. We used a modification of the CourtListener<sup>6</sup> citation finder code to
identify all citation spans in our corpus.</p>
      <sec id="sec-3-1">
        <title>3.1. Citation Frequency Vectors</title>
        <p>The first hypothesis that we tested was that outcomes could be predicted using a
straightforward representation of briefs as citation frequency vectors, i.e., as vectors in which
each value represents the number of times that a particular precedent was cited in a given
brief. We created a brief-oriented data set in which each movant and respondent brief was
represented by a sparse vector of 15,826 unique integer citation features. As shown in the
first row of Table 1, the predictive accuracy based on this representation is relatively low,
although greater than chance. This weak performance may be unsurprising given that the
number of features is an order of magnitude greater than the number of instances.</p>
        <p><sup>5</sup> See, e.g., Iovanella v. Genentech, Inc., Case 2:09-cv-01024, Defendant Genentech, Inc’s Memorandum of
Law in Support of its Motion for Summary Judgment, Document 37-1 (April 9, 2010) (case analyzed in corpus).</p>
        <p><sup>6</sup> https://www.courtlistener.com/</p>
        <p>The hypothesis that data sparsity contributed to low prediction performance
suggested an experiment in which the feature set was reduced to the 100 highest
information gain (IG) citations. As set forth in the second row of Table 1, this feature reduction
boosted MCC to 0.401. Inspection of the citations with the highest IG indicated that they
were generally proxies for fatal factual defects in a plaintiff’s case. Each one essentially
justified dismissing cases that fall into a certain fact pattern. Citation frequency vectors
are more predictive for the simpler tasks of predicting whether a brief is by the plaintiff
or defendant (MCC = 0.420, F1 = 0.690) and of predicting whether the brief is by the
movant or the respondent (MCC = 0.450, F1 = 0.694).</p>
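        <p>The representation and feature-selection step described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the experiments: the toy briefs, precedent identifiers, and function names are hypothetical, and information gain is computed on a binarized feature (whether a brief cites a precedent at all) rather than on raw citation counts.</p>
        <preformat preformat-type="code">
```python
import math
from collections import Counter


def citation_vector(citations):
    """Map a brief's list of cited precedents to a frequency vector."""
    return Counter(citations)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(vectors, labels, precedent):
    """IG of the binary feature 'brief cites this precedent'."""
    cites = [lab for vec, lab in zip(vectors, labels) if vec[precedent] > 0]
    omits = [lab for vec, lab in zip(vectors, labels) if vec[precedent] == 0]
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (cites, omits) if part)
    return entropy(labels) - conditional


# Hypothetical toy data: four briefs with made-up precedent identifiers.
briefs = [
    ["A", "A", "B"],
    ["A", "C"],
    ["B", "C"],
    ["C"],
]
labels = ["win", "win", "lose", "lose"]
vectors = [citation_vector(b) for b in briefs]

# Rank precedents by information gain and keep the top k (k = 2 here;
# the experiment described above kept 100).
precedents = sorted({p for b in briefs for p in b})
ranked = sorted(precedents,
                key=lambda p: information_gain(vectors, labels, p),
                reverse=True)
top_k = ranked[:2]
```
        </preformat>
        <p>In this toy example, precedent "A" separates winning from losing briefs perfectly and therefore ranks first, while "B" is cited equally by both classes and carries no information.</p>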
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Citation Comparisons</title>
        <p>A second hypothesis is that it is the relationships among the citations by the parties and
the court, rather than the individual citations themselves, that are predictive of outcomes.
Perhaps overlap in citations between a court’s ruling and a party’s brief might be a proxy
for the relative strength of the arguments in that brief. For example, a judge who is
persuaded by the arguments in a brief might choose to incorporate the citations in support
of those arguments.</p>
        <p>To test this hypothesis we calculated the cosine between the citation frequency
vectors of three pairs: (1) movant and respondent; (2) movant and the court; and (3)
respondent and the court. We evaluated grant/deny accuracy on the resulting case-oriented data
set in which each case is represented by these three cosine values. As shown in Table 2,
predictive performance was weak, suggesting that simple citation vector similarity is not
a good proxy for argument strength.</p>
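        <p>The three pairwise similarity features can be illustrated with a short Python sketch. The citation counts and precedent identifiers below are hypothetical, and the function is a plain implementation of cosine similarity over sparse count vectors, not the code used in the experiments.</p>
        <preformat preformat-type="code">
```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse citation frequency vectors."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a document with no citations is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical citation counts for one case (identifiers are made up).
movant = Counter({"A": 2, "B": 1})
respondent = Counter({"B": 1, "C": 3})
court = Counter({"A": 1, "B": 1})

# The three case-level features: movant/respondent, movant/court,
# and respondent/court.
case_features = [
    cosine(movant, respondent),
    cosine(movant, court),
    cosine(respondent, court),
]
```
        </preformat>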
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Graph Analysis</title>
        <p>
          There has been extensive research on graphs derived from collections of homogeneous
documents linked by embedded citations, such as Supreme Court decisions [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] and codes
of statutes or regulations [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. This approach to graph modeling is not directly applicable
to the Corpus, in which the documents that contain citations (briefs) differ from those that
are cited (precedents). We therefore experimented with two novel graph representations:
a bipartite citation graph, consisting of both document (brief or decision) and precedent
nodes linked by “has-citation” edges; and an implicit citation graph, consisting of
briefs that are connected if they share a common citation. We implemented both graphs
in Neo4j.<sup>7</sup>
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>3.3.1. Bipartite Graph Experiments</title>
        <p>Figure 1 shows a fragment of a bipartite graph representation of the Corpus in which each
node represents a document (brief or decision) or a precedent and each edge connects a
document to a precedent that it cites. The label on each brief node is an internal index
followed by “-M,” “-R,” or “-C,” denoting movant documents, respondent documents, or
court documents, respectively.</p>
        <p>The bipartite graph representation is conducive to analytics that exploit locality in
citation space, that is, that compare briefs in terms of similar citation behavior. For each
brief in our graph, we derived the following features:</p>
        <p>1. prob2HopWin. For a given brief of type “R” or “M”, the percentage of all
two-hop brief nodes of the same type whose win/lose value is “win.”</p>
        <p>2. numLevel0Cites. The number of a brief’s citations (i.e., vertex degree in the
bipartite graph).</p>
        <p>3. avgLevel1Cites. The mean number of citations to each precedent that the brief
cites (i.e., mean vertex degree of a brief’s 1-hop neighbors).</p>
        <p>4. numSharedCites. The number of citations shared with the brief of the opposing
party (i.e., the number of 2-hop paths between a brief and the opposing side’s
brief).</p>
        <p>5. avgCiteWinScore. The mean citation win score of each cited precedent, where the
citation win score is the percentage of briefs of the same type (“R” or “M”) citing
that precedent with value “win.” The root brief is excluded from this calculation.</p>
        <p>6. prob2HopGrant and avgCiteGrantScore. Like prob2HopWin and
avgCiteWinScore, but based on “grant” rather than “win.”</p>
        <p>Table 3 shows the information gain from each of the graph features in win prediction
for movant, respondent, and all briefs, and Table 4 shows win prediction performance on
the highest information gain features for each of these 3 sets. It may be unintuitive that
the most predictive features for all briefs differ from those for the movant and respondent
briefs individually. However, as mentioned above, outcome prediction based on the party
alone is sufficient for an MCC of 0.481. Evidently, the difference in citation behavior
between movants and respondents is captured by the graph features with high information
gain for the full set. However, prediction for movant and respondent briefs individually
is conditioned on already knowing the value of the party, so graph features indicative of
party status are less informative. The predictive performance for the movant and
respondent briefs individually, shown at the bottom of Table 4, reflects the residual information
from graph features after party status is factored out.</p>
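        <p>As a rough illustration of how the first four features in the list above might be derived, the following Python sketch uses plain dictionaries in place of the Neo4j graph. The toy briefs, outcomes, and precedent identifiers are hypothetical, the helper names are our own, and prob2HopWin is returned as a fraction rather than a percentage. The sketch also ignores cross-validation: in a real experiment, outcomes of held-out briefs would presumably need to be excluded from these counts.</p>
        <preformat preformat-type="code">
```python
# Hypothetical toy corpus: brief id -> (type, outcome, cited precedents).
briefs = {
    "1-M": ("M", "win", {"A", "B"}),
    "1-R": ("R", "lose", {"B", "C"}),
    "2-M": ("M", "win", {"A"}),
    "3-M": ("M", "lose", {"B", "D"}),
}

# Invert the bipartite edges: precedent -> set of briefs citing it.
cited_by = {}
for brief_id, (_, _, cites) in briefs.items():
    for precedent in cites:
        cited_by.setdefault(precedent, set()).add(brief_id)


def two_hop(brief_id):
    """Briefs reachable via a shared precedent, excluding the root."""
    reached = set()
    for precedent in briefs[brief_id][2]:
        reached |= cited_by[precedent]
    return reached - {brief_id}


def graph_features(brief_id, opponent_id):
    """Compute four of the bipartite-graph features for one brief."""
    kind, _, cites = briefs[brief_id]
    same_type = [b for b in two_hop(brief_id) if briefs[b][0] == kind]
    wins = [b for b in same_type if briefs[b][1] == "win"]
    level1 = [len(cited_by[p]) for p in cites]
    return {
        # Fraction (not percentage) of same-type two-hop briefs that won.
        "prob2HopWin": len(wins) / len(same_type) if same_type else 0.0,
        "numLevel0Cites": len(cites),
        "avgLevel1Cites": sum(level1) / len(level1) if level1 else 0.0,
        "numSharedCites": len(cites.intersection(briefs[opponent_id][2])),
    }


features = graph_features("1-M", "1-R")
```
        </preformat>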
      </sec>
      <sec id="sec-3-5">
        <title>3.3.2. Implicit Citation Graph Experiments</title>
        <p>Our second set of graph experiments involves a graph, derived from the bipartite graph,
in which nodes consist of documents, and edges connect pairs of documents that cite a
precedent in common. We refer to this representation as an implicit citation graph (ICG).
A portion of this graph is shown in Figure 2.</p>
        <p>Our first observation was that the ICG had 16 connected components, 15
consisting of singletons (a single brief each), and the remaining component containing all other
briefs. The singletons were all unsuccessful, i.e., had value “lose.” Intuitively, this
indicates that citing only precedents that are cited by no one else is a very poor strategy. This
observation suggests the hypothesis that there is a correlation between vertex degree in
the ICG (i.e., the number of precedents a brief cites that are cited in some other brief) and
the likelihood of success, and indeed we observed a correlation of 0.1278 between vertex
degree and the win/lose decision represented as 0 or 1. This correlation can be visualized
in a graph of the 50-element moving average of 0/1 values as a function of vertex degree,
as shown in Figure 3.</p>
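        <p>The construction of an implicit citation graph and the singleton analysis can be sketched in Python as follows. The toy citation sets are hypothetical and the code is an illustration, not the Neo4j implementation used in the experiments. Note one assumption: degree here is the standard graph-theoretic vertex degree (the number of other documents sharing at least one precedent), which may differ from a count of shared precedents.</p>
        <preformat preformat-type="code">
```python
from itertools import combinations

# Hypothetical toy corpus: brief id -> set of cited precedents.
cites = {
    "1-M": {"A", "B"},
    "1-R": {"B", "C"},
    "2-M": {"A"},
    "2-R": {"Z"},  # cites only a precedent that nobody else cites
}

# Build the implicit citation graph: connect briefs sharing a precedent.
neighbors = {brief: set() for brief in cites}
for left, right in combinations(cites, 2):
    if cites[left].intersection(cites[right]):
        neighbors[left].add(right)
        neighbors[right].add(left)


def connected_components(adjacency):
    """Return the connected components as a list of sets of node ids."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


components = connected_components(neighbors)
singletons = [c for c in components if len(c) == 1]
degree = {brief: len(adj) for brief, adj in neighbors.items()}
```
        </preformat>
        <p>In this toy example the brief "2-R" forms a singleton component with degree zero, mirroring the situation of the isolated briefs discussed above.</p>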
        <p>This correlation is intuitive; citing idiosyncratic cases may be a sign that the lawyer
lacks familiarity with relevant case law, or that their legal argument is so far-fetched that
they must cite to remote cases. By contrast, lawyers who can draw parallels to a large
corpus of precedent, either due to their expertise or the underlying merits of the case, are
likely to fare better.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Stylistic Features</title>
      <p>
        The prose style of briefs has been shown empirically to affect judges’ assessments of
persuasiveness and credibility [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We therefore measured the predictive effect (from a
brief-oriented perspective) of a number of stylistic features that have been suggested as
having a possible effect on judges’ rulings:
      </p>
      <p>1. doc count - the total number of documents filed by the party, i.e., 0, 1, or 2,
depending on whether the party files an opposition, reply, or surreply</p>
      <p>2. citation count - the number of citations in the brief</p>
      <p>3. hedging - words associated with trying to explain away bad facts or bad
arguments, e.g., “even assuming,” “albeit”</p>
      <p>4. hyperbolic language - use of terms like “blatant,” “absurd,” “egregious,” etc.</p>
      <p>5. sentence count - the number of sentences in the brief</p>
      <p>6. legal amplifiers - use of terms like “conclusory,” “inadequate,” “irrelevant,” etc.</p>
      <p>7. repetition - terms such as “additional,” “again”</p>
      <p>8. total string cites - the number of instances of multiple consecutive citations</p>
      <p>9. mean string cite length - the average number of citations in each string cite</p>
      <p>10. jurisdiction - the district in which the case was brought</p>
      <p>11. mean sentence length in tokens</p>
      <p>12. negative emotional state - terms indicating negative affect, e.g., “upset,” “cried”</p>
      <p>13. jury request - whether a jury was requested</p>
      <p>14. nature of suit - as specified in the PACER system</p>
      <p>15. court - the court in which the case was brought</p>
      <p>16. pro se - whether the party was self-represented</p>
      <p>17. cause of action - as specified in the PACER system</p>
      <p>Collectively these stylistic features are modestly predictive of a party’s likelihood of
success: MCC = 0.389 and F1 = 0.693 are achieved with rules induced by
JRip.<sup>8</sup> These rules indicate that the best predictive performance can be obtained using just features
1, 2, 6, and 9 above.</p>
      <p>We calculated the mutual information between each of the features and the win/lose
decision. Features 10 through 17 in the list had negligible mutual information with
outcomes, meaning that we were unable to detect any meaningful effect from these features.
The information gain from features 1 through 8 varied from 0.117 (doc count) to 0.031
(mean string cite length). We conclude that features 1–8 are meaningful and should be
considered by attorneys when drafting briefs. However, it may be that these features are
most significant in boundary cases, e.g., failing to file a reply or surreply may be an
indication of lack of conscientiousness; too few citations may indicate lack of effort; lengthy
string cites and hyperbolic language may indicate an attempt to compensate for a weak
case or poor writing.</p>
      <p><sup>8</sup> https://weka.sourceforge.io/doc.dev/weka/classifiers/rules/JRip.html</p>
    </sec>
    <sec id="sec-5">
      <title>5. Text-Based Prediction</title>
      <p>
        As mentioned in the Introduction, prior research has shown that text classification
techniques sometimes perform well in predicting outcomes from judges’ statements of the
case. In contrast, recent work suggests that the techniques perform very poorly when
applied to fact statements written by laypersons [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. We hypothesized that applying these
techniques to the text of briefs might perform at an intermediate level of accuracy.
      </p>
      <p>We performed two experiments to test this hypothesis. Both experiments followed a
case-oriented paradigm in which the decision was predicted from text produced by both
the movant and the respondent.</p>
      <p>In the first experiment, all text was extracted from the briefs submitted by each party
and normalized by removal of punctuation, numbers, and stop words. The
text was converted to terms as described below, and a flag denoting the type of the party
was prepended to each term in a brief by that party, e.g., “M-” is prepended to each term from
a movant brief, and “R-” is prepended to each term from a respondent brief. This
convention permitted terms from one party to be distinguished from terms by the other. We
experimented with three term representations: (1) binary (one-hot) 1–3 gram vectors, (2)
1–3 gram frequency vectors, and (3) lower-cased unigram frequency vectors. The
macro-averaged MCC for grant/deny prediction using these representations was 0.253, 0.123, and
0.136, respectively, indicating relatively weak predictive value.</p>
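      <p>The party-tagging convention and the first two term representations can be sketched in Python as follows. The token lists and function names are hypothetical, normalization is assumed to have been done already, and the sketch is an illustration rather than the code used in the experiments.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def ngrams(tokens, n_min=1, n_max=3):
    """All contiguous n-grams of length n_min..n_max, joined by spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def party_terms(tokens, flag):
    """Prefix each n-gram with a party flag such as 'M-' or 'R-'."""
    return [flag + term for term in ngrams(tokens)]


# Hypothetical, already-normalized token lists for one case.
movant_tokens = ["plaintiff", "cannot", "establish", "pretext"]
respondent_tokens = ["genuine", "dispute", "material", "fact"]

# One case-oriented instance combines the tagged terms of both parties.
terms = (party_terms(movant_tokens, "M-")
         + party_terms(respondent_tokens, "R-"))

frequency_vector = Counter(terms)                 # representation (2)
binary_vector = {term: 1 for term in set(terms)}  # representation (1)
```
      </preformat>
      <p>Because every term carries its party flag, the same phrase used by the movant and by the respondent yields two distinct features.</p>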
      <p>The second experiment applied this procedure just to the parenthesized text that
precedes citations, e.g.,</p>
      <p>(employee cannot establish pretext by asserting unsupported blanket denial) Irvin v.
Airco Carbide, 837 F.2d 724 (6th Cir. 1987)</p>
      <p>Typically, such a parenthetical succinctly expresses the proposition for which the
precedent is being cited. One might surmise that, collectively, such parentheticals constitute
the main arguments of a brief and that a model trained on these texts would, in effect,
learn the relative effectiveness of these arguments. However, the macro-averaged MCC values for
parentheticals using the three representations listed above were 0.032, 0.040, and 0.032,
respectively, indicating predictive accuracy essentially equal to chance.</p>
      <p>Briefs contain complex arguments and detailed references to the factual record, and
a court’s decision on a motion depends on both the nuances of legal and factual
arguments set forth in briefs and on factors outside of the scope of the briefs themselves, such
as the evolution of legal doctrine. It may be unsurprising that simple document
classification techniques produce weak prediction results when applied to documents with such
complex discourse structure.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper has shown that both the pattern of citations and various stylistic properties of
briefs are associated with the likelihood of success of the underlying motion. Correlation
is not causation, so it is not clear which of these factors are independent variables that
could be controlled by a litigant to improve the odds of success and which are the common
consequence of factors that also control the outcome. Nevertheless, the results provide
some insight into the influence of lawyering and legal writing on outcomes. The
predictiveness of 2-hop neighborhoods in a bipartite graph is evidence that successful lawyers
on each side tend to cite to a common set of cases, whereas less-successful lawyers tend
to make idiosyncratic citations and to use hyperbolic language and other potential smoke
screens for weaknesses in their case, their research, or their writing.</p>
      <p>Distinguishing causal from merely correlated factors is the work of future analysis.
However, the research described in this paper illustrates how machine-learning and graph
analysis can be used to identify factors for further investigation, including distinguishing
significant from insignificant stylistic variations and detecting latent graph characteristics
that reveal hitherto unsuspected relationships between citations and decisions.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The MITRE Corporation is a not-for-profit company, chartered in the public interest.
This document is approved for Public Release; Distribution Unlimited. Case Number
20-2944. © 2020 The MITRE Corporation. All rights reserved.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Chalkidis</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aletras</surname>
            <given-names>N</given-names>
          </string-name>
          .
          <article-title>Neural Legal Judgment Prediction in English</article-title>
          .
          <source>CoRR</source>
          .
          <year>2019</year>
          ; abs/1906.02059.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Sulea</surname>
            <given-names>O</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zampieri</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vela</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Genabith</surname>
            <given-names>J</given-names>
          </string-name>
          .
          <article-title>Predicting the Law Area and Decisions of French Supreme Court Cases</article-title>
          . In: RANLP. INCOMA Ltd.;
          <year>2017</year>
          . p.
          <fpage>716</fpage>
          -
          <lpage>722</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Branting</surname>
            <given-names>LK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yeh</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Merkhofer</surname>
            <given-names>EM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Inducing Predictive Models for Decision Support in Administrative Adjudication</article-title>
          . In:
          <source>AI Approaches to the Complexity of Legal Systems - AICOL International Workshops 2015-2017, Revised Selected Papers</source>
          . vol.
          <volume>10791</volume>
          of Lecture Notes in Computer Science. Springer;
          <year>2017</year>
          . p.
          <fpage>465</fpage>
          -
          <lpage>477</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Boughorbel</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jarray</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>El-Anbari</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric</article-title>
          .
          <source>PLoS ONE</source>
          .
          <year>2017</year>
          ;
          <volume>12</volume>
          (
          <issue>6</issue>
          ):e0177678.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Carmichael</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wudel</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jushchuk</surname>
            <given-names>J</given-names>
          </string-name>
          .
          <article-title>Examining the Evolution of Legal Precedent Through Citation Network Analysis</article-title>
          .
          <source>North Carolina Law Review</source>
          .
          <year>2017</year>
          December;
          <volume>96</volume>
          (
          <issue>1</issue>
          ):
          <fpage>227</fpage>
          -
          <lpage>269</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Whalen</surname>
            <given-names>R</given-names>
          </string-name>
          .
          <article-title>Legal Networks: The Promises and Challenges of Legal Network Analysis</article-title>
          .
          <source>Michigan State Law Review</source>
          .
          <year>2016</year>
          ;
          <volume>2</volume>
          :
          <fpage>539</fpage>
          -
          <lpage>566</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Benson</surname>
            <given-names>RW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kessler</surname>
            <given-names>JB</given-names>
          </string-name>
          .
          <article-title>Legalese v. plain English: an empirical study of persuasion and credibility in appellate brief writing</article-title>
          .
          <source>Loy LA L Rev</source>
          .
          <year>1986</year>
          ;
          <volume>20</volume>
          :
          <fpage>301</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Branting</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhana</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfeifer</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aberdeen</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Judges are from Mars, Pro Se Litigants are from Venus: Predicting Decisions from Lay Texts</article-title>
          .
          In:
          <source>Legal Knowledge and Information Systems - JURIX 2020: The Thirty-Third Annual Conference</source>
          ;
          <year>2020</year>
          . To appear.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>