<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Segmentation of Rulemaking Documents for Public Notice-and-Comment Process Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anna Belova∗</string-name>
          <email>abelova@alumni.cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthias Grabmair</string-name>
          <email>mgrabmai@andrew.cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Nyberg</string-name>
          <email>ehn@cs.cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Carnegie Mellon University</institution>
          ,
          <addr-line>Pittsburgh, PA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <abstract>
        <p>We evaluate the feasibility of automated identification of comment discussion passages and comment-driven proposed rule revisions in the US Environmental Protection Agency's (EPA's) rulemaking documents. We have annotated a dataset of final rule documents to identify all spans in which EPA discusses and evaluates the merits of public comments received on its proposed rules, and present lessons learned from the annotation process. We implement several baseline supervised discourse segmentation models that combine classic linear learners with sentence representations using handcrafted features as well as Bidirectional Encoder Representations from Transformers (BERT). We observe good agreement on annotating comment discussions, and our models achieve a classification F1 of 0.73. Public comment dismissals and rule revisions are substantially harder to annotate and predict, leading to lower agreement and model performance. Our work contributes a dataset and a baseline for a novel discourse segmentation task of identifying public comment discussion and evaluation by the receiving agency.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Government agencies are created by legislatures worldwide to
regulate social, economic, and political aspects of people’s lives.
These agencies belong to the executive branch of the government,
yet they create legally enforceable regulations and rules that
implement broad legislation. In the US, public notice-and-comment
processes have become an important venue for influencing social
and economic policy. Under these processes, US agencies publish proposed rules
in the Federal Register (FR) and all interested parties are given
an opportunity to comment. Agency regulatory proposals receive
public feedback from individuals, businesses, organized groups (of
individuals or businesses), and other agencies. Comments represent
heterogeneous interests in particular regulatory outcomes. The
agency is not obliged to react to each individual received comment.
However, it has to respond to comments that raise significant issues
with the proposed rule and, if the points raised have merit, may
substantively revise the rulemaking document. The final rule
document is published in the FR and contains the discussion of
submitted comments, or points to other documents in the docket
that address concerns raised in the comments.
∗Corresponding author</p>
      <p>The online forum for the US public notice-and-comment process—
regulations.gov—was launched in January 2003, as part of the US
eRulemaking program established as a cross-agency E-Gov
initiative under Section 206 of the 2002 E-Government Act (H.R. 2458/S.
803). On this platform, all documents pertaining to the
development of a particular rule are compiled in a regulatory docket. A
typical docket contains a proposed rule document, many public
comment documents, and a final/revised rule document. 1 As such,
regulations.gov provides a testbed for study of the public
notice-and-comment discourse in the US.</p>
      <p>In this work, we focus on (1) identifying spans in the final rule
documents that contain the agency’s discussion of the public
comments it received, and (2) classifying those spans as being either
dismissals of the commenter claims or revisions of the proposed
regulations prompted by the comment. Specifically, we analyze 353 US
Environmental Protection Agency (EPA) regulations proposed in
January 2003 or later, and finalized as of March 2018. 2</p>
      <p>
        Our work contributes a dataset3 and a baseline for a novel
discourse segmentation task of identifying public comment discussion
and evaluation by the receiving agency. Automatic detection of
comment discussion passages in the rulemaking documents could
improve the efficiency of regulatory review conducted by experts at
a number of organizations, including the US Office of Information
and Regulatory Affairs, regulatory agencies, and other stakeholders
of the regulatory process. In addition, segmentation of regulatory
discourse is the first step toward bringing the agency’s narrative deliberations
into the study of bureaucratic politics and decision making (e.g.,
regulatory capture theory) by economists and political scientists [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ],
which to date has relied on structured data generated by surveys
and administrative record-keeping (e.g. permitting, inspections).
      </p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>
        In the peer-reviewed literature, discussion of e-rulemaking benefits,
challenges, and related artificial intelligence (AI) methods began
in the early 2000s [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Over a decade later, surveys by [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]
describe several e-rulemaking initiatives that involved successful
applications of AI. One line of e-rulemaking research has focused
on tasks relevant to management of the massive volume of public
comments received by agencies (e.g., [
        <xref ref-type="bibr" rid="ref58">58</xref>
        ], [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], [
        <xref ref-type="bibr" rid="ref52">52</xref>
        ]). Another line of
1Other documents, such as transcripts of public hearings, technical support documents,
detailed comment response documents, copies of pertinent scientific papers, e-mails
and other correspondence, may also be included. Finally, a docket may also contain
tabular data and software source code used to produce analytical results.
2We have chosen to focus on EPA because this agency published the most rules (∼ 20%
of all rule documents) and received the most comment submissions (∼ 10% of all
comment documents) in regulations.gov during the studied time period.
3The data and code are available at
https://github.com/mug31416/PubAdminDiscourse.git
research, conducted as part of Cornell University’s RegulationRoom
project, has focused on tools to improve the quality of public
discourse around rulemaking (e.g., [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ], [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ]). Research on the text
of rules developed by agencies has mostly focused on the search
for similar rules in the FR [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], rather than segmentation of the
comment-related discourse in the rule documents.
      </p>
      <p>
        Prior to the launch of regulations.gov, work on e-rulemaking
used several rule-specific comment collections that were either
shared by the agencies—EPA, Fish and Wildlife Service (FWS)—or
gathered as part of the RegulationRoom experiments in
collaboration with the US Department of Transportation (DOT). The tasks
have included near duplicate detection to address mass comment
campaigns [
        <xref ref-type="bibr" rid="ref58">58</xref>
        ], comment topic modeling [
        <xref ref-type="bibr" rid="ref30 ref5 ref51 ref59 ref8">5, 8, 30, 51, 59</xref>
        ],
stakeholder attitude identification [
        <xref ref-type="bibr" rid="ref1 ref31">1, 31</xref>
        ], and presence of substantive
points in public comments [
        <xref ref-type="bibr" rid="ref2 ref44 ref45 ref57">2, 44, 45, 57</xref>
        ]. The RegulationRoom
project has generated a number of papers on argument mining
and conflict detection within comments [
        <xref ref-type="bibr" rid="ref29 ref34 ref43">29, 34, 43</xref>
        ]. These research
efforts have focused on examining only a few regulatory
proceedings at a time, whereas we evaluate a significantly larger dataset
containing hundreds of rule documents.
      </p>
      <p>
        More recent work on e-regulation has analyzed public comment
data collected by regulations.gov [
        <xref ref-type="bibr" rid="ref13 ref14 ref35 ref37 ref50 ref52">13, 14, 35, 37, 50, 52</xref>
        ],
rule-specific data from the Canadian government [
        <xref ref-type="bibr" rid="ref53">53</xref>
        ], and data from the
White House e-petition platform [
        <xref ref-type="bibr" rid="ref15 ref19 ref20 ref21">15, 19–21</xref>
        ]. The tasks addressed
in this body of work are topic modeling [
        <xref ref-type="bibr" rid="ref15 ref20 ref21 ref35 ref37 ref52 ref53">15, 20, 21, 35, 37, 52, 53</xref>
        ],
sentiment analysis [
        <xref ref-type="bibr" rid="ref13 ref14 ref37 ref50">13, 14, 37, 50</xref>
        ], named entity recognition [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ],
and social network analysis [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        Segmentation of text into discourse units [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] is a core natural
language task. Many downstream tasks, such as information
extraction [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], sentiment analysis [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], information retrieval [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], and
summarization [
        <xref ref-type="bibr" rid="ref36 ref4">4, 36</xref>
        ], can benefit from discourse segmentation.
Because lexical and syntactic text properties form important
discourse clues [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], many segmentation methods rely on hand-crafted
features to capture them [
        <xref ref-type="bibr" rid="ref17 ref26">17, 26</xref>
        ]. Classic learning frameworks that
have been used for discourse segmentation are linear Support
Vector Machines (SVM) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and linear-chain Conditional Random
Fields (CRF) [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ].
      </p>
      <p>
        One of the key challenges in discourse segmentation
development is the dearth of annotated data, which, until recently,
prevented the use of neural architectures. Effective neural discourse
segmentation methods [
        <xref ref-type="bibr" rid="ref22 ref56">22, 56</xref>
        ] have relied on word representations
obtained from an external neural model trained to perform a
related task using a large corpus [
        <xref ref-type="bibr" rid="ref39 ref49">39, 49</xref>
        ]. The state-of-the-art neural
discourse segmentation framework [
        <xref ref-type="bibr" rid="ref18 ref56">18, 56</xref>
        ] has employed a
Bidirectional Long Short-Term Memory-CRF architecture (BiLSTM-CRF)
[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] with an attention mechanism [
        <xref ref-type="bibr" rid="ref55">55</xref>
        ].
      </p>
      <p>
        For our baseline model development, we have combined several
classic learning methods with hand-crafted as well as neural
sentence representations from Bidirectional Encoder Representations
from Transformers (BERT) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which were trained on English
Wikipedia (2,500 million words) and BooksCorpus (800 million
words) [
        <xref ref-type="bibr" rid="ref60">60</xref>
        ] using masked language and next sentence prediction
objectives. BERT representations have been shown to perform
well on a wide range of natural language processing tasks. We also
explore whether fine-tuning of BERT on the unlabeled documents
in our corpus improves performance.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 DATA</title>
    </sec>
    <sec id="sec-4">
      <title>3.1 Rule-Making Documents</title>
      <p>We work with the EPA’s final rule documents that are part of the
FR. Along with a summary, each of our documents can contain one
or more of the following sections: regulatory background, scope of
the regulation, rationale for action, technical material describing
the regulatory requirements, responses to public comments on the
proposed regulation, statutory and executive order review, and
legal references. We are interested in automated identification of all
passages where the agency discusses public comments, which could
occur throughout the document and are not necessarily confined
to the comment response section.</p>
      <p>We note that the structure of a final rule document can vary
significantly depending on whether it has been produced by the
EPA headquarters or a regional office, as well as on the
specific EPA office (e.g., Office of Water, Office of Air and Radiation).
For example, rule documents produced by the headquarters offices
are usually major federal regulations that tend to be long and receive
significant public feedback. On the other hand, rule documents
produced by regional offices tend to be shorter. 4
      <p>It should be noted that our dataset only contains final rule
documents as published in the FR. It does not include submitted comment
documents, technical support documents, or detailed, dedicated
comment response documents that are part of the docket but
external to the register.
3.1.1 Task 1: Detecting Comment Discussions. In the first task, we
want to identify the spans in the document where the EPA discusses
submitted public comments. Examples of a comment discussion
include:
• Descriptions of comments received by the agency. For
example, “EPA received comments suggesting that the definition
of clean alternative fuel conversion should be limited to a
group of fuels with proven emission benefits.”;
• Descriptions of the agency’s responses to the comments it
receives. For example, “EPA believes however that the public
interest is better served by a broader definition that allows for
future introduction of innovative and as-yet unknown fuel
conversion systems. EPA is therefore finalizing the proposed
definition of clean alternative fuel conversion...”.</p>
      <p>In contrast, we are not interested in:
• Summarized feedback from petitions (as opposed to public
comments) to the agency;
• Descriptions of the public comments on another rule;
• Statements such as “we received no comments”;
• Passages discussing revisions of a regulatory standard rather
than revisions of the proposed rule;
• Referrals to another document in the docket with detailed
responses to comments.
3.1.2 Task 2: Classification of Comment Merit. In the second task,
we want to classify each comment discussion span as to whether
the discussed comment prompted a change in the final rule from
the proposed rule. As such, we are considering three categories:
4With the possible exception of the regional air quality rules that still tend to attract
considerable public attention
passages in which the agency indicates a revision of the rule based
on a public comment, passages in which the agency dismisses
a comment, and neutral comment discussion passages (i.e., the
passages in which the agency neither dismisses the comment nor
indicates a revision).</p>
      <p>Examples of formulations reflecting comment-based regulatory
change are rule revisions and rule withdrawals:
• “To address concerns about space limitations, EPA will allow
the label information to be logically split between two labels
that are both placed as close as possible to the original Vehicle
Emission Control Information (VECI) or engine label.”
• “EPA agrees and is including use of this procedure in the OBD
demonstration requirement for intermediate age vehicles.”
• “The EPA has reviewed the new data submitted by the
commenter and used these data to determine the revised MACT
floor for continuous process vents at existing sources.”
• “EPA received one adverse comment from a single
Commenter on the aforementioned rule. As a result of the
comment received, EPA is withdrawing the direct final rule
approving the aforementioned changes to the Alabama SIPs.”
Examples of comment dismissals without a subsequent
regulatory change are:
• “We disagree that our action to approve California’s mobile
source regulations that have been waived or authorized by
the EPA under CAA section 209 is inconsistent with the
Ninth Circuit’s decision...”
• “EPA is finalizing the conversion manufacturer definition as
proposed.”
• “While we agree with the commenter that pressure release
from a PRD constitutes a violation, we will address this in a
separate rulemaking...”
• “In the final rule we will clarify our position...”
• “EPA appreciates support from the commenters for this
initiative and agrees that the rule makes it possible for EPA to
process the TRI data more quickly.”
• “EPA believes that no further response to the comment is
necessary...”</p>
      <p>We observe that this task requires considerably more complex
inference, potentially spanning multiple sections of the document.
As seen in the examples above, comment dismissals range from
very obvious to rather subtle. In turn, determinations of whether
a rule was materially revised based on the public comments may
also require a clear understanding of what was proposed in the first
place.</p>
      <p>An extreme example of this can be seen in the following
comment dismissal sentence:</p>
      <p>“Certain aspects of good engineering judgment described in the
exhaust control system, evaporative control system, and fuel delivery
control system sections may be approached differently than described
above, but EPA expects that test data demonstrating compliance is
required rather than optional in such cases.”</p>
      <p>The sentence responds to technical objections to a regulation
by conceding that alternatives are valid (“may be approached
differently”) but goes on to state the substantive decision in domain
terminology (“compliance is required rather than optional”,
suggesting that the comment had advocated for the “optional” alternative).
Without context, it is unclear whether this sentence has anything
to do with comments at all, let alone whether required vs. optional
compliance results in it agreeing with, or dismissing, the comment’s
arguments.
</p>
    </sec>
    <sec id="sec-5">
      <title>3.2 Acquisition and Sampling</title>
      <p>We have created our corpus from regulations.gov data by
selecting EPA regulatory dockets for rules proposed in January 2003
or later and finalized as of March 2018. Our selection has been
constrained to dockets containing at least one proposed rule
document, at least one final rule document, and at least one comment
document. Our corpus contains 1,566 EPA dockets (meta-data 8.8
MB), 2,645 final rule documents (HTML, 376 MB), 2,531 proposed
rule documents (HTML, 400 MB), and 282,655 comment documents
(85% PDF, 36 GB; 15% plain text, 836 MB).</p>
      <p>For the purposes of exhaustive rule document annotation, we
have used stratified random sampling at the docket level to select
two development docket sets (dev1 and dev2) and one test docket
set. The sampling procedure has ensured that the docket sets are a
representative mix of EPA program offices and regions. 5 As such,
we have obtained 75 dev1-set dockets (116 documents), 76 dev2-set
dockets (136 documents), and 73 test-set dockets (99 documents).</p>
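The docket-level stratified sampling can be sketched as follows. The strata, split names, and proportions below are illustrative stand-ins, not the exact procedure used for the paper:

```python
import random
from collections import defaultdict

def stratified_split(dockets, strata, fractions, seed=0):
    """Split docket ids so each stratum contributes to every split.

    dockets: list of docket ids; strata: id -> stratum label (e.g., office/region);
    fractions: split name -> share of each stratum (assumed to sum to 1).
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for d in dockets:
        groups[strata[d]].append(d)
    splits = {name: [] for name in fractions}
    for members in groups.values():
        rng.shuffle(members)
        start = 0
        for name, frac in fractions.items():
            take = round(len(members) * frac)
            splits[name].extend(members[start:start + take])
            start += take
    return splits

# toy corpus: 100 hypothetical dockets in two invented strata
dockets = [f"EPA-{i:03d}" for i in range(100)]
strata = {d: ("OAR" if i >= 40 else "OW") for i, d in enumerate(dockets)}
splits = stratified_split(dockets, strata, {"dev1": 0.5, "dev2": 0.5})
```

Sampling within each stratum, rather than over the pooled docket list, is what guarantees that every split contains a comparable mix of offices and regions.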
      <p>In our qualitative examination of the regulatory documents, we
have found that the section headers of the rule documents are
often informative about whether a section contains a discussion of
public comments. To make use of this additional information, we
have applied the same random sampling procedure to the remaining
dockets to obtain 211 training dockets (817 training documents) and
103 validation dockets (197 validation documents) for the section
header annotation.
</p>
    </sec>
    <sec id="sec-6">
      <title>3.3 Preprocessing</title>
      <p>
        The rule documents were processed in two steps. First, we
applied a rule-based parsing procedure to delete
tables, split the text into sections, and retrieve the titles of
the first- and second-level super-sections. This procedure exploits
the regular structure of the documents to create heuristics applicable
to roughly 90% of documents.6 When exceptions to the standard
structure were detected, we manually fixed the irregularities to enable
automatic parsing. Second, the section text was split into
sentences, tokenized, and lemmatized using SpaCy [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]7.
      </p>
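The header-based splitting heuristic (footnote 6) can be sketched as below. The regular expressions and grouping logic are illustrative only; a real parser must also handle the irregular cases noted above:

```python
import re

# Illustrative header patterns: "I. Title" (first level, Roman numerals) and
# "A. Title" (second level, Latin letters). "C." is ambiguous between the two;
# the real procedure resolves such cases from document context. Letters past H
# are omitted here for brevity.
FIRST_LEVEL = re.compile(r"^([IVXLC]+)\.\s+(.+)$")
SECOND_LEVEL = re.compile(r"^([A-H])\.\s+(.+)$")

def split_sections(lines):
    """Group document lines under their most recent section header."""
    sections, current = [], {"header": None, "lines": []}
    for line in lines:
        m = FIRST_LEVEL.match(line) or SECOND_LEVEL.match(line)
        if m:
            if current["header"] is not None or current["lines"]:
                sections.append(current)
            current = {"header": m.group(2), "lines": []}
        else:
            current["lines"].append(line)
    sections.append(current)
    return sections

parts = split_sections([
    "I. Background",
    "This rule implements the statutory requirements.",
    "II. Response to Comments",
    "A. Definition of Clean Alternative Fuel Conversion",
    "EPA received comments suggesting a narrower definition.",
])
```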
    </sec>
    <sec id="sec-7">
      <title>3.4 Annotation</title>
      <p>3.4.1 Rule Documents. We hired ten students from Carnegie
Mellon University and the University of Pittsburgh to perform the
annotation tasks during the period of February 2019–April 2019.
All annotators were at least second-year students. Five
of the annotators were masters students in fields including computer
science, public health, product management, and international
relations. The other five were undergraduate students in civil engineering,
creative writing, business, and human-computer interaction.
5For example, Office of Water/Headquarters, Office of Air and Radiation/Region 1 –
Boston.
6For example, the first- and second-level sections are numbered consecutively with
Roman numerals and Latin letters, respectively.
7Version 2.0.18 (model en_core_web_sm)</p>
      <p>The annotators were trained to perform the two tasks described
in Section 3.1.1 and Section 3.1.2. For the first task, each annotator
received an hour-long in-person training as well as individualized
feedback on a set of four training documents. For the second task,
the guidelines were delivered via a video. Each annotator received
50 documents on average, including reliability annotations. The
documents were allocated such that each annotator worked on a
balanced mix of documents from different EPA offices, regions, and
dev1/dev2/test set dockets. The annotations were performed using
Gloss, an online tool developed by a collaborating group at the
University of Pittsburgh.</p>
      <p>Finally, we note that some annotators did not complete all
assignments for the segmentation task, leading to some redistribution
of work. The comment response classification task was completed
by eight of the initial ten annotators.
3.4.2 Section Headers. Annotation of the section headers was
performed by a sole expert annotator (the first author). To this end,
all unique section titles were extracted along with three samples
of the first paragraph following the section title. These examples
were used to judge whether a section contains comment discussion:
if all three sample paragraphs include comment discussion, the
section title is flagged as comment-discussion-indicative. 8</p>
    </sec>
    <sec id="sec-8">
      <title>4 METHODS</title>
      <p>
        To generate baseline results, we use classic linear SVM9 and
linear-chain CRF10 learners to segment the rule documents into
spans that contain public comment discussion and merit
evaluation by the agency.11 The benefit of the CRF over the SVM is
that, when predicting a sentence label, it takes into account the
label of the prior and subsequent sentence in addition to the focal
sentence’s feature vector. In addition, to understand the impact of
incorporating feature interactions, we conduct experiments with
a Multi-Layer Perceptron (MLP) [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].12
      </p>
      <p>
        We estimate three binary sentence-level models predicting whether
a given sentence contains: (i) a public comment discussion, (ii) a
dismissal of a public comment by the agency, and (iii) an agency
decision to revise the proposed rule based on the public comments.
For the CRF modeling, a training instance is a sequence of
sentences within the rule document section boundaries. To address the
label sparsity for the comment dismissal/revision classification, we
explore the utility of training models only on data that is known to
contain comment discussion (i.e. on the non-ignorable sentences)
and then composing a two-tiered model to first detect comment
discussions and then classify their polarity. The
hyperparameters have been tuned by fitting the models to the dev1-set and
evaluating results on the dev2-set.
8For example, there were several first level section titles “What comments did EPA
receive?”.
9We use scikit-learn version 0.20.2 SVC implementation [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ] with an error term penalty
parameter of 1, and 1,500 as the maximum number of iterations.
10We use PyStruct 0.3.2 implementation [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ] of margin re-scaled structural SVM
using the 1-slack formulation and cutting plane method [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. We use a regularization
parameter of 0.1 and 1,500 as the maximum number of iterations.
11We have been unable to fit kernelized polynomial and RBF SVMs to our data because
these methods do not scale well to the size of our dataset.
12We use a scikit-learn version 0.20.2 MLP implementation [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ] with one hidden layer
of 100 units optimized for at most 100 epochs at the default settings.
      </p>
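The two-tiered composition can be sketched as follows. The keyword classifiers are trivial stand-ins for the trained SVM/CRF tiers, and the cue lists are invented for illustration:

```python
# Tier one flags comment-discussion sentences; tier two, which is trained only
# on flagged (non-ignorable) sentences, decides the polarity of the discussion.
def tier1_is_discussion(sentence):
    cues = ("commenter", "comments", "we received", "epa agrees", "we disagree")
    return any(cue in sentence.lower() for cue in cues)

def tier2_polarity(sentence):
    s = sentence.lower()
    if "disagree" in s or "no further response" in s:
        return "dismissal"
    if "is revising" in s or "as a result of the comment" in s:
        return "revision"
    return "neutral"

def two_tier_predict(sentences):
    labels = []
    for s in sentences:
        if not tier1_is_discussion(s):
            labels.append("ignorable")  # tier two never sees this sentence
        else:
            labels.append(tier2_polarity(s))
    return labels

labels = two_tier_predict([
    "This action is effective on July 1.",
    "We disagree that the proposed definition is too broad.",
    "EPA received comments suggesting a narrower definition.",
])
```

The design point is that tier two is both trained and applied only behind tier one's filter, which concentrates the sparse dismissal/revision labels.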
    </sec>
    <sec id="sec-9">
      <title>4.1 Handcrafted Features</title>
      <p>
        For sentence representation we concatenate three categories of
handcrafted features. First, we featurize the text of the sentence
for which the prediction needs to be made, as well as the text of the
preceding sentence, and concatenate the feature vectors. We use
original tokens (including stop words, but excluding punctuation),
modified tokens with attached POS tags, bigrams of modified tokens,
and bigrams of POS tags.13 We apply feature hashing [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ] to reduce
dimensionality. This results in a feature set of size 2,001.
      </p>
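A hedged sketch of the hashed featurizer: token unigrams and bigrams from the focal and preceding sentence, marked with position prefixes, are hashed into a single 2,001-dimensional count vector. The hash function (CRC32) and feature templates are illustrative, and the POS-tag features are omitted:

```python
import zlib

DIM = 2001  # feature-set size reported above

def sentence_vector(sentence, previous, dim=DIM):
    """Hash unigram/bigram features of a sentence pair into one count vector."""
    vec = [0] * dim
    for prefix, text in (("cur", sentence), ("prev", previous)):
        tokens = text.lower().split()
        feats = [f"{prefix}:{t}" for t in tokens]
        feats += [f"{prefix}:{a}_{b}" for a, b in zip(tokens, tokens[1:])]
        for f in feats:
            vec[zlib.crc32(f.encode()) % dim] += 1
    return vec

v = sentence_vector(
    "EPA received comments suggesting a narrower definition",
    "We proposed a broad definition",
)
```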
      <p>Second, we featurize the text of the section header containing
the sentence in question. Specifically, we apply the same feature
generation process used for sentences to the text of the sentence-bearing
section header and the header that precedes it. The dimension of
this feature set is 101.</p>
      <p>Third, we also add a binary flag equal to one if a header of the
section in which the sentence occurs has been predicted to
contain a comment discussion. We generate these predictions through
instance-based learning on the unique section headers from the
training set of dockets set aside for this purpose (see Section 3.4.2).
Based on the unique headers from the associated validation docket
set, this signal mining procedure has a recall of 0.54 and a precision
of 0.88.</p>
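A minimal sketch of this instance-based header signal, assuming the simplest memory-based form: unique annotated headers from the training dockets form a lookup table, and a section's flag is the stored label of its normalized header, defaulting to negative for unseen headers. The example headers are invented:

```python
def normalize(header):
    # collapse case and whitespace so trivially different headers match
    return " ".join(header.lower().split())

def build_header_table(annotated_headers):
    """annotated_headers: dict of raw header -> bool (comment-discussion flag)."""
    return {normalize(h): flag for h, flag in annotated_headers.items()}

def header_flag(header, table):
    return table.get(normalize(header), False)

table = build_header_table({
    "What Comments Did EPA Receive?": True,
    "Statutory and Executive Order Reviews": False,
})
```

An exact-match lookup of this kind naturally yields high precision but limited recall, consistent with the 0.88/0.54 figures above: unseen header wordings simply fall through to the negative default.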
    </sec>
    <sec id="sec-10">
      <title>4.2 Neural Features</title>
      <p>
        We employ BERT [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to create embedded vector representations
for sentences and section headers. BERT is a state-of-the-art neural
network language model trained on a large collection of English text
in a quasi-unsupervised fashion by having it learn to predict masked
words in a sentence and to classify whether one sentence follows
another. By doing so, BERT learns to maintain a neural
representation of language context. These vector representations
of English text can then be used for various natural language
processing tasks and have been shown to yield significantly better
performance than context-independent word embeddings.
      </p>
      <p>
        As in the case of the hand-crafted features, we concatenate both the
vectors of the sentence/header in question as well as the context
represented by the preceding sentence/header to form a final
feature vector. We explore performance of the available pretrained
BERT model as well as a BERT model that has been fine-tuned on
approximately 6,000 rule documents from our corpus that have
not been included in the annotated document sets. To this end, we
rely on a PyTorch[
        <xref ref-type="bibr" rid="ref47">47</xref>
        ] implementation of BERT.14 The size of the
generated sentence/header embedding is 768. The fine-tuned model
was trained for seven epochs.
      </p>
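The neural feature construction reduces to concatenating two fixed-size sentence vectors. In the sketch below, embed() is a deterministic placeholder standing in for the (possibly fine-tuned) BERT encoder, and 768 is the hidden size of bert-base-uncased:

```python
EMB_DIM = 768  # hidden size of bert-base-uncased

def embed(sentence):
    # placeholder: a deterministic pseudo-embedding, NOT a real BERT encoding
    seed = sum(ord(c) for c in sentence)
    return [((seed * (i + 1)) % 97) / 97.0 for i in range(EMB_DIM)]

def neural_features(sentence, previous):
    # concatenate the context (preceding) and focal sentence vectors
    return embed(previous) + embed(sentence)

feats = neural_features(
    "EPA agrees with the commenters.",
    "Several commenters objected to the proposed floor.",
)
```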
    </sec>
    <sec id="sec-11">
      <title>5 EVALUATION</title>
      <p>
        We evaluate the quality of the rule document annotation using
Cohen’s kappa coefficient [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], as well as qualitatively. Performance of
our baseline text segmentation models is evaluated on the test set at
the sentence level using area under the ROC curve (AUC), F1-score,
precision, and recall. We found a sentence to be the most
meaningful operational definition of a passage, because comment-discussing
13We do not use a TFIDF feature representation because it has not performed as well
as a simple count-based featurizer in our preliminary experiments.
14PyTorch Pretrained BERT: The Big and Extending Repository of pretrained
Transformers from https://github.com/huggingface/pytorch-pretrained-BERT. We used the
bert-base-uncased version of the model.
sentences are often interspersed with ignorable sentences of a
section or a paragraph. For each model, the classification cutoff has
been determined using a threshold that maximizes the F1-score on
the training data.
Table 1 summarizes the key properties of the annotated dataset.
For this summary, we have converted span-level annotations into
sentence-level annotations. To this end, we have assigned a label to
a sentence if an annotator has marked at least 80% of the tokens that make up
that sentence. For documents that have been annotated by multiple
individuals, we assign a label to a sentence if at least one individual
has labeled the sentence. This approach has been motivated by a
qualitative examination of annotations, which revealed low recall
issues for some annotators. Depending on the dataset, non-ignorable
content (i.e. text labeled as discussing comments) comprises 21%
to 33% of all sentences, comment dismissals comprise 4% to 5%
of all sentences, and comment-based revisions comprise 2% to 3%
of all sentences. Approximately half of all labeled sentences have
been annotated by two individuals. Due to annotator attrition,
reliability annotations for a more refined labeling task (i.e.,
identification of comment dismissals and comment-based rule revisions)
are available for 73% to 79% of all double-annotated sentences.
      </p>
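The span-to-sentence label conversion described above can be sketched as follows, using token-offset spans, the 80% coverage threshold, and a union over annotators. Annotation spans are assumed non-overlapping:

```python
THRESHOLD = 0.8  # minimum fraction of a sentence's tokens an annotator must mark

def sentence_labels(sentence_spans, annotation_spans):
    """sentence_spans / annotation_spans: lists of (start, end) token offsets."""
    labels = []
    for s_start, s_end in sentence_spans:
        n_tokens = s_end - s_start
        covered = 0
        for a_start, a_end in annotation_spans:
            covered += max(0, min(s_end, a_end) - max(s_start, a_start))
        labels.append(covered / n_tokens >= THRESHOLD)
    return labels

def union_labels(per_annotator):
    # a sentence is positive if at least one annotator labeled it
    return [any(votes) for votes in zip(*per_annotator)]

sentences = [(0, 10), (10, 20), (20, 30)]
ann1 = sentence_labels(sentences, [(0, 9)])    # covers 90% of sentence one
ann2 = sentence_labels(sentences, [(12, 30)])  # covers sentences two and three
labels = union_labels([ann1, ann2])
```

Taking the union across annotators directly implements the recall-motivated rule above: a single annotator's mark is sufficient to label a sentence.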
      <p>Table 1 also reports the inter-annotator agreement statistics,
while Table 2 summarizes agreement with the expert annotator
on four final rule documents used as part of the annotator
training. (Expert annotations have been produced by the first author,
who has 10 years of professional experience in supporting EPA’s
regulatory proposal development.) For the non-ignorable content,
inter-annotator agreement scores range from 0.38 to 0.67
(depending on the dataset), whereas agreement with the expert is 0.74 on
average (range: 0.35–0.95). We note that agreement on this task
appears to improve from the dev1 set to the test set, which may reflect
that the annotators learned to do the task better over time, given the
order in which the documents have been assigned. Inter-annotator
agreement for the comment dismissal labeling task ranges from
0.18 to 0.32, while agreement on the comment-based rule revisions
is very low, ranging between 0.086 and 0.19. Agreement with the
expert on these tasks is also low: 0.33 (range: 0–0.54) for the
comment dismissals and 0.38 (range: 0–0.75) for the comment-based
rule revisions.</p>
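<p>Assuming the agreement statistic is Cohen's kappa computed on aligned sentence-level labels (an assumption on our part; the excerpt above does not restate the metric, though Cohen (1960) is among the cited works), it can be computed as:

```python
# Sketch under the assumption that agreement is Cohen's kappa on sentence-level
# binary labels from two annotators.
def cohens_kappa(a, b):
    """a, b: equal-length lists of 0/1 labels from two annotators."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n      # observed agreement
    p1a, p1b = sum(a) / n, sum(b) / n               # per-annotator positive rates
    pe = p1a * p1b + (1 - p1a) * (1 - p1b)          # expected chance agreement
    return (po - pe) / (1 - pe)
```
</p>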
      <p>We have reviewed the annotator errors vis-a-vis the expert
annotator. False negatives tend to occur most commonly when:
• The annotator captures only the initial part of the
comment discussion that contains typical lexical cues (e.g., “EPA
received comments suggesting...”, “Commenters noted...”,
“EPA agrees with the commenters...”) but fails to include
the entire—usually technical—comment discussion that can
span multiple subsequent paragraphs;
• A passage with comment discussion is “buried” in the middle
of a longer paragraph, as often happens when comments are
discussed in the background section;
• For the more dificult annotation task of identifying
commentbased rule revisions and comment dismissals, we have noted
that false negatives tend to occur when the evaluation of the
</p>
      <p>
• EPA regulations are typically incremental, in that they often
modify older, preexisting rules. Therefore, the final
and proposed rule documents discuss changes/revisions of
the prior regulatory standard. This has been a significant
source of confusion for the annotators, who found it
difficult to separate comment-based revisions of the proposed
regulation from the revisions of the regulatory standard on
the regulatory agenda, leading to false positives.</p>
      <p>
• Another challenge for the annotators has been deciding
when the discussion switches from comment-related to
general topics, also leading to false positives.
• Specifically for the comment-based rule revisions, some
annotators found it challenging to distinguish
revisions of the proposed rule that were based on comments
from revisions that occurred for other reasons. For example,
the EPA may implement revisions based on new evidence
that emerges after the proposed rule is submitted for public
review.</p>
    </sec>
    <sec id="sec-12">
      <title>Classification Results</title>
<p>Table 3 and Table 4 show the test set evaluation performance results
for each binary classification task, divided by learning framework
and feature set. The models have produced better than random
predictions, with the largest AUC of 0.937 noted for the non-ignorable
content prediction and the smallest AUC of 0.677 noted for the
comment-based rule change prediction. These patterns largely reflect the
differences in the quality of annotations obtained for our prediction
tasks, with the segmentation task being significantly easier than
the comment response classification task.</p>
      <p>For the non-ignorable content prediction, the models produce
recall in the range of 0.636–0.708 and precision in the range of
0.688–0.798. Unsurprisingly, for the more complex annotation tasks
with low annotator agreement, classification quality is poor. For the
comment dismissal prediction, recall is 0.085–0.537 and precision is
0.091–0.249, whereas for the comment-based rule change prediction,
recall is 0.065–0.490 and precision is 0.056–0.189.
6.2.1 Linear Model Analysis. CRF model results do not appear to
be materially different from those generated by the SVM model
on the same handcrafted feature set, even though they take into
account the labels of neighboring sentences. We note, however,
that the CRF models have produced consistently higher precision
scores compared to the SVM models estimated on the same feature
set. Because we experienced some convergence problems with CRF
models, we have fit them to only one feature set.</p>
<p>Table 3 also shows that neural BERT features on average tend to
generate higher AUC, precision, and recall. We note that the
two-tiered models perform better for the comment dismissal prediction,
but not for the comment-based revision prediction. In the latter case,
the gains in precision are minor and do not offset the significant
losses in recall.</p>
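<p>The two-tiered setup referenced above can be sketched as follows (our reading of the table notes; the callables are hypothetical stand-ins for the trained classifiers):

```python
# Sketch of a two-tiered classifier: stage 1 flags non-ignorable
# (comment-discussing) sentences; stage 2 runs only on those flagged,
# predicting e.g. comment dismissal; everything else defaults to negative.
def two_tier_predict(sentences, stage1, stage2):
    """stage1, stage2: callables mapping a sentence to a 0/1 prediction."""
    return [stage2(s) if stage1(s) == 1 else 0 for s in sentences]
```

Errors of the first stage propagate: a sentence the first tier misses can never be recovered by the second.</p>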
      <p>We also observe that neural features based on the fine-tuned
BERT can perform better than those using out-of-the-box BERT
(e.g. best AUC and precision on non-ignorable content prediction).
Interestingly, combining neural and handcrafted feature sets
generally does not produce synergistic performance gains, which could
be due to the substantial increase in the overall feature dimension,
or the lack of feature interaction capacity in linear models.
6.2.2 Multi-Layer Perceptron Results. In a second set of
experiments we assessed whether classification performance increases
with models that allow for feature interactions. To this end, we
trained a series of Multi-Layer-Perceptron models (i.e. a neural
network with one hidden layer of size 100 and a two-class softmaxed
output) on our tasks and feature sets. Table 4 contains the results we
obtained for the MLP with an identity transformation (MLP-Id)
before the final softmax.15
Table 3 (excerpt), Comment-based Regulatory Change; columns: AUC, F1, Prec., Rec.
CRF+HCF n.a. 0.088 0.091 0.085
SVM+HCF 0.677 0.092 0.056 0.273
SVM+BERT (as is) 0.802 0.126 0.074 0.420
SVM+HCF+BERT (as is) 0.736 0.099 0.058 0.335
SVM+BERT (tuned) 0.815 0.125 0.077 0.337
SVM+HCF+BERT (tuned) 0.754 0.091 0.051 0.446
2-SVM+HCF 0.724 0.081 0.091 0.073
2-SVM+BERT (as is) 0.796 0.104 0.112 0.097
2-SVM+HCF+BERT (as is) 0.745 0.078 0.075 0.081
2-SVM+BERT (tuned) 0.808 0.086 0.128 0.065
2-SVM+HCF+BERT (tuned) 0.744 0.108 0.088 0.138
Notes: Random – predictions are draws from a Bernoulli distribution with probability
set to the target class prior. Semi-Random – predictions are generated by first applying
the best-performing non-ignorable content classifier and then drawing from a
Bernoulli distribution with probability set to the target class conditional prior. 2-SVM
– a two-tiered SVM model. HCF – handcrafted features. AUC – area under the ROC
curve. The CRF model does not produce confidence scores, hence AUC estimation was not
possible. The classification cutoff was chosen to maximize the F1 score for each model.</p>
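<p>For concreteness, the MLP-Id forward pass can be sketched as follows (an illustrative NumPy sketch; the paper's models were built in PyTorch, and the 768-dimensional input is our assumption matching bert-base embeddings):

```python
import numpy as np

# MLP-Id: one hidden layer of 100 units with an identity activation (i.e.,
# no nonlinearity), followed by a two-class softmax output.
def mlp_id_forward(x, w1, b1, w2, b2):
    h = x @ w1 + b1                                   # identity activation
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)          # row-wise softmax

rng = np.random.default_rng(0)
w1, b1 = 0.01 * rng.normal(size=(768, 100)), np.zeros(100)
w2, b2 = 0.01 * rng.normal(size=(100, 2)), np.zeros(2)
probs = mlp_id_forward(rng.normal(size=(4, 768)), w1, b1, w2, b2)  # 4 sentences
```

With an identity activation, the two linear layers compose into a single linear map of the features; the softmax then yields per-class probabilities.</p>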
      <p>We observe that nonlinear models using BERT features can
achieve somewhat higher AUC and F1 scores than the linear models
shown in Table 3. We also see that adding handcrafted features to
a model can occasionally yield some performance synergy. From
this we infer that nonlinear models could potentially produce
better results on our dataset, and hence we plan to experiment with
recurrent or dilated convolutional models for sequence tagging to
leverage the document context in future work.
15 We have also obtained results for the MLP with a Rectified Linear Unit (ReLU)
activation function before the final softmax (MLP-ReLU). The practical difference is that
a ReLU activation truncates all incoming negative activation values to 0 and leaves
positive ones unchanged. We do not report these results because they were largely
inferior to those obtained for the MLP-Id variant.
Table 4 (excerpt), Comment-based Regulatory Change; columns: AUC, F1, Prec., Rec.
2-MLP-Id+HCF 0.757 0.078 0.093 0.068
2-MLP-Id+BERT (as is) 0.723 0.092 0.077 0.114
2-MLP-Id+HCF+BERT (as is) 0.770 0.092 0.112 0.078
2-MLP-Id+BERT (tuned) 0.766 0.123 0.189 0.091
2-MLP-Id+HCF+BERT (tuned) 0.789 0.130 0.113 0.154
Notes: Random – predictions are draws from a Bernoulli distribution with probability
set to the target class prior. Semi-Random – predictions are generated by first
applying the best-performing non-ignorable content classifier and then drawing from
a Bernoulli distribution with probability set to the target class conditional prior.
2-MLP – a two-tiered MLP model. HCF – handcrafted features. AUC – area under the
ROC curve. MLP-Id – a multi-layer perceptron with one hidden layer with 100 units
and an identity non-linearity followed by a softmax; this model is equivalent to a
generalized linear regression model with interaction terms. The classification cutoff
was chosen to maximize the F1 score for each model.
6.2.3 Error Analysis. For our best-performing models we have
generated and examined five random examples for each type of
error. Our findings are as follows:</p>
      <p>False Positives: The models tend to produce false positives when
sentences contain certain trigger words (such as “response”,
“revision”, “finalizing the rule as proposed”) yet the overall context
of the passage is not related to the discussion of public comments.
For example, these trigger words have been observed in passages
discussing petitions and revisions of the regulatory standard that
are not based on comments, similar to mistakes made by human
annotators. There is also a fair share of label noise: As noted earlier,
the annotators have been challenged by longer comment
discussions and occasionally failed to capture the entire relevant span.
We also conjecture that in these cases the models have been guided
by section-header-related signals.</p>
      <p>False Negatives: The false negatives tend to occur in sections that
do not commonly contain comment discussion (e.g., “Background”,
“Executive Order Review”). Sentences that lack the boilerplate
language (e.g., “response”, “EPA”, “comment”) also tend to be missed
more often. As with the false positives, we observed some amount
of label noise, often in cases when the annotators mislabeled
discussions of regulatory revisions that have not been driven by public
feedback or when annotators have failed to determine
appropriate boundaries for the technical discussion of comments.</p>
      <p>Label Confusion: We have observed several cases of the models
being confused about the polarity of EPA assessment, particularly
when the sentence has included trigger words such as “agree” and
“disagree” together.</p>
      <p>
        Parsing: We have noted several instances of erroneous sentence
parsing (e.g., a citation “40 CFR 51.1010(b).” has been isolated as a
sentence) that lead to classification errors. This issue could be
remedied by a sentence boundary detector oriented towards processing
legal text [
        <xref ref-type="bibr" rid="ref54">54</xref>
        ].
      </p>
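<p>A lightweight mitigation (a hypothetical heuristic of ours, not something the paper implements) would merge bare citation fragments back into the preceding sentence before classification:

```python
import re

# Hypothetical heuristic: re-attach sentence fragments that are bare
# regulatory citations (e.g., "40 CFR 51.1010(b).") to the preceding sentence.
CFR_CITATION = re.compile(r"^\d+\s+CFR\s+[\d.()a-z]+\.?$")

def merge_citation_fragments(sentences):
    merged = []
    for sent in sentences:
        if merged and CFR_CITATION.match(sent.strip()):
            merged[-1] = merged[-1] + " " + sent.strip()
        else:
            merged.append(sent)
    return merged
```
</p>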
    </sec>
    <sec id="sec-13">
      <title>DISCUSSION</title>
<p>It is likely possible to automatically identify certain types of
content in regulatory documents with irregular structure. Our baseline
segmentation models detect comment discussion
sentences with recall in the range of 0.636–0.708 and precision in the
range of 0.688–0.798. While we have focused on identifying
comment discussion by the receiving agency, we believe that there are
other types of content (e.g., regulatory requirements) whose automated
segmentation may be both desired and feasible.
Detecting specific comment discussions that either dismiss comments
or announce rule revisions turns out to be a harder task for the
annotators and, consequently, for the models. Moving forward, this
raises the question of which information need the model caters to. If
value is added by quickly pointing an expert to comment discussion
passages, then a well-performing model is within reach given good
training data. On the other hand, an automated analysis of the topics
for which comments have been influential remains a hard problem.</p>
<p>We also note that our dataset has been compiled using highly
educated non-expert annotators. We have found that this type of
background is sufficient for producing relatively coarse
annotations (e.g., identifying parts of the document that contain comment
discussion). We have measured an annotator-expert agreement
of 0.74 for the comment discussion identification task. However,
more refined annotation tasks, such as determining the
agency’s responses to public feedback, would likely require
expert-level understanding of the domain.</p>
      <p>
        We believe that our baseline modeling results can be further
improved by developing a fully neural sequence tagging model,
such as the one developed for the standard discourse segmentation
corpus [
        <xref ref-type="bibr" rid="ref56">56</xref>
]. However, even with access to sequence encoders
such as BERT, the limited size of our corpus may still present a
modeling challenge.
      </p>
    </sec>
    <sec id="sec-14">
      <title>CONCLUSIONS</title>
      <p>We have produced a dataset and baseline for a novel discourse
segmentation task of identifying public comment discussion and
evaluation by regulatory agencies. In doing so we presented
evidence that detecting comment discussions automatically using
mainstream NLP techniques is feasible given good training data.
Classifying discussions of a particular type is harder both because
of data sparsity and low annotator agreement. While good general
detection performance will add value in some practical settings,
we see opportunity for further improvement in the use of neural
sequence tagging models, albeit subject to the limitations of data
quality as a function of annotator expertise, training, and type
system design.
</p>
    </sec>
    <sec id="sec-15">
      <title>ACKNOWLEDGMENTS</title>
      <p>The authors thank University of Pittsburgh Intelligent Systems
Program student Jaromir Savelka for permission to use the Gloss
annotation tool.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Jaime</given-names>
            <surname>Arguello</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jamie</given-names>
            <surname>Callan</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>A bootstrapping approach for identifying stakeholders in public-comment corpora</article-title>
          .
          <source>In Proceedings of the 8th annual international conference on Digital government research: bridging disciplines &amp; domains. Digital Government Society of North America</source>
          ,
          <fpage>92</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Jaime</given-names>
            <surname>Arguello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jamie</given-names>
            <surname>Callan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Stuart</given-names>
            <surname>Shulman</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Recognizing citations in public comments</article-title>
          .
          <source>Journal of Information Technology &amp; Politics 5</source>
          ,
          <issue>1</issue>
          (
          <year>2008</year>
          ),
          <fpage>49</fpage>
          -
          <lpage>71</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Parminder</given-names>
            <surname>Bhatia</surname>
          </string-name>
          , Yangfeng Ji, and
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Eisenstein</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Better document-level sentiment analysis from rst discourse parsing</article-title>
          .
          <source>arXiv preprint arXiv:1509.01599</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Mohammad Hadi</given-names>
            <surname>Bokaei</surname>
          </string-name>
          , Hossein Sameti, and Yang Liu.
          <year>2016</year>
          .
          <article-title>Extractive summarization of multi-party meetings through discourse segmentation</article-title>
          .
          <source>Natural Language Engineering</source>
          <volume>22</volume>
          ,
          <issue>1</issue>
          (
          <year>2016</year>
          ),
          <fpage>41</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Claire</given-names>
            <surname>Cardie</surname>
          </string-name>
          , Cynthia R Farina, Matt Rawding, and
          <string-name>
            <given-names>Adil</given-names>
            <surname>Aijaz</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>An erulemaking corpus: Identifying substantive issues in public comments</article-title>
          . (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Lynn</given-names>
            <surname>Carlson</surname>
          </string-name>
          , Daniel Marcu, and Mary Ellen Okurowski.
          <year>2003</year>
          .
          <article-title>Building a discourse-tagged corpus in the framework of rhetorical structure theory. In Current and new directions in discourse and dialogue</article-title>
          . Springer,
          <fpage>85</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Nuno</given-names>
            <surname>Carvalho</surname>
          </string-name>
          and Rui Pedro Lourenço.
          <year>2018</year>
          . E-Rulemaking:
          <article-title>Lessons from the Literature</article-title>
          .
          <source>International Journal of Technology and Human Interaction (IJTHI) 14</source>
          ,
          <issue>2</issue>
          (
          <year>2018</year>
          ),
          <fpage>35</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Lijun</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Summaritive digest for large document repositories with application to e-rulemaking</article-title>
          . (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Cary</given-names>
            <surname>Coglianese</surname>
          </string-name>
          .
          <year>2004</year>
          . E-Rulemaking:
          <article-title>Information technology and the regulatory process</article-title>
          .
          <source>Administrative Law Review</source>
          (
          <year>2004</year>
          ),
          <fpage>353</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Cohen</surname>
          </string-name>
          .
          <year>1960</year>
          .
          <article-title>A coefficient of agreement for nominal scales</article-title>
          .
          <source>Educational and psychological measurement 20</source>
          ,
          <issue>1</issue>
          (
          <year>1960</year>
          ),
          <fpage>37</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Corinna</given-names>
            <surname>Cortes</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vladimir</given-names>
            <surname>Vapnik</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Support-vector networks</article-title>
          .
          <source>Machine learning 20, 3</source>
          (
          <year>1995</year>
          ),
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kristina</given-names>
            <surname>Toutanova</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          . CoRR abs/1810.04805 (
          <year>2018</year>
          ). arXiv:1810.04805 http://arxiv.org/abs/1810.04805
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Tao</given-names>
            <surname>Ding</surname>
          </string-name>
          and
          <string-name>
            <given-names>Shimei</given-names>
            <surname>Pan</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>How Reliable Is Sentiment Analysis? A Multidomain Empirical Investigation</article-title>
          .
          <source>In International Conference on Web Information Systems and Technologies</source>
          . Springer,
          <fpage>37</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Lauren M</given-names>
            <surname>Dinour</surname>
          </string-name>
          and
          <string-name>
            <given-names>Antoinette</given-names>
            <surname>Pole</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Potato Chips, Cookies, and Candy Oh My! Public Commentary on Proposed Rules Regulating Competitive Foods</article-title>
          .
          <source>Health Education &amp; Behavior</source>
          <volume>44</volume>
          ,
          <issue>6</issue>
          (
          <year>2017</year>
          ),
          <fpage>867</fpage>
          -
          <lpage>875</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Catherine</given-names>
            <surname>Dumas</surname>
          </string-name>
          , Teresa M Harrison,
          <string-name>
            <given-names>Loni</given-names>
            <surname>Hagen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Xiaoyi</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>What Do the People Think?: E-Petitioning and Policy Decision Making</article-title>
          . In Beyond Bureaucracy. Springer,
          <fpage>187</fpage>
          -
          <lpage>207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Yixing</given-names>
            <surname>Fan</surname>
          </string-name>
          , Jiafeng Guo, Yanyan Lan, Jun Xu,
          <string-name>
            <given-names>Chengxiang</given-names>
            <surname>Zhai</surname>
          </string-name>
          , and Xueqi Cheng.
          <year>2018</year>
          .
          <article-title>Modeling diverse relevance patterns in ad-hoc retrieval</article-title>
          .
          <source>In The 41st International ACM SIGIR Conference on Research &amp; Development in Information Retrieval. ACM</source>
          ,
          <volume>375</volume>
          -
          <fpage>384</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Vanessa Wei</given-names>
            <surname>Feng</surname>
          </string-name>
          and
          <string-name>
            <given-names>Graeme</given-names>
            <surname>Hirst</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Two-pass discourse segmentation with pairing and global features</article-title>
          .
          <source>arXiv preprint arXiv:1407.8215</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Elisa</given-names>
            <surname>Ferracane</surname>
          </string-name>
          , Titan Page, Junyi Jessy Li, and
          <string-name>
            <given-names>Katrin</given-names>
            <surname>Erk</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>From News to Medical: Cross-domain Discourse Segmentation</article-title>
          . arXiv preprint arXiv:1904.06682 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Loni</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Teresa M</given-names>
            <surname>Harrison</surname>
          </string-name>
          , and Catherine L Dumas.
          <year>2018</year>
          .
          <article-title>Data Analytics for Policy Informatics: The Case of E-Petitioning</article-title>
          . In Policy Analytics, Modelling, and Informatics. Springer,
          <fpage>205</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Loni</given-names>
            <surname>Hagen</surname>
          </string-name>
          , Teresa M Harrison,
          <string-name>
            <given-names>Özlem</given-names>
            <surname>Uzuner</surname>
          </string-name>
          , Tim Fake, Dan Lamanna, and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Kotfila</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Introducing textual analysis tools for policy informatics: a case study of e-petitions</article-title>
          .
          <source>In Proceedings of the 16th annual international conference on digital government research. ACM</source>
          ,
          <volume>10</volume>
          -
          <fpage>19</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Loni</given-names>
            <surname>Hagen</surname>
          </string-name>
          , Özlem Uzuner, Christopher Kotfila,
          <string-name>
            <given-names>Teresa M</given-names>
            <surname>Harrison</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Lamanna</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Understanding Citizens' Direct Policy Suggestions to the Federal Government: A Natural Language Processing and Topic Modeling Approach</article-title>
          .
          <source>In System Sciences (HICSS)</source>
          ,
          <year>2015</year>
          48th Hawaii International Conference on. IEEE,
          <fpage>2134</fpage>
          -
          <lpage>2143</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Mehedi</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A</given-names>
            <surname>Kotov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S</given-names>
            <surname>Naar</surname>
          </string-name>
          , GL Alexander, and
          <string-name>
            <given-names>A Idalski</given-names>
            <surname>Carcone</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Deep neural architectures for discourse segmentation in e-mail based behavioral interventions</article-title>
          . In American Medical Informatics Association (AMIA).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Geoffrey E</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <year>1990</year>
          .
          <article-title>Connectionist learning procedures</article-title>
          .
          <source>In Machine learning. Elsevier</source>
          ,
          <volume>555</volume>
          -
          <fpage>610</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Honnibal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ines</given-names>
            <surname>Montani</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing</article-title>
          . To appear (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Zhiheng</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wei</given-names>
            <surname>Xu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kai</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Bidirectional LSTM-CRF models for sequence tagging</article-title>
          .
          <source>arXiv preprint arXiv:1508.01991</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Yangfeng</given-names>
            <surname>Ji</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Eisenstein</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Representation learning for text-level discourse parsing</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , Vol.
          <volume>1</volume>
          .
          <fpage>13</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Robin</given-names>
            <surname>Jia</surname>
          </string-name>
          , Cliff Wong, and
          <string-name>
            <given-names>Hoifung</given-names>
            <surname>Poon</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Document-Level N-ary Relation Extraction with Multiscale Representation Learning</article-title>
          .
          <source>arXiv preprint arXiv:1904.02347</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Thorsten</given-names>
            <surname>Joachims</surname>
          </string-name>
          , Thomas Finley, and
          <string-name>
            <given-names>Chun-Nam John</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Cutting-plane training of structural SVMs</article-title>
          .
          <source>Machine Learning</source>
          <volume>77</volume>
          ,
          <issue>1</issue>
          (
          <year>2009</year>
          ),
          <fpage>27</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Barbara</given-names>
            <surname>Konat</surname>
          </string-name>
          , John Lawrence, Joonsuk Park, Katarzyna Budzynska, and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Reed</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A Corpus of Argument Networks: Using Graph Properties to Analyse Divisive Issues</article-title>
          .
          <source>In LREC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Namhee</given-names>
            <surname>Kwon</surname>
          </string-name>
          , Stuart W Shulman, and
          <string-name>
            <given-names>Eduard</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Multidimensional text analysis for eRulemaking</article-title>
          .
          <source>In Proceedings of the 2006 international conference on Digital government research. Digital Government Society of North America</source>
          ,
          <fpage>157</fpage>
          -
          <lpage>166</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Namhee</given-names>
            <surname>Kwon</surname>
          </string-name>
          , Liang Zhou, Eduard Hovy, and Stuart W Shulman.
          <year>2007</year>
          .
          <article-title>Identifying and classifying subjective claims</article-title>
          .
          <source>In Proceedings of the 8th annual international conference on Digital government research: bridging disciplines &amp; domains. Digital Government Society of North America</source>
          ,
          <fpage>76</fpage>
          -
          <lpage>81</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>John</given-names>
            <surname>Lafferty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fernando CN</given-names>
            <surname>Pereira</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Conditional random fields: Probabilistic models for segmenting and labeling sequence data</article-title>
          . (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Gloria T</given-names>
            <surname>Lau</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>A comparative analysis framework for semi-structured documents, with applications to government regulations</article-title>
          . Stanford University.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>John</surname>
            <given-names>Lawrence</given-names>
          </string-name>
          , Joonsuk Park, Katarzyna Budzynska, Claire Cardie, Barbara Konat, and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Reed</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Using argumentative structure to interpret debates in online deliberative democracy and eRulemaking</article-title>
          .
          <source>ACM Transactions on Internet Technology (TOIT)</source>
          <volume>17</volume>
          ,
          <issue>3</issue>
          (
          <year>2017</year>
          ),
          <fpage>25</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Karen EC</given-names>
            <surname>Levy</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Franklin</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Driving regulation: using topic models to examine political contention in the US trucking industry</article-title>
          .
          <source>Social Science Computer Review</source>
          <volume>32</volume>
          ,
          <issue>2</issue>
          (
          <year>2014</year>
          ),
          <fpage>182</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Junyi Jessy</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kapil</given-names>
            <surname>Thadani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Amanda</given-names>
            <surname>Stent</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The role of discourse units in near-extractive summarization</article-title>
          .
          <source>In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue</source>
          .
          <fpage>137</fpage>
          -
          <lpage>147</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Michael A</given-names>
            <surname>Livermore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Vladimir</given-names>
            <surname>Eidelman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Brian</given-names>
            <surname>Grom</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Computationally assisted regulatory participation</article-title>
          .
          <source>Notre Dame L. Rev</source>
          .
          <volume>93</volume>
          (
          <year>2017</year>
          ),
          <fpage>977</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Marcu</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>The theory and practice of discourse parsing and summarization</article-title>
          . MIT press.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg S Corrado, and
          <string-name>
            <given-names>Jeff</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          .
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>John E.</given-names>
            <surname>Moody</surname>
          </string-name>
          .
          <year>1988</year>
          .
          <article-title>Fast Learning in Multi-Resolution Hierarchies</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          <volume>1</volume>
          , [NIPS Conference, Denver, Colorado, USA,
          <year>1988</year>
          ].
          <fpage>29</fpage>
          -
          <lpage>39</lpage>
          . http://papers.nips.cc/paper/175-fast-learning-in-multi-resolution-hierarchies
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Muhlberger</surname>
          </string-name>
          , Nick Webb, and
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Stromer-Galley</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>The Deliberative E-Rulemaking project (DeER): improving federal agency rulemaking via natural language processing and citizen dialogue</article-title>
          .
          <source>In Proceedings of the 2008 international conference on Digital government research. Digital Government Society of North America</source>
          ,
          <fpage>403</fpage>
          -
          <lpage>404</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>Andreas C.</given-names>
            <surname>Müller</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sven</given-names>
            <surname>Behnke</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>pystruct - Learning Structured Prediction in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>15</volume>
          (
          <year>2014</year>
          ),
          <fpage>2055</fpage>
          -
          <lpage>2060</lpage>
          . http://jmlr.org/papers/v15/mueller14a.html
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>Joonsuk</given-names>
            <surname>Park</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Mining and evaluating argumentative structures in user comments in eRulemaking</article-title>
          . Cornell University.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Joonsuk</given-names>
            <surname>Park</surname>
          </string-name>
          , Cheryl Blake, and
          <string-name>
            <given-names>Claire</given-names>
            <surname>Cardie</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Toward machine-assisted participation in eRulemaking: An argumentation model of evaluability</article-title>
          .
          <source>In Proceedings of the 15th International Conference on Artificial Intelligence and Law</source>
          . ACM,
          <fpage>206</fpage>
          -
          <lpage>210</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>Joonsuk</given-names>
            <surname>Park</surname>
          </string-name>
          and
          <string-name>
            <given-names>Claire</given-names>
            <surname>Cardie</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Identifying appropriate support for propositions in online user comments</article-title>
          .
          <source>In Proceedings of the First Workshop on Argumentation Mining</source>
          .
          <fpage>29</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>Joonsuk</given-names>
            <surname>Park</surname>
          </string-name>
          , Sally Klingel, Claire Cardie, Mary Newhart, Cynthia Farina, and
          <string-name>
            <given-names>Joan-Josep</given-names>
            <surname>Vallbé</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Facilitative moderation for online participation in eRulemaking</article-title>
          .
          <source>In Proceedings of the 13th Annual International Conference on Digital Government Research</source>
          . ACM,
          <fpage>173</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>Adam</given-names>
            <surname>Paszke</surname>
          </string-name>
          , Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang,
          <string-name>
            <given-names>Zachary</given-names>
            <surname>DeVito</surname>
          </string-name>
          , Zeming Lin, Alban Desmaison, Luca Antiga, and
          <string-name>
            <given-names>Adam</given-names>
            <surname>Lerer</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Automatic differentiation in PyTorch</article-title>
          . In NIPS-W.
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Scikit-learn: Machine Learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          ),
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>Matthew E</given-names>
            <surname>Peters</surname>
          </string-name>
          , Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Deep contextualized word representations</article-title>
          .
          <source>arXiv preprint arXiv:1802.05365</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>Rachel A</given-names>
            <surname>Potter</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>More than spam? Lobbying the EPA through public comment campaigns</article-title>
          .
          <source>In Brookings Series on Regulatory Process and Perspective</source>
          . https://www.brookings.edu/research/more-than-spam-lobbying-the-epa-through-public-comment-campaigns
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>Stephen</given-names>
            <surname>Purpura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Claire</given-names>
            <surname>Cardie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jesse</given-names>
            <surname>Simons</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Active learning for e-rulemaking: Public comment categorization</article-title>
          .
          <source>In Proceedings of the 2008 international conference on Digital government research. Digital Government Society of North America</source>
          ,
          <fpage>234</fpage>
          -
          <lpage>243</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>Reza</given-names>
            <surname>Rajabiun</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Beyond Transparency: The Semantics of Rulemaking for an Open Internet</article-title>
          .
          <source>Ind. LJ Supp</source>
          .
          <volume>91</volume>
          (
          <year>2015</year>
          ),
          <fpage>33</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>Reza</given-names>
            <surname>Rajabiun</surname>
          </string-name>
          and
          <string-name>
            <given-names>Catherine</given-names>
            <surname>Middleton</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Public Interest in the Regulation of Competition: Evidence from Wholesale Internet Access Consultations in Canada</article-title>
          .
          <source>Journal of Information Policy</source>
          <volume>5</volume>
          (
          <year>2015</year>
          ),
          <fpage>32</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>Jaromir</given-names>
            <surname>Savelka</surname>
          </string-name>
          , Vern R Walker, Matthias Grabmair, and
          <string-name>
            <given-names>Kevin D</given-names>
            <surname>Ashley</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Sentence boundary detection in adjudicatory decisions in the united states</article-title>
          .
          <source>Traitement automatique des langues</source>
          <volume>58</volume>
          ,
          <issue>2</issue>
          (
          <year>2017</year>
          ),
          <fpage>21</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>Ashish</given-names>
            <surname>Vaswani</surname>
          </string-name>
          , Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
          <string-name>
            <given-names>Łukasz</given-names>
            <surname>Kaiser</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Illia</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Attention is all you need</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          .
          <fpage>5998</fpage>
          -
          <lpage>6008</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>Yizhong</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sujian</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jingfeng</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Toward Fast and Accurate Neural Discourse Segmentation</article-title>
          .
          <source>arXiv preprint arXiv:1808.09147</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>Antje</given-names>
            <surname>Witting</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Measuring the use of knowledge in policy development</article-title>
          .
          <source>Central European Journal of Public Policy</source>
          <volume>9</volume>
          ,
          <issue>2</issue>
          (
          <year>2015</year>
          ),
          <fpage>54</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref58">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>Hui</given-names>
            <surname>Yang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jamie</given-names>
            <surname>Callan</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Near-duplicate detection for eRulemaking</article-title>
          .
          <source>In Proceedings of the 2005 national conference on Digital government research. Digital Government Society of North America</source>
          ,
          <fpage>78</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref59">
        <mixed-citation>
          [59]
          <string-name>
            <given-names>Hui</given-names>
            <surname>Yang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jamie</given-names>
            <surname>Callan</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Ontology generation for large email collections</article-title>
          .
          <source>In Proceedings of the 2008 international conference on Digital government research. Digital Government Society of North America</source>
          ,
          <fpage>254</fpage>
          -
          <lpage>261</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref60">
        <mixed-citation>
          [60]
          <string-name>
            <given-names>Yukun</given-names>
            <surname>Zhu</surname>
          </string-name>
          , Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and
          <string-name>
            <given-names>Sanja</given-names>
            <surname>Fidler</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Aligning books and movies: Towards story-like visual explanations by watching movies and reading books</article-title>
          .
          <source>In Proceedings of the IEEE international conference on computer vision</source>
          .
          <fpage>19</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>