<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Toward Domain-Guided Controllable Summarization of Privacy Policies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Moniba Keymanesh</string-name>
          <email>keymanesh.1@osu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Micha Elsner</string-name>
          <email>elsner.14@osu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Srinivasan Parthasarathy</string-name>
          <email>parthasarathy.2@osu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The Ohio State University</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Companies' privacy policies are often skipped by users because they are too long, verbose, and difficult to comprehend. Identifying the key privacy and security risk factors mentioned in these unilateral contracts and effectively incorporating them into a summary can assist users in making a more informed decision when asked to agree to the terms and conditions. However, existing summarization methods fail to integrate domain knowledge into their framework or rely on a large corpus of annotated training data. We propose a hybrid approach to identify sections of privacy policies with a high privacy risk factor. We incorporate these sections into summaries by selecting the riskiest content from different privacy topics. Our approach enables users to select the content to be summarized within a controllable length. Users can view a summary that captures different privacy factors or a summary that covers the riskiest content. Our approach outperforms the domain-agnostic baselines by up to 27% in ROUGE-1 score and 50% in METEOR score using plain English reference summaries while relying on significantly less training data in comparison to abstractive approaches.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION AND RELATED WORK</title>
      <p>
        Privacy policies and terms of service are unilateral contracts by which
companies are required to inform users about their data collection,
processing, and sharing practices. Users are required to agree to
abide by the terms before they can use any service. However, many
users do not read or understand these contracts [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Thus, they often
end up consenting to terms that may not be aligned with legislation
such as the General Data Protection Regulation (GDPR) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This
behavior is often because these contracts are too long and difficult
to comprehend [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Summarization is an intuitive way to assist users
with conscious agreement by generating a condensed equivalent
of the content. Broadly, there are two main lines of summarization
systems: abstractive and extractive. The abstractive paradigm [
        <xref ref-type="bibr" rid="ref10 ref4 ref5 ref6 ref7 ref8 ref9">4–
10</xref>
        ] aims to create an abstract representation of the input text and
involves various text rewriting operations such as paraphrasing,
deletion, and reordering. The extractive paradigm [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] on the
other hand, creates a summary by identifying and subsequently
concatenating the most important sentences in the document. The
abstractive systems are more flexible, while the extractive models
enjoy better factuality [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. However, existing summarization
techniques perform poorly on contracts. Unsupervised methods [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ]
rely on structural features of documents, such as lexical repetition,
to identify and extract important content. These heuristics work
poorly on the legal language used in contracts [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Supervised
methods [
        <xref ref-type="bibr" rid="ref17 ref7 ref9">7, 9, 17</xref>
        ] can learn to cope with the features of a particular
domain. However, training these complex neural summarization
models with thousands of parameters requires a large corpus of
documents and their summaries. Currently existing corpora in the
legal domain are not large enough to train such models. We
propose a hybrid approach for extractive summarization of privacy
contracts: using existing annotated resources, we train a classifier
to predict which pieces of content are most relevant to users [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
In particular, we identify parts of the contract which place users
at risk by imposing unsafe data practices on them, such as selling
email addresses to third parties or allowing the company to
appropriate user-generated content. Next, we use this risk classifier for
content selection within an extractive summarization pipeline. The
classifier is substantially less expensive than learning to
summarize directly but enables our approach to outperform a selection of
domain-agnostic unsupervised summarization methods.
      </p>
      <p>
        Prior computational work on privacy policies has used
information extraction and natural language processing methods to
classify segments of these documents into different data practice
categories [
        <xref ref-type="bibr" rid="ref18 ref19 ref20">18–20</xref>
        ]. Another trajectory of work has focused on
presenting a graphical “at-a-glance” description of the privacy policies
to the user. For example, PrivacyGuide [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] and PrivacyCheck [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]
define a few privacy factors and map each factor to a risk level
using a data mining model. Relying on these “at-a-glance”
description methods raises several concerns. First, there is no way for the
user to check the factuality of the predicted risk classes or to
interpret the reasoning behind them. Moreover, users tend to have an
easier time comprehending the content when it is provided in
natural language. Researchers have also focused on assigning a risk
factor (green, yellow, or red) to each segment of privacy
policies [
        <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
        ]. However, summarizing the text may benefit users
more than directly presenting the classifier output. We draw on
these approaches in building our own classifier. The first module of
our framework extends prior work [
        <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
        ] to highlight segments
of privacy policies that have a higher risk. We employ a pre-trained
encoder and convolutional neural network to classify sentences
of the contracts into different risk levels. To address the
limitations of previous work, we incorporate the domain information
predicted by the classifier in the form of a summary by comparing
a risk-focused and a coverage-focused content selection
mechanism. The coverage-focused selection mechanism aims to reduce
the information redundancy by covering the riskiest sentence from
each privacy topic. We evaluate the effectiveness of employing a
classifier to identify the domain knowledge for summarization.
We also evaluate the quality of summaries extracted by our two
content selection criteria. Using our approach, users can view a
summary that captures different privacy factors or a summary that
covers the riskiest content. We release our dataset of 151 privacy
policies annotated with risk labels to assist future research.
      </p>
    </sec>
    <sec id="sec-2">
      <title>METHODOLOGY</title>
      <p>
        Given a privacy policy document D consisting of a sequence of n
sentences {s_1, s_2, ..., s_n} and a sentence budget k such that k &lt; n,
our summarization model extracts a risk-aware summary with
k sentences. For each sentence s_i ∈ D we predict a binary label
y_i (where a value of 1 means s_i is included in the summary). We
achieve this by computing an inclusion probability P(y_i | s_i, D, θ)
for each sentence s_i, where θ denotes the model's parameters. We aim to
maximize the inclusion probability for risky sections of the privacy
policies and minimize it for non-risky sections. We would also like
to cover different privacy factors within the sentence budget k by
reducing redundancy. The main intuition behind our proposed
approach is that users, when going through privacy policies, are
most interested in knowing how their information can potentially
be abused [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Thus, a condensed equivalent of the terms should
include such risky sections. Next, we explain the architecture of
our risk prediction model and our content selection mechanisms.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Risk Prediction</title>
      <p>
        Given the content of privacy policies, the first step in our
framework is to identify the associated risk class with each sentence of
the contract. We rely on a crowd-sourcing project called TOS;DR (https://tosdr.org)
to automatically annotate 151 privacy contracts. TOS;DR has
annotated several snippets of privacy contracts based on the average
Internet user’s perception of risk. We explain our dataset extraction
in section 3. We use this dataset to train our risk classifier. Prior
research has exploited word embeddings and Convolutional Neural
Networks (CNN) for sentence classification [
        <xref ref-type="bibr" rid="ref25 ref26 ref27 ref28">25–28</xref>
        ]. These simple
architectures achieve strong empirical performance over a range of
text classification tasks. Our model is a slight variant of the CNN
architecture proposed in [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
2.1.1 Model architecture. Let s_i = {w_1, w_2, ..., w_m} be the i-th
sentence in the contract D and x_j ∈ R^d be the d-dimensional vector
representation of token w_j in this sequence. Word representations
are the output of a pretrained encoder [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] and will be discussed in
Section 2.1.2. We build the sentence matrix A ∈ R^{m×d} by concatenating
the word vectors x_1 to x_m:
      </p>
      <p>A_{1:m} = x_1 ⊕ x_2 ⊕ ... ⊕ x_m</p>
      <p>
        Following [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] we apply convolution filters to this matrix to
produce new features. The length of the filters is equal to the
dimensionality d of the word vectors. The height, or region size, of the
filter is denoted by h and is the number of rows (word vectors) that
are considered jointly when applying the convolution filter. The
feature map c ∈ R^{m−h+1} of the convolution operation is then obtained
by repeatedly applying the convolution filter w to a window of
tokens x_{i:i+h−1}. Each element c_i in the feature map c = [c_1, c_2, ..., c_{m−h+1}]
is then obtained from:
      </p>
      <p>
        c_i = f(w · A[i : i + h − 1] + b)
where A[i : j] is the sub-matrix of A from row i to row j, corresponding
to a window of tokens w_i to w_j, and "·" represents the dot product
between the filter w and the sub-matrix. b ∈ R represents the
bias term and f is an activation function such as a rectified linear
unit. We use multiple kinds of filters by using various region sizes.
This extracts various types of features from bigrams, trigrams, and
so on. The dimensionality of the feature map c generated by each
convolution filter differs for sentences with various lengths
and filters with different heights. We apply a max-over-time [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]
pooling operation to downsample each feature map c by taking the
maximum value over the window defined by a pool size p. The
max-pooling operation naturally deals with variable sentence lengths.
The outputs generated from each filter map are concatenated to
build a fixed-length feature vector for the penultimate layer. This
feature vector is then fed to a fully connected softmax layer that
predicts a probability distribution over the risk level categories. We
apply dropout [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] as a means of regularization in the softmax layer.
Our objective is to minimize the binary cross-entropy. The trainable
model parameters include the weight vectors w of the filters, the
bias terms b in the activation function, and the weight vector of the
softmax function. We minimize the loss using stochastic gradient
descent and back-propagation [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ].
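The feature-map and max-over-time pooling computations above can be sketched in NumPy (a minimal illustration with made-up toy dimensions and random filter values, not the trained model):

```python
import numpy as np

def conv_feature_map(A, w, b, h):
    """Slide a filter of region size h over the sentence matrix A (m x d).

    Returns the feature map c with m - h + 1 entries, where
    c_i = relu(w . A[i : i + h - 1] + b).
    """
    m = A.shape[0]
    c = np.empty(m - h + 1)
    for i in range(m - h + 1):
        window = A[i : i + h]          # h consecutive word vectors
        c[i] = np.sum(w * window) + b  # dot product with the filter, plus bias
    return np.maximum(c, 0.0)          # rectified linear unit activation

def max_over_time(c):
    """Max-over-time pooling: one scalar per feature map,
    regardless of sentence length."""
    return float(np.max(c))

# toy example: 5 tokens, 4-dimensional embeddings, one trigram filter (h = 3)
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 4))
w = rng.normal(size=(3, 4))
feature = max_over_time(conv_feature_map(A, w, b=0.1, h=3))
```

Concatenating such pooled features from filters of several region sizes yields the fixed-length vector fed to the softmax layer.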
2.1.2 Pretrained Word Vectors. Prior research indicates that
better word representations can improve performance in a
variety of natural language understanding (NLU) tasks [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. We use
ELMo [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], a deep contextualized word representation model, to
map each token w_j in sentence s_i in contract D to its
corresponding contextual embedding x_j of length 1024 (footnote 3).
bidirectional LSTM [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] for language modeling and considers the
context of the words when assigning them their embeddings (footnote 4).
      </p>
    </sec>
    <sec id="sec-4">
      <title>Content Selection and Redundancy Reduction</title>
      <p>
        Given the probability distributions over the risk categories, we
apply two content selection mechanisms to account for the
summarization budget  and minimize the information redundancy.
The first mechanism focuses on including the most "risky" sections
while the second mechanism focuses on covering diverse privacy
factors. Next, we explain these two variations of our model.
2.2.1 Risk-Focused Content Selection: Given a privacy policy
contract D with sentences {s_1, ..., s_n}, a summarization budget k,
and the risk score P(y_i = 1 | s_i, D, θ) predicted for s_i by the risk classifier,
the risk-focused selection mechanism assembles a summary by
extracting the top k sentences that have the highest risk score.
Footnote 3: The model was trained on the One Billion Word Benchmark [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] and was obtained from
https://github.com/allenai/allennlp.
Footnote 4: BERT [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] as the current state of the art in language model pretraining has achieved
impressive results on many NLU tasks with minimal fine-tuning. However, our
preliminary results from fine-tuning BERT did not outperform our results from ELMo word vectors
and the task-specific architecture explained in Section 2.1.1.
2.2.2 Coverage-Focused Content Selection: Given a privacy
policy contract D with sentences {s_1, ..., s_n}, a summarization budget
k, and risk scores P(y_i = 1 | s_i, D, θ), the coverage-focused selection
method finds k privacy factors by clustering the sentences for which
the risk score is larger than a predefined threshold τ. Next, the riskiest
sentence from each privacy factor cluster is selected to be included
in the summary. Note that if fewer than k sentences have a risk
score greater than τ, the summary will have fewer than k sentences.
To find privacy topics of a contract, we apply k-means [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] to
sentence representations. Sentence representations are obtained
by concatenating the word vectors. The number of clusters is set
to min(k, |S|) where S = {s_i | P(y_i = 1) &gt; τ}.
      </p>
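      <p>The two selection mechanisms can be sketched as follows (a simplified illustration, not the released implementation: the risk scores, threshold, and the small hand-rolled k-means loop are stand-ins for the classifier outputs and clustering described above):

```python
import numpy as np

def risk_focused(scores, k):
    """Top-k sentences by predicted risk score, returned in document order."""
    top = np.argsort(scores)[::-1][:k]
    return sorted(top.tolist())

def coverage_focused(scores, reps, k, tau, iters=20, seed=0):
    """Cluster sentences with risk score > tau into min(k, |S|) privacy-factor
    clusters, then keep the riskiest sentence of each cluster."""
    S = np.flatnonzero(scores > tau)   # candidate risky sentences
    if S.size == 0:
        return []
    n_clusters = min(k, S.size)
    X = reps[S].astype(float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(S.size, n_clusters, replace=False)]
    for _ in range(iters):             # plain k-means (Lloyd's algorithm)
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(n_clusters):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    picked = []
    for j in range(n_clusters):
        members = S[assign == j]
        if members.size:
            picked.append(int(members[np.argmax(scores[members])]))
    return sorted(set(picked))
```

risk_focused favors recall of risky content, while coverage_focused caps the false positive rate, since no sentence with a score below τ can enter the summary.</p>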
    </sec>
    <sec id="sec-6">
      <title>DATASET EXTRACTION</title>
      <p>In this section, we explain the dataset that we compiled from the
TOS;DR website and privacy contracts of 151 companies. TOS;DR
is a website dedicated to rating and explaining the privacy policies of
companies in plain English. Members of the website’s
community classify specific sections of privacy policies into "bad", "good",
"blocker", and "neutral" categories and provide summaries for them.
We collected the user agreement contracts of 151 services that were
annotated on TOS;DR from the companies’ websites. Some
companies have several such contracts, e.g., a privacy policy, terms of service,
and cookie policy. In this case, all the contracts were merged into a
single document. Next, we compared each sentence of the contract
with specific snippets that were annotated on TOS;DR. If the
corresponding sentence or a very similar sentence was annotated by
the TOS;DR contributors, the same label was used. Otherwise, it
was annotated as "neutral". The assumption behind our annotation
schema is that if a section was not annotated by the contributors, it
most likely does not include a privacy risk and is thus considered
neutral. NLTK was used to segment the contracts into sentences.
Jaccard similarity of the vocabulary was used to measure the
similarity of the sentences. Two sentences from the same contract were
considered similar if the Jaccard similarity of their tokens was more
than 50%. We combined the "bad" and "blocker" sections to build the
"risky" class. The "good" and "neutral" classes were also combined
to build the "non-risky" class. This dataset is highly imbalanced
with 61,674 non-risky sentences and only 719 risky sentences. To
build the ground truth risk-aware summary of each privacy policy
we concatenate the plain English summaries of the snippets that
have a "risky" label. The dataset statistics of the 151 privacy policies
and their corresponding summaries are presented in Table 1. Our
dataset is available online (footnote 5).</p>
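      <p>The label-transfer step described above can be sketched as follows (a minimal illustration of the matching rule; whitespace tokenization and the label names stand in for the NLTK pipeline):

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def transfer_label(sentence, annotated, threshold=0.5):
    """Return the TOS;DR label of the most similar annotated snippet,
    or "neutral" if no snippet exceeds the similarity threshold."""
    best_label, best_sim = "neutral", threshold
    for snippet, label in annotated:
        sim = jaccard(sentence, snippet)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label

def to_binary(label):
    # "bad" and "blocker" form the risky class; "good" and "neutral" the non-risky class
    return "risky" if label in ("bad", "blocker") else "non-risky"
```
</p>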
      <p>[Table 1: statistics of the privacy policies and their plain English summaries; the table body was not preserved in this extraction.]</p>
    </sec>
    <sec id="sec-7">
      <title>EXPERIMENTS</title>
      <p>In this section, we discuss our data augmentation mechanism to
reduce the data imbalance problem, our hyperparameter choices
for designing the risk classifier, and the training details. We discuss
our evaluation criteria in Section 4.2.
(Footnote 5: www.github.com/senjed/Summarization-of-Privacy-Policies)</p>
    </sec>
    <sec id="sec-8">
      <title>Hyperparameters and Training Details</title>
      <p>
        For the CNN model, we use two filter region sizes 3 and 4 each
of which has 50 output filters. We use rectified linear unit as the
activation function of the convolution layer. The pool size in the
max pooling operation is set to 50. We apply dropout with a rate
of 20%. We optimize the binary cross-entropy loss using stochastic
gradient descent with a learning rate of 0.01. To account for the
class imbalance problem, we randomly under-sampled the majority
class (non-risky) with a rate of 10%. We also apply SMOTE
oversampling [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ] on the minority class (risky) with rate 50%. We train
our model on this resampled dataset for 20 epochs and weight the
loss function inversely proportional to class frequencies in the input
data. To set the value of risk threshold  in the content selection
module, we used the ROC curve of the validation set of each fold.
We set  for each fold to the threshold value that achieves 80% true
positive rate.
4.2
      </p>
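      <p>The threshold-selection step can be sketched as follows (a minimal illustration that scans candidate thresholds on validation scores rather than using a full ROC library; the variable names are ours):

```python
import numpy as np

def pick_threshold(val_scores, val_labels, target_tpr=0.8):
    """Return the highest threshold tau whose true positive rate on the
    validation set reaches target_tpr, keeping the threshold as strict
    as possible subject to the recall constraint."""
    scores = np.asarray(val_scores, dtype=float)
    labels = np.asarray(val_labels)
    pos = scores[labels == 1]                  # scores of the risky class
    for tau in np.sort(scores)[::-1]:          # scan from strictest to loosest
        tpr = float(np.mean(pos > tau))
        if tpr >= target_tpr:
            return float(tau)
    return float(scores.min()) - 1e-9          # fallback: accept everything
```
</p>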
    </sec>
    <sec id="sec-9">
      <title>Evaluation Metrics</title>
      <p>
        In our experiments, we seek to answer two questions: i. how well
does our model identify the risky sentences in the contracts? and
ii. what content selection method leads to more "human-like"
summaries? To answer the first question we report the Macro-F1 and
Micro-F1 score of our classifier. To answer the second question,
we evaluate the quality of the extracted summaries by our model
by computing the average F1-score for ROUGE-1, ROUGE-2, and
ROUGE-L [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] metrics (which respectively measure the unigram
overlap, bigram overlap, and longest common subsequence between
the reference summary and the summary to be evaluated). ROUGE
metrics fail to capture semantic similarity beyond n-grams [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ].
Thus, we also report the METEOR score [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ] which goes beyond
the surface matches and accounts for stems and synonyms while
finding the matches (footnote 6). We evaluate our model using 5-fold
cross-validation. In each fold, contracts of 96 companies are used for
training, 24 contracts are used for validation, and the rest are used
for testing. We explain our baselines in Section 4.3 and our
experimental results in Section 5.
      </p>
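      <p>As an illustration of what these metrics measure, ROUGE-1 F1 can be computed from clipped unigram counts (a simplified sketch with whitespace tokenization; the experiments themselves use the pyrouge package):

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall,
    with clipped counts so repeated words are not over-credited."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())   # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```
</p>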
    </sec>
    <sec id="sec-10">
      <title>Summarization Baselines</title>
      <p>
        We compare the performance of our domain-aware extractive
summarization model with the following unsupervised baselines.
Unlike the evaluation setup in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], we run the models on the entire
contract. For methods that require a word limit as the budget, a
compression ratio r is multiplied by the average number of
tokens in all contracts (10488.7) to compute the word limit. Similarly,
the compression ratio r is multiplied by the average number of
sentences in all contracts (413.1) to obtain a sentence limit.
• TextRank: An algorithm introduced in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] that uses page
rank to compute an importance score for each sentence.
Sentences with the highest importance score are then extracted
to build a summary until a word limit is satisfied.
(Footnote 6: We use the pyrouge and NLTK Python packages for computing ROUGE and METEOR
values, respectively.)
• KLSum: Introduced in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], KLSum aims to minimize the
Kullback-Lieber (KL) divergence between the input
document and proposed summary by greedily selecting sentences.
• Lead-K: A common baseline in news summarization that
extracts the first k sentences of the document until a word
limit is reached.
• Random: This baseline picks random sentences of the
document until a word limit is satisfied. For this baseline, we
report the average results over 10 runs.
• Upper Bound Baseline: This baseline picks all the
sentences in a contract with ground truth label "risky". This
baseline indicates the performance upper bound of an
extractive method on our dataset.
      </p>
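      <p>As an illustration of the greedy selection KLSum performs, a simplified unigram version can be sketched as follows (the whitespace tokenization and add-one smoothing here are assumptions of the sketch, not the exact baseline configuration):

```python
import math
from collections import Counter

def kl_divergence(p_counts, q_counts, vocab):
    """KL(P || Q) over a shared vocabulary with add-one smoothing."""
    p_total = sum(p_counts.values()) + len(vocab)
    q_total = sum(q_counts.values()) + len(vocab)
    kl = 0.0
    for w in vocab:
        p = (p_counts[w] + 1) / p_total
        q = (q_counts[w] + 1) / q_total
        kl += p * math.log(p / q)
    return kl

def klsum(sentences, word_limit):
    """Greedily add the sentence that minimizes KL(document || summary)."""
    doc_counts = Counter(w for s in sentences for w in s.lower().split())
    vocab = set(doc_counts)
    summary, summ_counts = [], Counter()
    remaining = list(range(len(sentences)))
    while remaining and sum(summ_counts.values()) < word_limit:
        best_i = min(
            remaining,
            key=lambda i: kl_divergence(
                doc_counts, summ_counts + Counter(sentences[i].lower().split()), vocab
            ),
        )
        summary.append(best_i)
        summ_counts += Counter(sentences[best_i].lower().split())
        remaining.remove(best_i)
    return [sentences[i] for i in sorted(summary)]
```

Because the objective rewards matching the document's word distribution, KLSum tends to pick sentences that cover frequent vocabulary rather than sentences that describe risky practices.</p>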
    </sec>
    <sec id="sec-11">
      <title>RESULTS</title>
      <p>In this section, we discuss our experiments conducted using 5-fold
cross-validation. We shared our training details in Section 4.1. As
an example, summaries extracted by our model and the baselines
from the privacy policy of Brainly (footnote 7) are displayed in Figure 1. It can be
seen that both of the summaries generated by our method
indicate that third-party advertising companies will be able to collect
information about the use of Brainly. KLSum misses this information,
and the traditional lead-k heuristic, which is very effective for news,
performs poorly on the contracts. This indicates the advantage of
injecting domain-specific knowledge into content selection.
</p>
    </sec>
    <sec id="sec-12">
      <title>Classification Results:</title>
      <p>In this section, we evaluate the performance of our model discussed
in Section 2.1.1 and study the effect of different content selection
mechanisms on the risk prediction task. We evaluate our summaries
at two compression ratios, 1/64 and 1/16. The summarization budget
k at each compression ratio r is obtained by multiplying r by the
average number of sentences (or words) in the contracts. Thus, at the
compression ratio of 1/64, summaries are restricted to a maximum
length of 6 sentences or 164 words. Similarly, at the compression
ratio of 1/16, summaries are limited to a maximum length of 29
sentences or 656 words. We report the precision, recall, Micro-F1, and
Macro-F1 of our risk classifier with two different content selection
mechanisms, namely risk-focused (RF) and coverage-focused (CF),
in Table 2. As can be seen in the table, the Micro-F1 scores of both
content selection methods are quite high. However, the best
Macro-F1 value, 61.94, is achieved by the risk-focused approach. The
large gap between the two values is due to the high level of class
imbalance in our dataset (1 positive sample for every 100 negative
samples). At the 1/64 compression ratio, risk-focused performs more than
two times better in terms of recall. (Footnote 7: https://Brainly.com)
When the compression ratio
is 1/16, the risk-focused method captures many more risky sections
and achieves a recall of 59.74. However, with this increase in
recall, the false positive rate also increases. On the other hand, the
coverage-focused method is better at preserving the precision at
higher budgets (only 7.45 drop in precision with a 28.59 points
increase in recall). This observation is caused by extracting sentences
with a risk score greater than  in coverage-focused content
selection. This naturally puts an upper bound on the false positive rate.
We conclude that both mechanisms are moderately successful at
identifying the risky sections of contracts. We also conclude that at
higher compression ratios, the risk-focused mechanism can be used
where recall is more essential while the coverage-focused
mechanism can be used when precision is more of interest. In the next
section, we examine whether the domain information given by the
risk classifier can improve the quality of summaries in comparison
to domain-agnostic extractive summarization baselines.
</p>
    </sec>
    <sec id="sec-13">
      <title>Summarization Results:</title>
      <p>
        In this section, we evaluate the quality of the summaries extracted
by our model and the baselines. We introduced our evaluation
metrics in Section 4.2 and our baselines in Section 4.3. We compare
the summaries against two types of reference summaries. The first
type is built by assembling all the sentences that have the
ground truth "risky" label. These sentences are derived directly
from the text of the contract. We will refer to this reference summary
as "quote text" reference. The second type of summary is derived
by assembling the plain English summary of the "risky" sections
written by the TOS;DR contributors. The summarization results
using the quote text summaries are presented in Table 3. The
summarization results using the plain English reference summaries are
presented in Table 4.
5.2.1 Extracting the risky content: As can be seen in Table 3,
at both compression ratios, both variations of our model outperform
the baselines. At a compression ratio of 1/64, CNN + RF achieves
the best ROUGE and METEOR results, with a 49.8% improvement
in ROUGE-1, 124.6% improvement in ROUGE-2, 56.3%
improvement in ROUGE-L, and 65.6% improvement in METEOR in
comparison to the best-performing domain-agnostic baseline for each
metric. At a compression ratio of 1/16, CNN + CF achieves the best
ROUGE results, improving ROUGE-1 by 12.2%, ROUGE-2 by
30.2%, ROUGE-L by 8.8%, and METEOR by 23.7% in comparison
to the best-performing baseline for each metric. The
improvement in METEOR score is found to be statistically significant using
Wilcoxon signed-rank test [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ] with p-value &lt; 0.01 (Bonferroni
corrected [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ] to account for multiple testing). Similar to our
observation in the classification task, we find that the risk-focused content
Figure 1: Summaries extracted from the privacy policy of Brainly (compression ratio = 1/64).
Plain English Summary: The Privacy Policy states, "We and our third party partners may also use cookies and tracking technologies for advertising
purposes.". In the Privacy Policy, it states that, "Although we do our best to honor the privacy preferences of our users, we are unable to respond to Do
Not Track signals set by your browser at this time." The Privacy Policy says Brainly can track usage information and personal information "through a
variety of tracking technologies, including cookies, web beacons, Locally Stored Objects (LSOs such as Flash or HTML5), log files, and similar technology
(collectively, “tracking technologies”)." If Brainly aims to "preserve all content posted on the site," then we can conclude that such personal data is still
necessary for the purpose of the site. There are places on the site where answers without usernames or profile pictures are visible. The Cookie Policy
states, "Service oparator [sic] informs that restricting the use of cookies may affect some of the functionalities available on the Website." For users not in
Europe, Brainly reserves the right, in its sole discretion, to immediately modify, suspend or terminate your account, the Brainly services, your Brainly
subscription, and/or any products, services, functionality, information, content or other material. &lt;truncated&gt;
CNN + RF: We participate in interest-based advertising and use third party advertising companies to serve you targeted advertisements based on your
online browsing history and your interests. We permit third party online advertising networks, social media companies and other third party services,
to collect, information about your use of our service over time so that they may play or display ads on our service, on other websites, apps or services
you may use, and on other devices you may use. We may share a common account identifier (such as an email address or user id) or hashed data with
our third party advertising partners to help identify you across devices. Brainly reserves the right to moderate the Brainly services and to remove, screen,
or edit your content from the Brainly services at our sole discretion, at any time, and for any reason or for no reason, with no notice to you. Brainly
reserves the right, in its sole discretion, to immediately modify, suspend or terminate your account, the Brainly services, your Brainly subscription,
and/or any products, services, functionality, information, content or other materials available on, through or in connection with the Brainly services
and/or your Brainly subscription, including, but not limited to, the mobile software, and/or your access to some or all of them without cause and without
notice. In the event that Brainly suspends or terminates your account, the Brainly services or your Brainly subscription, you acknowledge and agree
that you shall receive no refund or exchange for any unused time on a Brainly subscription or any subscription fees or anything else.
CNN + CF: We participate in interest-based advertising and use third party advertising companies to serve you targeted advertisements based on
your online browsing history and your interests. We permit third party online advertising networks, social media companies and other third party
services, to collect, information about your use of our service over time so that they may play or display ads on our service, on other websites, apps
or services you may use, and on other devices you may use. We may share a common account identifier (such as an email address or user id) or hashed
data with our third party advertising partners to help identify you across devices. To the fullest extent permitted by applicable law, no arbitration or
claim under these terms shall be joined to any other arbitration or claim, including any arbitration or claim involving any other current or former user
of the Brainly services or a Brainly subscription, and no class arbitration proceedings shall be permitted. We may modify or update this privacy policy
from time to time to reflect the changes in our business and practices, and so you should review this page periodically. If you object to any changes,
you may close your account. Continuing to use our service after we publish changes to this privacy policy means that you are consenting to the changes.
Lead-K: Welcome to Brainly! Brainly operates a group of social learning networks for students and educators. Brainly inspires students to share and
explore knowledge in a collaborative community and engage in peer-to-peer educational assistance, which is made available on www.Brainly.com and
any www.Brainly.com sub-domains (the “website”) as well as the Brainly.com mobile application (the “app”) (the “website” and the “app” are collectively
the “Brainly services”). We have two sets of terms and conditions: part (a) sets out the terms that apply to our users unless you are based in Europe and
part (b) sets out the terms that apply to our users in Europe. It is important that you read and understand the terms that apply to you when you use
the Brainly services before using the Brainly services. Part (a): terms and conditions applicable to users unless you are based in Europe. This part and
the documents referred to within it set out the terms and conditions that apply to your use of Brainly services if you access Brainly services from within
the United States or other countries except Europe. The Cookie Policy states, "Service oparator [sic] informs that restricting the use of cookies may
affect some of the functionalities available on the Website."
KLSum: Brainly reserves the right, in its sole discretion, to immediately modify, suspend or terminate your account, the Brainly services, your Brainly
subscription, and/or any products, services, functionality, information, content or other materials available on, through or in connection with the Brainly
services and/or your Brainly subscription, including, but not limited to, the mobile software, and/or your access to some or all of them without cause and
without notice. Brainly makes no warranty that the Brainly services and/or any products, services, functionality, information, content or other materials
available on, through or in connection with the Brainly services or your Brainly subscription, including, but not limited to, the mobile software, will meet
your requirements, or that the Brainly services or Brainly subscriptions will operate uninterrupted or in a timely, secure, or error-free manner, or as to the
accuracy or completeness of any information or content accessible from or provided in connection with the Brainly services or Brainly subscriptions,
regardless of whether any information or content is marked as “verified”. You must not: use Brainly services other than for its intended purpose as set out
in the terms of use; &lt;truncated for presentation purpose. Rest of the summary includes examples of misuse of the Brainly services.&gt;
selection achieves more recall and thus a better METEOR
score in comparison to the coverage-focused mechanism. On the
other hand, increasing the summarization budget slightly lowers
the ROUGE values for this method. This is because, in most of the
contracts, the number of risky sentences is smaller than the budget
at a ratio of 1/16 (29 sentences).
5.2.2 Building Human-like Summaries: We present our
summarization results using the plain English summaries as reference
summaries in Table 4. At a compression ratio of 1/64, both variations of
our model outperform the baselines. Our CNN + RF model increases
the METEOR score by 32.2% over KLSum and 48% over TextRank.
      </p>
      <p>
        This improvement is found to be statistically significant (with
p-value &lt; 0.01). The CNN + CF outperforms the baselines over all
evaluation metrics; however, the improvement is not statistically
significant. At a compression ratio of 1/16, CNN + RF outperforms all
domain-agnostic baselines. This improvement, however, is not
statistically significant. At this compression ratio, CNN + RF achieves
comparable results with TextRank. We conclude from our experiments
that our domain-aware extractive model does moderately better
than the baselines at lower compression ratios; however, due to the
high level of abstraction in the plain English summaries of TOS;DR [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ],
a fully extractive approach cannot mimic the human-like qualities
of the plain English summaries. This can also be seen in the
performance of the upper-bound baseline.
      </p>
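      <p>
        As a rough illustration of the word-overlap metrics reported above, ROUGE-1 scores a candidate summary by its unigram overlap with a reference. The sketch below is our own simplification for intuition only; the reported numbers come from the standard ROUGE and METEOR toolkits, not this code.
      </p>

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each word counts at most min(candidate freq, reference freq).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

      <p>
        A fully extractive summary can score well on such overlap metrics only insofar as the reference reuses the contract's wording, which is why highly abstractive plain English references cap the achievable score.
      </p>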
    </sec>
    <sec id="sec-14">
      <title>CONCLUSION AND DISCUSSION</title>
      <p>
        In this paper, we proposed a domain-aware extractive model for
summarizing privacy contracts. Our model employs a
convolutional neural network to identify risky sections of the contracts. We
build summaries using a risk-focused and a coverage-focused
content selection mechanism. Our approach enables users to select
the content to be summarized within a controllable length while
relying on substantially less training data than existing supervised
summarization methods. Our two different content
selection mechanisms enable users to build budgeted summaries
of contracts based on their preference for coverage versus risk. Despite
the moderate classification performance on our realistically
imbalanced dataset, we observed a noticeable improvement in ROUGE
and METEOR metrics in comparison to domain-agnostic baselines.
We believe the summaries generated by our method can be
improved in multiple ways. First, the classifier itself, and the
redundancy reduction system, could be improved, bringing content
selection performance closer to the upper-bound scores derived using
a perfect classifier. Second, our summaries would be more
accessible if written in plain English rather than legalese [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. An
abstractive system could be used to rewrite the contract text in
this way. However, the abstractive summaries should not change
the legal interpretation of the content and should be linkable to
the original content to be considered binding. In addition to
improving the system, it is also necessary to conduct more extensive
evaluation experiments, involving human readers as well as
automated metrics. This will help determine the most effective ways to
present information from click-through contracts so that users can
understand their terms and make more informed decisions. We
plan to explore whether the risk classifier module can be used
independently to enhance the productivity of annotators by identifying the
sections that need to be summarized. This can potentially facilitate
annotating larger resources for training abstractive models.
      </p>
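      <p>
        To make the two content selection mechanisms concrete, the sketch below contrasts them on toy data. The tuple format, risk scores, and budget are illustrative assumptions, not our implementation: risk-focused selection greedily spends the sentence budget on the riskiest sections, while coverage-focused selection round-robins over privacy topics, taking each topic's riskiest remaining section.
      </p>

```python
from collections import defaultdict

def risk_focused(sections, budget):
    """Greedily add the riskiest sections until the sentence budget is spent.
    sections: (topic, risk_score, n_sentences) tuples -- illustrative only."""
    chosen, used = [], 0
    for sec in sorted(sections, key=lambda s: -s[1]):
        if used + sec[2] <= budget:
            chosen.append(sec)
            used += sec[2]
    return chosen

def coverage_focused(sections, budget):
    """Round-robin over topics, taking each topic's riskiest remaining
    section, so the summary spans many privacy factors."""
    by_topic = defaultdict(list)
    for sec in sections:
        by_topic[sec[0]].append(sec)
    for secs in by_topic.values():
        secs.sort(key=lambda s: -s[1])  # riskiest first within each topic
    chosen, used, progress = [], 0, True
    while progress:
        progress = False
        for secs in by_topic.values():
            if secs and used + secs[0][2] <= budget:
                sec = secs.pop(0)
                chosen.append(sec)
                used += sec[2]
                progress = True
    return chosen

sections = [("ads", 0.9, 2), ("ads", 0.8, 2),
            ("termination", 0.7, 2), ("arbitration", 0.5, 2)]
```

      <p>
        With a budget of four sentences, risk-focused selection picks both "ads" sections, whereas coverage-focused selection covers two distinct topics; this is the coverage-versus-risk trade-off the user chooses between.
      </p>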
    </sec>
    <sec id="sec-15">
      <title>ACKNOWLEDGEMENT</title>
      <p>We are immensely grateful to Prof. Junyi Jessy Li, Prof. Bryan
H. Choi, Dr. Daniel Preoţiuc-Pietro, Mayank Kulkarni, and three
anonymous reviewers for valuable discussions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Lorrie Faith</given-names>
            <surname>Cranor</surname>
          </string-name>
          , Praveen Guduru, and
          <string-name>
            <given-names>Manjula</given-names>
            <surname>Arjula</surname>
          </string-name>
          .
          <article-title>User interfaces for privacy agents</article-title>
          .
          <source>TOCHI</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Jonathan A</given-names>
            <surname>Obar</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anne</given-names>
            <surname>Oeldorf-Hirsch</surname>
          </string-name>
          .
          <article-title>The biggest lie on the internet: Ignoring the privacy policies and terms of service policies of social networking services</article-title>
          .
          <source>ICS</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Aleecia M</given-names>
            <surname>McDonald</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lorrie Faith</given-names>
            <surname>Cranor</surname>
          </string-name>
          .
          <article-title>The cost of reading privacy policies</article-title>
          .
          <source>ISJLP</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Alexander M</given-names>
            <surname>Rush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sumit</given-names>
            <surname>Chopra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jason</given-names>
            <surname>Weston</surname>
          </string-name>
          .
          <article-title>A neural attention model for abstractive sentence summarization</article-title>
          .
          <source>arXiv:1509.00685</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Ramesh</given-names>
            <surname>Nallapati</surname>
          </string-name>
          , Bowen Zhou,
          <string-name>
            <given-names>Caglar</given-names>
            <surname>Gulcehre</surname>
          </string-name>
          , et al.
          <article-title>Abstractive text summarization using sequence-to-sequence rnns and beyond</article-title>
          .
          <source>arXiv:1602.06023</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Qian</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiao-Dan</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhen-Hua</given-names>
            <surname>Ling</surname>
          </string-name>
          , Si Wei, and
          <string-name>
            <given-names>Hui</given-names>
            <surname>Jiang</surname>
          </string-name>
          .
          <article-title>Distractionbased neural networks for modeling document</article-title>
          .
          <source>In IJCAI</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Abigail</given-names>
            <surname>See</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peter J</given-names>
            <surname>Liu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Christopher D</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Get to the point: Summarization with pointer-generator networks</article-title>
          .
          <source>arXiv:1704.04368</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Jiwei</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaojun</given-names>
            <surname>Wan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jianguo</given-names>
            <surname>Xiao</surname>
          </string-name>
          .
          <article-title>Abstractive document summarization with a graph-based attentional neural model</article-title>
          .
          <source>In ACL</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Romain</given-names>
            <surname>Paulus</surname>
          </string-name>
          , Caiming Xiong, and
          <string-name>
            <given-names>Richard</given-names>
            <surname>Socher</surname>
          </string-name>
          .
          <article-title>A deep reinforced model for abstractive summarization</article-title>
          .
          <source>arXiv preprint arXiv:1705.04304</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Ritesh</given-names>
            <surname>Sarkhel*</surname>
          </string-name>
          , Moniba Keymanesh*,
          <string-name>
            <given-names>Arnab</given-names>
            <surname>Nandi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Srinivasan</given-names>
            <surname>Parthasarathy</surname>
          </string-name>
          .
          <article-title>Transfer learning for abstractive summarization at controllable budgets</article-title>
          .
          <source>arXiv:2002.07845</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Ramesh</given-names>
            <surname>Nallapati</surname>
          </string-name>
          , Feifei Zhai, and
          <string-name>
            <given-names>Bowen</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <article-title>Summarunner: A recurrent neural network based sequence model for extractive summarization of documents</article-title>
          .
          <source>In AAAI</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Michihiro</given-names>
            <surname>Yasunaga</surname>
          </string-name>
          , Rui Zhang, Kshitijh Meelu, Ayush Pareek, Krishnan Srinivasan, and
          <string-name>
            <given-names>Dragomir</given-names>
            <surname>Radev</surname>
          </string-name>
          .
          <article-title>Graph-based neural multi-document summarization</article-title>
          .
          <source>arXiv preprint arXiv:1706.06681</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Ziqiang</given-names>
            <surname>Cao</surname>
          </string-name>
          , Furu Wei,
          <string-name>
            <given-names>Wenjie</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Sujian</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <article-title>Faithful to the original: Fact aware neural abstractive summarization</article-title>
          .
          <source>In AAAI</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Rada</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          and
          <string-name>
            <given-names>Paul</given-names>
            <surname>Tarau</surname>
          </string-name>
          .
          <article-title>TextRank: Bringing order into text</article-title>
          .
          <source>In EMNLP</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Aria</given-names>
            <surname>Haghighi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lucy</given-names>
            <surname>Vanderwende</surname>
          </string-name>
          .
          <article-title>Exploring content models for multidocument summarization</article-title>
          .
          <source>In NAACL</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Laura</given-names>
            <surname>Manor</surname>
          </string-name>
          and
          <string-name>
            <given-names>Junyi Jessy</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <article-title>Plain English summarization of contracts</article-title>
          .
          <source>arXiv:1906.00424</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Gehrmann</surname>
          </string-name>
          , Yuntian Deng, and
          <string-name>
            <given-names>Alexander M</given-names>
            <surname>Rush</surname>
          </string-name>
          .
          <article-title>Bottom-up abstractive summarization</article-title>
          .
          <source>arXiv preprint arXiv:1808.10792</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Frederick</given-names>
            <surname>Liu</surname>
          </string-name>
          , Shomir Wilson,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Story</surname>
          </string-name>
          , et al.
          <article-title>Towards automatic classification of privacy policy text</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Shomir</given-names>
            <surname>Wilson</surname>
          </string-name>
          , Florian Schaub, Aswarth Abhilash Dara, Frederick Liu, et al.
          <article-title>The creation and analysis of a website privacy policy corpus</article-title>
          .
          <source>In ACL</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Zimmeck</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steven M</given-names>
            <surname>Bellovin</surname>
          </string-name>
          .
          <article-title>Privee: An architecture for automatically analyzing web privacy policies</article-title>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Welderufael B</given-names>
            <surname>Tesfay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Hofmann</surname>
          </string-name>
          , Toru Nakamura, Shinsaku Kiyomoto, and
          <string-name>
            <given-names>Jetzabel</given-names>
            <surname>Serna</surname>
          </string-name>
          .
          <article-title>PrivacyGuide: Towards an implementation of the EU GDPR on internet privacy policy evaluation</article-title>
          .
          <source>In IWSPA</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Razieh</given-names>
            <surname>Nokhbeh Zaeem</surname>
          </string-name>
          , Rachel L German, and
          <string-name>
            <given-names>K Suzanne</given-names>
            <surname>Barber</surname>
          </string-name>
          .
          <article-title>PrivacyCheck: Automatic summarization of privacy policies using data mining</article-title>
          .
          <source>TOIT</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Najmeh</given-names>
            <surname>Mousavi Nejad</surname>
          </string-name>
          , Damien Graux, and Diego Collarana.
          <article-title>Towards measuring risk factors in privacy policies</article-title>
          .
          <source>In ICAIL</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Hamza</given-names>
            <surname>Harkous</surname>
          </string-name>
          , Kassem Fawaz,
          <string-name>
            <given-names>Rémi</given-names>
            <surname>Lebret</surname>
          </string-name>
          , et al.
          <article-title>Polisis: Automated analysis and presentation of privacy policies using deep learning</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Ronan</surname>
            <given-names>Collobert</given-names>
          </string-name>
          , Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Kuksa</surname>
          </string-name>
          .
          <article-title>Natural language processing (almost) from scratch</article-title>
          .
          <source>JMLR</source>
          ,
          <volume>12</volume>
          (Aug):
          <fpage>2493</fpage>
          -
          <lpage>2537</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Yoon</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Convolutional neural networks for sentence classification</article-title>
          .
          <source>arXiv:1408.5882</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Nal</given-names>
            <surname>Kalchbrenner</surname>
          </string-name>
          , Edward Grefenstette, and
          <string-name>
            <given-names>Phil</given-names>
            <surname>Blunsom</surname>
          </string-name>
          .
          <article-title>A convolutional neural network for modelling sentences</article-title>
          .
          <source>arXiv preprint arXiv:1404.2188</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Ye</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Byron</given-names>
            <surname>Wallace</surname>
          </string-name>
          .
          <article-title>A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification</article-title>
          .
          <source>arXiv:1510.03820</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Matthew E</given-names>
            <surname>Peters</surname>
          </string-name>
          , Mark Neumann,
          <string-name>
            <given-names>Mohit</given-names>
            <surname>Iyyer</surname>
          </string-name>
          , et al.
          <article-title>Deep contextualized word representations</article-title>
          .
          <source>arXiv:1802.05365</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Geoffrey E</given-names>
            <surname>Hinton</surname>
          </string-name>
          , Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and
          <string-name>
            <given-names>Ruslan R</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          .
          <article-title>Improving neural networks by preventing co-adaptation of feature detectors</article-title>
          .
          <source>arXiv preprint arXiv:1207.0580</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>David E</given-names>
            <surname>Rumelhart</surname>
          </string-name>
          , Geoffrey E Hinton, and
          <string-name>
            <given-names>Ronald J</given-names>
            <surname>Williams</surname>
          </string-name>
          .
          <article-title>Learning representations by back-propagating errors</article-title>
          .
          <source>Nature</source>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Matthew E</given-names>
            <surname>Peters</surname>
          </string-name>
          , Waleed Ammar, Chandra Bhagavatula, and
          <string-name>
            <given-names>Russell</given-names>
            <surname>Power</surname>
          </string-name>
          .
          <article-title>Semi-supervised sequence tagging with bidirectional language models</article-title>
          .
          <source>arXiv preprint arXiv:1705.00108</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Ciprian</given-names>
            <surname>Chelba</surname>
          </string-name>
          , Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and
          <string-name>
            <given-names>Tony</given-names>
            <surname>Robinson</surname>
          </string-name>
          .
          <article-title>One billion word benchmark for measuring progress in statistical language modeling</article-title>
          .
          <source>arXiv preprint arXiv:1312.3005</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Mike</given-names>
            <surname>Schuster</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kuldip K</given-names>
            <surname>Paliwal</surname>
          </string-name>
          .
          <article-title>Bidirectional recurrent neural networks</article-title>
          .
          <source>IEEE transactions on Signal Processing</source>
          ,
          <volume>45</volume>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          , et al.
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          .
          <source>arXiv:1810.04805</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Tapas</given-names>
            <surname>Kanungo</surname>
          </string-name>
          , David M Mount, Nathan S Netanyahu,
          <string-name>
            <given-names>Christine D</given-names>
            <surname>Piatko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ruth</given-names>
            <surname>Silverman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Angela Y</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>An efficient k-means clustering algorithm: Analysis and implementation</article-title>
          .
          <source>IEEE TPAMI</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Nitesh V</given-names>
            <surname>Chawla</surname>
          </string-name>
          , Kevin W Bowyer, Lawrence O Hall, and
          <string-name>
            <given-names>W Philip</given-names>
            <surname>Kegelmeyer</surname>
          </string-name>
          .
          <article-title>Smote: synthetic minority over-sampling technique</article-title>
          .
          <source>JAIR</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Chin-Yew</given-names>
            <surname>Lin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Eduard</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <article-title>Manual and automatic evaluation of summaries</article-title>
          .
          <source>In ACL</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Jin-ge</given-names>
            <surname>Yao</surname>
          </string-name>
          , Xiaojun Wan, and
          <string-name>
            <given-names>Jianguo</given-names>
            <surname>Xiao</surname>
          </string-name>
          .
          <article-title>Recent advances in document summarization</article-title>
          .
          <source>Knowledge and Information Systems</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Denkowski</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alon</given-names>
            <surname>Lavie</surname>
          </string-name>
          .
          <article-title>Meteor universal: Language specific translation evaluation for any target language</article-title>
          .
          <source>In Proceedings of the ninth workshop on statistical machine translation</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Frank</surname>
            <given-names>Wilcoxon</given-names>
          </string-name>
          , SK Katti, and Roberta A Wilcox.
          <article-title>Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test</article-title>
          .
          <source>Selected tables in mathematical statistics</source>
          ,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>Charles W</given-names>
            <surname>Dunnett</surname>
          </string-name>
          .
          <article-title>New tables for multiple comparisons with a control</article-title>
          .
          <source>Biometrics</source>
          ,
          <year>1964</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>