<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>GDPR Privacy Policies in CLAUDETTE: Challenges of Omission, Context and Multilingualism</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rūta Liepiņa</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Contissa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kasper Drazewski</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Lagioia</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Lippi</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Przemysław Pałka</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Sartor</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hans-Wolfgang Micklitz</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Torroni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRSFID, University of Bologna</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DISI, University of Bologna</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>DISMI, University of Modena and Reggio Emilia</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>EUI</institution>
          ,
          <addr-line>Florence, Italy, CIRSFID</addr-line>
          ,
          <institution>University of Bologna</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Law Department, EUI</institution>
          ,
          <addr-line>Florence</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Yale Law School</institution>
          ,
          <addr-line>New Haven</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>21</volume>
      <issue>2019</issue>
      <abstract>
        <p>The latest developments in natural language processing and machine learning have created new opportunities in legal text analysis. In particular, we look at the texts of online privacy policies after the implementation of the European General Data Protection Regulation (GDPR). We analyse 32 privacy policies to design a methodology for automated detection and assessment of compliance of these documents. Preliminary results confirm the pressing issues with current privacy policies and the beneficial use of this approach in empowering consumers in making more informed decisions. However, we also encountered several serious issues in the process. This paper introduces the challenges through concrete examples of context dependence, omission of information, and multilingualism.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        The changes in online privacy policies following the European
General Data Protection Regulation (GDPR) have further
highlighted the increasing information asymmetry between online
service providers and consumers. Studies [
        <xref ref-type="bibr" rid="ref3 ref5">3, 5</xref>
        ] of consumer behaviour
in reading privacy policies show that long and complex legal
documents are seldom read and understood by users. Moreover, [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
show that comprehending the rights and obligations outlined in
these online documents is costly both in terms of time and monetary
value.
      </p>
      <p>
        This paper presents a work in progress that includes the latest
developments of our methodology [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] in designing the Gold Standard
of privacy policy compliance that could be used to build a platform
empowering consumers to gain easier access and support in
understanding their rights and obligations. We aim to provide such a
solution through the use of legal analysis, natural language
processing, and machine learning. In Section 4, we describe three challenges
faced by the AI and Law researchers working on automating
evaluation of legal documents and illustrate them through examples
found in the privacy policies analysed in our study. Among other
issues, we focus on the problem of context dependence of (legal)
terms, the challenges in formalising the privacy policies due to their
linguistic and legal complexity, and the need for methodologies
that can be transferred between different European languages.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 BACKGROUND</title>
      <p>
        Legal texts, such as regulations, contracts, privacy policies, and
cases, provide a rich source for different formal analyses, due to
the complexity of language and legal norms within those texts.
One of the aims of artificial intelligence and law research [
        <xref ref-type="bibr" rid="ref10 ref8">8, 10</xref>
        ]
is to find methods for accurately and efficiently extracting the
knowledge from legal texts and for providing a level of evaluation
for the extracted data. This paper focuses on the legal texts of online
privacy policies. We identified three main dimensions for evaluation
based on the GDPR and its guidelines: completeness, compliance
with the data processing rules, and level of readability. A selection
of the research studies in these fields is introduced below.
      </p>
      <p>
        Completeness: one of the core criticisms against unfair privacy
policies regards withheld or missing information on the data
processing, such as the purpose and retention time of personal data,
including sensitive data. Costante et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] use machine learning
and pre-annotated privacy policies to check for the completeness
of information pre-GDPR. To this end, they designed a client-end
solution, allowing consumers to read summarised policies on
privacy categories of their choice (6 core categories and 11 additional
categories).
      </p>
      <p>
        Compliance: service providers, consumers and law enforcement
authorities are interested in assessing the compliance of online
privacy policies. However, it has proven to be a challenging task.
Research in this area focuses on formalising legal norms [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and
designing methodologies [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] for automating the assessment of
privacy policies. One of the risks identified [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] relates to the
misinterpretation of norms as well as to the failure in connecting different
specifications of norms within a legal document.
      </p>
      <p>
        Readability: a different area of research focuses on the language
and accessibility of privacy policies. A new study [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] provides
empirical evidence on the readability levels of privacy policies
post-GDPR, concluding that “these policies are often unreadable”.<fn><label>1</label><p>For readability scores the study employed the Flesch Reading Ease (FRE) test and the Flesch-Kincaid (F-K) test.</p></fn>
Following previous work by [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], their results support the conclusion
that an unreasonable level of expertise is required to comprehend
the privacy policies. The average score, among the 300 analysed
policies, was at a level of “the usual score of articles in academic
journals” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], supporting the claim that policies are not written
to be accessible and understandable by the general public. Such
barriers further discourage consumers from reading privacy
policies [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Some solutions, such as automatically generated privacy
policy summaries [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and interactive solutions of privacy analysis
through apps [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], are emerging to provide consumers with tools
to better understand the contents of agreements and exercise their
rights.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 DESIGNING METHODOLOGY</title>
      <p>
        This project aims to design a methodology for creating an open
and high-quality annotated corpus of online privacy policies. Such
a data set could be used for automated detection and evaluation of
problematic privacy clauses given the GDPR as the basis for
integrated normative guidelines. Here, we present an overview of the
current methodology for detecting and assessing the problematic
privacy clauses, and how the new guidelines have improved on
previous versions [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>3.1 The Gold Standard</title>
      <p>We designed a methodology that reflects the overall aims of the
GDPR with regard to the collection and processing of personal data. In
particular, we focus on three ways a privacy policy can be deemed
unlawful according to articles 13 and 14 of the GDPR: (1) if the
policy omits information required by the regulation, (2) if the policy
defines data processing beyond the prescribed limits, and (3) if it is
written in unclear language.</p>
      <table-wrap>
        <label>Diagram 1</label>
        <caption>
          <p>The Gold Standard: (1) comprehensiveness of information, (2) substantive compliance, (3) clarity of expression.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Dimension</th><th>Categories and tags</th><th>Assessment</th></tr>
          </thead>
          <tbody>
            <tr><td>Comprehensive information provided</td><td>23 categories, e.g. &lt;purp&gt; for the purposes of data processing</td><td>Optimal or sub-optimal, depending on whether sufficient information is included</td></tr>
            <tr><td>Substantive compliance</td><td>11 categories, e.g. &lt;ad&gt; for the use of personal data for ads</td><td>Fair processing, problematic processing, or unfair processing</td></tr>
            <tr><td>Clear expression</td><td>Clear language not tagged; &lt;vag&gt; for unclear expressions</td><td>4 indicators: conditionals, generalisations, modality, non-specific quantifiers</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        Each of the top-level dimensions has been further divided into
the relevant categories and corresponding evaluation criteria.
Diagram 1 shows the layered structure of the methodology by
exemplifying a good privacy policy: one that satisfies all the criteria.<fn><label>2</label><p>In the diagram, the underlined criteria illustrate a good privacy policy.</p></fn>
To meet the requirements of comprehensiveness, a privacy policy
should declare the purposes of the processing precisely and
exhaustively. Thus, clauses providing only examples must be considered
insufficiently informative. In the dimension of substantive
compliance, using personal data for targeted advertising is fair only
if it is based on the data subject’s consent and an opt-out
is possible. Regarding the clarity of expression, i.e. whether a
privacy policy is framed in understandable, precise, and intelligible
language, certain unspecific language qualifiers should be avoided
(e.g. indeterminate conditioners, which make a stated
action or activity depend on a variable trigger, such as “as necessary” or “from
time to time”). We have designed detailed annotation guidelines
that are being further tested with a new data set of policies.
      </p>
      <p>
        (1) Comprehensiveness of Information. A clause satisfies the
criteria if the privacy policy includes sufficient information on the 23
categories defined in the annotation guidelines. These include &lt;id&gt; for the
identity of the data controller, &lt;cat&gt; for the categories of personal data
concerned, and &lt;ret&gt; for the period for which the personal data will be
stored. Here, ‘sufficiency’ means that a clause is fully informative, i.e. that
it includes all the details required by the regulation (e.g.
&lt;id1&gt;). Everything that does not satisfy the given criteria, as
specified in the guidelines, has been marked as sub-optimal (e.g. &lt;id2&gt;).
We use the numerical values of 1 and 2 in the XML tags to refer to
the level of comprehensiveness of the information given. The earlier
version of the methodology distinguished 12 relevant categories.
The number of categories was increased to 23 to provide a more
fine-grained annotation of functions. The improvements over the
previous annotation guidelines [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] consist of the further
specification of the different functions of the rights granted to consumers,
and the steps needed to exercise them. In particular, the clauses
implementing the duty to inform the data subject about their rights,
under Articles 13(2)(b) and 14(2)(c) of the GDPR, which initially fell under
a single category of required information [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and were identified with
the &lt;correct&gt; tag, have been split into multiple categories.
The reason for further differentiating between such categories is
twofold. Firstly, from the legal point of view, the right to request
access to, and rectification or erasure of, personal data or restriction
of processing and to object to processing, as well as the right to data
portability, are conceptually distinct and independent. Secondly, in
analysing the privacy policies, we noted that the different rights
and the steps needed to exercise these rights are usually addressed in
separate clauses. Thus, we chose single phrases as the units for our
tagging method. Indeed, where a clause covers multiple sentences,
we tag each sentence separately, treating statements
independently from one another. Hence, the clauses
containing information about the rights are now also classified separately from
those outlining the steps needed to exercise these rights. Consider,
for instance, the following example:
      </p>
      <p>You can request access to your personal
information, or correct or update out-of-date
or inaccurate personal information we hold
about you. You can most easily do this by
visiting the "Account" portion of our
website, where you have the ability to access
and update a broad range of information
about your account, including your contact
information, your Netflix payment
information, and various related information about
your account (such as the content you have
viewed and rated, and your reviews).</p>
      <p>Under the previous version of the tagging guidelines, the two
clauses, considered separately, were not deemed exhaustive with
regard to the initial &lt;correct&gt; category and were marked as
insufficiently informative (for instance, the first clause fails to inform the
data subject about the existence of the right to object to processing,
as well as about the right to data portability). In the example below,
we illustrate how we now further distinguish &lt;acc&gt; for the right
to request access to personal data from the data controller, &lt;corr&gt;
for the right to request the rectification of personal data, &lt;cat&gt; for the
categories of personal data concerned, and &lt;sacc&gt; for the steps needed
to exercise the right to access personal data.</p>
      <p>[Current version]&lt;acc2&gt;&lt;corr2&gt;&lt;cat2&gt;You can
request access to your personal information,
or correct or update out-of-date or
inaccurate personal information we hold about
you.&lt;/cat2&gt;
&lt;/corr2&gt;&lt;/acc2&gt;
&lt;sacc1&gt;&lt;acc1&gt;&lt;corr1&gt;You can most easily do
this by visiting the "Account" portion of
our website, where you have the ability to
access and update a broad range of
information about your account, including your
contact information, your Netflix payment
information, and various related
information about your account (such as the
content you have viewed and rated, and your
reviews).&lt;/corr1&gt;&lt;/acc1&gt;&lt;/sacc1&gt;</p>
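      <p>The tagging scheme illustrated above can be read mechanically. The following Python sketch is only an illustration and not the tooling used in this study: it assumes that every annotated sentence is wrapped in properly nested tags whose names consist of a lowercase category code followed by an optional level digit, and the function name parse_annotation is hypothetical.</p>
      <preformat><![CDATA[
```python
import re

# Assumption: a tag name is a lowercase category code followed by an
# optional level digit, e.g. "acc2" -> ("acc", 2) or "vag" -> ("vag", None).
TAG_RE = re.compile(r"<(/?)([a-z]+)(\d?)>")


def parse_annotation(annotated):
    """Return (plain_text, labels) for one annotated sentence.

    labels is a list of (category, level) pairs in opening order.
    Raises ValueError if the tags are not properly nested.
    """
    labels = []
    stack = []
    for closing, category, level in TAG_RE.findall(annotated):
        name = category + level
        if closing:
            if not stack or stack.pop() != name:
                raise ValueError("unbalanced tag: " + name)
        else:
            stack.append(name)
            labels.append((category, int(level) if level else None))
    if stack:
        raise ValueError("unclosed tags: " + ", ".join(stack))
    # Strip the tags and normalise whitespace to recover the plain clause.
    plain_text = " ".join(TAG_RE.sub("", annotated).split())
    return plain_text, labels


example = (
    "<acc2><corr2><cat2>You can request access to your personal "
    "information, or correct or update out-of-date or inaccurate "
    "personal information we hold about you.</cat2></corr2></acc2>"
)
text, labels = parse_annotation(example)
print(labels)
```
]]></preformat>
      <p>Keeping the category and the level apart makes it straightforward to count, for each category, how many sentences are fully informative (level 1) and how many are sub-optimal (level 2).</p>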
      <p>The 23-category guidelines for comprehensiveness of
information are currently being tested against the hypothesis that the added
categories will enhance the precision of the answers given to
consumers.</p>
      <p>(2) Substantive Compliance. In the dimension of substantive compliance,
we distinguish 11 categories of clauses pertaining to the types of
processing. A clause is considered fair if the defined data processing
practices are permitted by, and thus compliant with, the GDPR
(Arts. 5, 6, and 9). We assumed that each clause can be classified
as a fair processing (&lt;tag1&gt;), problematic processing (&lt;tag2&gt;),
or unfair processing (&lt;tag3&gt;) clause. We used the numerical values of
1, 2, and 3 for each XML tag to indicate the level of fairness. In this
dimension, the two levels of sub-optimal achievement of the Gold
Standard distinguish between problematic clauses, where it may be
reasonably doubted that the clause meets the GDPR requirements,
and unfair clauses, where the data processing clearly fails to meet
the GDPR requirements, i.e. the data processing defined in the
policy document is forbidden by the regulation.</p>
      <p>
        We identified 11 categories of clauses based on how issues
pertaining to such categories might affect individual rights. For
instance, the unfair processing of sensitive data (&lt;sens&gt;), or the
unauthorised transfer of data to third parties (&lt;tp&gt;), can have negative
consequences for the consumer. Other categories pertain to the
consent-by-using practice, the take-it-or-leave-it approach, policy
changes and whether there has been a fair warning, cross-border
data transfer, consent for processing children’s data, licensing data,
advertising, and any other types of consent, as well as one category for
tracking any other types of problematic clauses.
      </p>
      <p>
        (3) Clarity of Expression. Art. 12 specifies that a privacy policy
should be framed “in a concise, transparent, intelligible and
easily accessible form, using clear and plain language”. To integrate
this requirement into the assessment criteria, four indicators of
vagueness (categories of linguistic expressions possibly generating
indeterminacy, depending on the context) were defined [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]: (1)
indeterminate conditioners, creating a dependency of a stated
action or activity on a variable trigger, such as “as necessary”, “from
time to time”, etc.; (2) expression generalisations, abstracting
actions and activities under unclear conditions and contexts, such as
“generally”, “normally”, “largely”, “often”, etc.; (3) modality,
including adverbs and non-specific adjectives, which create uncertainty
with respect to the possibility of certain actions and events; and
(4) non-specific numeric quantifiers, creating ambiguity as to the
actual measure of a certain action or activity, such as “numerous”,
“some”, “most”, “many”, “including (but not limited to)”, etc. Note
that a single clause may fall into different categories, in different
dimensions, and consequently may have multiple tags. For example,
if the clause allows for a problematic processing of sensitive data
and includes vague terms, it is marked as:
      </p>
      <p>&lt;sens&gt;&lt;vag&gt;The sentence.&lt;/vag&gt;&lt;/sens&gt;
</p>
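      <p>As a rough illustration of how the four vagueness indicators might be operationalised, the following Python sketch flags candidate expressions by simple lexicon lookup. It is not the annotation procedure used here: the lexicon contains only the example expressions quoted above, the modality entries ("may", "might", "possibly") are an assumption made for this sketch because no examples are listed for that indicator, and the names INDICATORS and find_vague_terms are hypothetical. Since the indicators generate indeterminacy only depending on the context, matches should be read as candidates for human review rather than as final &lt;vag&gt; labels.</p>
      <preformat><![CDATA[
```python
import re

# Example expressions quoted in the text; the modality entries are an
# assumption made for this sketch, since the text lists none.
INDICATORS = {
    "conditional": ["as necessary", "from time to time"],
    "generalisation": ["generally", "normally", "largely", "often"],
    "modality": ["may", "might", "possibly"],
    "quantifier": ["numerous", "some", "most", "many",
                   "including (but not limited to)"],
}


def find_vague_terms(sentence):
    """Return a sorted list of (indicator, expression) pairs found in sentence."""
    lowered = sentence.lower()
    hits = []
    for indicator, expressions in INDICATORS.items():
        for expression in expressions:
            # Match whole words or phrases only, so that "some" does not
            # fire on "something".
            pattern = r"(?<!\w)" + re.escape(expression) + r"(?!\w)"
            if re.search(pattern, lowered):
                hits.append((indicator, expression))
    return sorted(hits)


print(find_vague_terms(
    "We may share some of your data with partners from time to time."
))
```
]]></preformat>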
    </sec>
    <sec id="sec-5">
      <title>3.2 A Preliminary Corpus</title>
      <p>
        In the privacy policy assessment, we worked with a corpus of 32
policies, manually tagged by two independent annotators. Privacy
policies were selected on the basis of the number of users and
the platform’s global relevance, as well as taking into account our
previous work [
        <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
        ] analysing Terms of Service for the same
online services. We used XML mark-up for the annotations.
      </p>
      <p>The data set contains 6,275 sentences. As we observed above,
the sentences were tagged according to 35 categories (23 under the
comprehensiveness of information dimension, 11 under substantive
compliance, and 1 under clarity of expression). In the remainder of
the paper we will only mention some of these categories and we
will report on experiments concerning three categories (&lt;purp&gt;,
&lt;ad&gt;, and &lt;vag&gt;): one for each dimension of the Gold Standard
defined in Section 3.1, namely &lt;purp&gt; for the comprehensiveness of
information, &lt;ad&gt; for substantive compliance, and &lt;vag&gt; for
unclear language. The corpus contains 773 sentences tagged with
&lt;purp&gt;, out of which 281 and 492 sentences refer to cases of
sufficient (&lt;purp1&gt;) and partial (&lt;purp2&gt;) information, respectively.
As for advertising, 91 sentences in the corpus are tagged as
problematic (&lt;ad2&gt;) whereas 95 are tagged as unfair (&lt;ad3&gt;). Finally,
714 sentences are tagged as unclear (&lt;vag&gt;).</p>
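      <p>The figures above imply a marked class imbalance, which matters for any classifier trained on this corpus. The short Python computation below only restates the reported counts as shares of the 6,275 sentences; the variable and function names are illustrative.</p>
      <preformat><![CDATA[
```python
# Counts as reported in the text for the preliminary corpus.
TOTAL_SENTENCES = 6275
counts = {
    "purp1": 281,  # sufficient information on purposes
    "purp2": 492,  # partial information on purposes
    "ad2": 91,     # problematic advertising clauses
    "ad3": 95,     # unfair advertising clauses
    "vag": 714,    # unclear language
}

# The two purpose levels should add up to the reported 773 sentences.
assert counts["purp1"] + counts["purp2"] == 773


def share(tag):
    """Percentage of corpus sentences carrying the given tag."""
    return 100.0 * counts[tag] / TOTAL_SENTENCES


for tag in counts:
    print(f"{tag}: {counts[tag]} sentences, {share(tag):.1f}% of the corpus")
```
]]></preformat>
      <p>Each of these tags is carried by fewer than 12% of the sentences, so a sentence classifier would see far more negative than positive examples for every category.</p>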
      <p>We hereby remark that, in this paper, we are presenting a
preliminary version of the corpus for which the tagging guidelines
directed to annotators have been revised multiple times. We plan
to make these guidelines stable and publicly available in the near
future, once the corpus is finalised. At that stage, we also intend
to measure the inter-annotator agreement in order to assess the
quality of the deployed data set.
      </p>
    </sec>
    <sec id="sec-6">
      <title>4 CHALLENGES</title>
      <p>In this section, we describe the challenges that we envision when
aiming to develop an automatic system for the assessment of
compliance of privacy policies according to the GDPR. All examples
have been extracted from the Airbnb Privacy Policy document, last
updated 16 April 2018.
      </p>
    </sec>
    <sec id="sec-7">
      <title>4.1 Context</title>
      <p>One of the earliest challenges encountered in the automated
detection of problematic clauses in privacy policies is that
examining single sentences is insufficient to determine
their defectiveness within the three dimensions; for
this purpose, several sentences need to be linked. By contrast, our
previous experiments showed that the analysis of single sentences
is adequate to identify unlawful or unfair clauses in Terms of Service.
For instance, consider the following example taken from the Airbnb
privacy policy.</p>
      <p>[Line 80] 2.2 Create and Maintain a Trusted
and Safer Environment. Detect and prevent
fraud, spam, abuse, security incidents, and
other harmful activity.</p>
      <p>Conduct security investigations and risk
assessments.</p>
      <p>Verify or authenticate information or
identifications provided by you (such as to
verify your Accommodation address or compare
your identification photo to another photo
you provide).</p>
      <p>Conduct checks against databases and other
information sources, including background
or police checks, to the extent permitted
by applicable laws and with your consent
where required.</p>
      <p>Comply with our legal obligations.</p>
      <p>Resolve any disputes with any of our
Members and enforce our agreements with third
parties.</p>
      <p>Enforce our Terms of Service and other
policies.</p>
      <p>In connection with the activities above, we
may conduct profiling based on your
interactions with the Airbnb Platform, your
profile information and other content you
submit to the Airbnb Platform, and information
obtained from third parties. In limited
cases, automated processes may restrict or
suspend access to the Airbnb Platform if
such processes detect a Member or activity
that we think poses a safety or other risk
to the Airbnb Platform, other Members, or
third parties.</p>
      <p>We process this information given our
legitimate interest in protecting the Airbnb
Platform, to measure the adequate
performance of our contract with you, and to
comply with applicable laws.</p>
      <p>As can be seen, the last sentence, taken separately, fails to
specify the legitimate interest at stake: the specification provided
there, “protecting the Airbnb Platform, to measure the adequate
performance of our contract with you, and to comply with
applicable laws”, is very generic. However, the sentence offers
an adequate specification when it is read in conjunction with the
preceding list. This means that, to identify the
defectiveness of a clause, the detector should evaluate the whole section, rather
than the individual sentences.</p>
    </sec>
    <sec id="sec-8">
      <title>4.2 Omission of Information</title>
      <p>
        In our previous work [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] on Terms of Service, we used machine
learning and natural language processing techniques for the
detection of (potentially) unfair clauses. In the context of privacy policies
we have different goals, which are defined in the Gold Standard
guidelines (see Section 3.1). In particular, our purpose lies not only
in detecting unfairness and unclear language,<sup>3</sup> but also in
checking whether certain information is present and sufficient in
view of the regulatory framework.
      </p>
      <p>The latter is conceptually a completely different task for two
main reasons: (i) we aim to identify the presence of a sentence,
rather than the fact that its content is not compliant with the law,
and (ii) we need to verify whether some information is sufficient,
or not, with respect to the Gold Standard.</p>
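      <p>The first of these two points, checking for presence, can be illustrated once sentence-level tags are available. The following Python sketch is an illustration under stated assumptions, not the detector envisaged here: the required-category list contains only the three comprehensiveness categories named in Section 3.1 (the full guidelines define 23), a category is treated as optimal for the whole policy as soon as one sentence reaches level 1, and the function name assess_completeness is hypothetical.</p>
      <preformat><![CDATA[
```python
# Assumption: only the three comprehensiveness categories named in the text;
# the full annotation guidelines define 23.
REQUIRED = ["id", "cat", "ret"]


def assess_completeness(sentence_labels):
    """Classify each required category as 'optimal', 'sub-optimal' or 'missing'.

    sentence_labels: iterable of (category, level) pairs collected from a
    whole policy, where level 1 is fully informative and 2 is sub-optimal.
    Assumption: a category counts as optimal if at least one sentence
    reaches level 1.
    """
    best = {}
    for category, level in sentence_labels:
        if category in REQUIRED:
            best[category] = min(level, best.get(category, level))
    report = {}
    for category in REQUIRED:
        if category not in best:
            report[category] = "missing"
        elif best[category] == 1:
            report[category] = "optimal"
        else:
            report[category] = "sub-optimal"
    return report


policy_labels = [("id", 1), ("ret", 2), ("ret", 2), ("purp", 1)]
print(assess_completeness(policy_labels))
```
]]></preformat>
      <p>The "missing" outcome is the one that a sentence classifier alone cannot produce, since it is defined by the absence of any matching sentence in the document.</p>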
      <p>
        In the case of Terms of Service, classic NLP approaches, such as
statistical classifiers or neural networks, worked quite well, since
the detection of unfair clauses can be easily framed as a sentence
classification problem, where (potential) unfairness is clearly
defined and statistics collected from a wide corpus can be sufficient
to identify target clauses. In contrast, in the privacy policy analysis
our goal is not pure detection of content, since it also involves the
capability to spot missing, hidden, or insufficient information.
For humans, this problem is typically addressed with a number of
reasoning steps. Therefore, we argue that more sophisticated
artificial intelligence approaches are needed, for example coming from
the neural-symbolic community [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], or from the neural
architectures that have been specifically developed to deal with reasoning
tasks [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Another path for development could be explored by
adding contextual information to the classifier. For instance, when
classifying a single sentence, taking into account also the
information regarding surrounding sentences, or even the whole document,
could in fact provide crucial information for a correct classification
of the clause.
      </p>
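      <p>One simple way to add such contextual information is to present each sentence to the classifier together with its neighbours. The Python sketch below shows only this input construction, assuming a fixed symmetric window; the function name with_context, the window size, and the separator string are illustrative choices and do not describe an implemented system.</p>
      <preformat><![CDATA[
```python
def with_context(sentences, window=1, sep=" [SEP] "):
    """Pair each sentence with up to `window` neighbours on either side.

    Returns one string per sentence, of the form
    "left [SEP] target [SEP] right"; the left or right part is empty
    at the document boundaries.
    """
    inputs = []
    for i, target in enumerate(sentences):
        left = " ".join(sentences[max(0, i - window):i])
        right = " ".join(sentences[i + 1:i + 1 + window])
        inputs.append(left + sep + target + sep + right)
    return inputs


# Sentences abbreviated from the Airbnb example in Section 4.1.
section = [
    "Comply with our legal obligations.",
    "Enforce our Terms of Service and other policies.",
    "We process this information given our legitimate interest.",
]
for item in with_context(section):
    print(item)
```
]]></preformat>
      <p>A larger window, or the whole section, trades longer inputs for the chance that a generic sentence is read together with the list that specifies it.</p>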
      <p>As an example of the complexity of such a task, we report
some clauses related to the purpose of processing (&lt;purp&gt;) within
the comprehensiveness dimension. Following the GDPR, the data
controller is required to provide clear information on the purposes
as to why data are collected and how such data will be used. These
processes should be transparent and within the limits prescribed in
Articles 13(1)(c) and 14(1)(c). To assess whether the privacy policy
is compliant in this regard, we distinguish between optimal (fully
informative) and sub-optimal (missing some information) clauses.<fn><label>3</label><p>The detection of unclear language is also per se a slightly different task, as it moves the attention towards a purely linguistic perspective.</p></fn></p>
      <p>For example, the following clause satisfies the criteria, since it
provides an exhaustive list of the purposes for data processing.</p>
      <p>&lt;purp1&gt;If you are a Host, the Payments Data
Controller may require identity
verification information (such as images of your
government issued ID, passport, national ID
card, or driving license) or other
authentication information, your date of birth, your
address, email address, phone number and other
information in order to verify your
identity, provide the Payment Services to you,
and to comply with applicable law.&lt;/purp1&gt;</p>
      <p>In contrast, clauses that use vague language and only give general
examples are considered problematic, since they can be interpreted
to justify the use of personal data beyond what the consumer might
have intended when consenting to the policy. This raises concerns
around informed consent. Consider, for instance, the following
example from the Airbnb Privacy Policy.</p>
      <p>&lt;purp2&gt;We may use your personal data to
develop new services&lt;/purp2&gt;</p>
    </sec>
    <sec id="sec-9">
      <title>4.3 Multilingualism</title>
      <p>Considering that the GDPR governs data processing in all European
Union states, it is important to take into account the EU’s 24 official
languages. Linguistic diversity and equal legal status between the
different European languages are among the core values in access
to justice in the EU. Therefore, when offering any solution aimed at
informing and protecting consumers, researchers should also design
its methodology to preserve the original functions and accuracy
across these many different languages. This task is particularly
relevant for NGOs and consumer organisations, which very often
struggle with the diversity of language and the comparison of
different versions of the same documents.</p>
      <p>In our project, we have chosen English as the base language, and
have started experimenting with transfer of tags from annotated
documents in English to privacy policies in German. This process
involves the use of three types of documents: (1) the original,
annotated text in English, (2) the original text in German, and (3) the
automatic translation of the original English text into German.</p>
      <p>Consider, for instance, the following examples of original,
annotated clauses in English. The first clause pertains to the period
for which the personal data will be stored. It has been marked as
&lt;ret2&gt;, i.e. insufficiently informative, since it does not clearly
define the retention period of the personal data. The second clause
pertains to both the data retention and the categories of data
collected. It has been marked as insufficient since the retention period
and the categories of personal data are not defined, as indicated by
the expressions ‘reasonable measures’ and ‘when it is no longer
required’.</p>
      <p>[ENGLISH] &lt;ret2&gt;We may retain information
as required or permitted by applicable laws
and regulations, including to honor your
choices, for our billing or records
purposes and to fulfill the purposes described
in this Privacy Statement.&lt;/ret2&gt;
&lt;ret2&gt;&lt;cat2&gt;We take reasonable measures to
destroy or de-identify personal information
in a secure manner when it is no longer
required.&lt;/cat2&gt;&lt;/ret2&gt;</p>
      <p>Let us now consider the corresponding clauses in German as
translated and marked.</p>
      <p>[GERMAN] &lt;ret2&gt;Wir können Informationen, wie
gemäß geltenden Gesetzen und Bestimmungen
erforderlich oder zugelassen, einschließlich
unter Einbeziehung ihrer Auswahl, zu zwecken
der Rechnungstellung oder Buchführung und
um den zwecken dieser Datenschutz Erklärung
nachzukommen, speichern.&lt;/ret2&gt;
&lt;cat2&gt;&lt;ret2&gt;Wir ergreifen angemessene
Maßnahmen, um personenbezogene Daten auf eine
sichere Weise zu zerstören oder unkenntlich
zu machen, wenn diese nicht länger
erforderlich sind.&lt;/ret2&gt;&lt;/cat2&gt;</p>
      <p>In this test case, the machine-translated reference file was
generated accurately and the tags were successfully
transferred, given that the English and German language versions showed
no discrepancies in the clauses used.</p>
      <p>
        Clearly, transferring tags would pose major challenges
in cases where the English text differs from the text
in the target language, not only in terms of syntax, but also regarding
the legal obligations that might be unique to a certain jurisdiction.
Moreover, English is by far the most widely studied language in
natural language processing, so the existing resources in other
languages are often not as accurate or rich as those developed
for English. Nevertheless, a lot of effort in artificial intelligence
is currently being dedicated to tools and platforms dealing with
multilingualism (e.g., see [
        <xref ref-type="bibr" rid="ref15 ref2">2, 15</xref>
        ] and references therein).
      </p>
    </sec>
    <sec id="sec-10">
      <title>5 EXPERIMENTS</title>
      <p>In this section we present some preliminary results, based on the
data set of 32 annotated privacy policies, as described in Section 3.2.
We focus on the task of sentence detection only, leaving to future
work the challenges related to multilingualism.</p>
      <p>
        In particular, in our experimental evaluation we used SVMHMM,
a machine learning approach that combines Support Vector
Machines (SVM) and Hidden Markov Models (HMM) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], which makes it possible
to classify all the sentences in a document collectively, thus
taking into account the order of the examples. We started with
a very basic set of features, namely the bag-of-words (unigrams
and bigrams) describing each sentence, leaving to future research a
deeper investigation of richer feature sets, possibly exploiting deep
learning to learn sentence representations directly.
      </p>
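      <p>The bag-of-words representation mentioned above can be sketched as follows. This is only a minimal illustration in plain Python, not the feature extraction code used in the experiments: the tokenizer (lowercasing and keeping runs of ASCII letters and digits) and the helper names tokenize and bag_of_ngrams are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
import re
from collections import Counter


def tokenize(sentence):
    """Assumed tokenizer: lowercase, keep runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def bag_of_ngrams(sentence):
    """Count the unigrams and bigrams occurring in one sentence."""
    tokens = tokenize(sentence)
    features = Counter(tokens)  # unigram counts
    # Bigrams are adjacent token pairs, stored as space-joined strings.
    features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


# Hypothetical usage on one of the example clauses quoted earlier.
sentence = "We take reasonable measures to destroy or de-identify personal information."
features = bag_of_ngrams(sentence)
print(features["reasonable"])
print(features["reasonable measures"])
```
]]></preformat>
      <p>Each sentence is thus mapped to a sparse vector of counts. In a full pipeline these counts would be indexed against a vocabulary built on the training documents only.</p>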
      <p>In all the experiments we used the leave-one-document-out
(LOO) procedure, where each document is used, in turn, as the
test set, and all the remaining documents are merged into the training set. We
consider the following performance measures: (i) precision P, that
is, the fraction of sentences predicted as positive that are actually
positive; (ii) recall R, that is, the fraction of positive sentences that
are correctly detected; (iii) F-measure F1, that is, the harmonic mean
of P and R. For each measure, we report the macro-average,
that is, the average of the values obtained for each
single document.</p>
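      <p>The evaluation protocol above can be sketched as follows, assuming binary labels (1 for the positive class, 0 otherwise). The function names, the toy labels, and the convention of scoring a measure as 0.0 when its denominator is zero are assumptions made for this illustration. They are not taken from the experimental code.</p>
      <preformat><![CDATA[
```python
from statistics import mean


def precision_recall_f1(gold, pred):
    """Return (P, R, F1) for binary labels, 1 = positive, 0 = negative."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)
    # Assumed convention: a measure with a zero denominator is scored as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def leave_one_document_out(doc_ids):
    """Yield (training documents, held-out document) for every document in turn."""
    for held_out in doc_ids:
        yield [d for d in doc_ids if d != held_out], held_out


def macro_average(per_document_scores):
    """Average each measure over the per-document (P, R, F1) triples."""
    return tuple(mean(column) for column in zip(*per_document_scores))


# Toy gold labels and predictions for two hypothetical documents.
gold = {"doc_a": [1, 0, 0, 1], "doc_b": [0, 1, 0]}
pred = {"doc_a": [1, 1, 0, 0], "doc_b": [0, 1, 0]}

scores = []
for train_docs, test_doc in leave_one_document_out(list(gold)):
    # A real run would train a classifier on train_docs here; the toy
    # predictions above stand in for its output on the held-out document.
    scores.append(precision_recall_f1(gold[test_doc], pred[test_doc]))

print(macro_average(scores))  # (0.75, 0.75, 0.75)
```
]]></preformat>
      <p>Note that the macro-averaged F1 is the mean of the per-document F1 values, which in general differs from the harmonic mean of the macro-averaged P and R.</p>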
      <p>We consider the tasks of detecting the clauses concerning the
purpose of processing (thus considering the union of &lt;purp1&gt; and
&lt;purp2&gt; as the positive class), the problematic or unfair clauses related
to advertising (with the union of &lt;ad2&gt; and &lt;ad3&gt; as the positive
class), and finally those that contain unclear language (the &lt;vag&gt;
tag only). Results are reported in Table 1. To highlight the difficulty
of the task, we compare the results achieved by SVMHMM against
two trivial baselines: a random classifier, which predicts the positive
class according to the class distribution, and a second system that
always predicts the positive class. SVMHMM achieves an
F1 of 0.552 for the detection of clauses regarding the purpose
of processing (against 0.126 and 0.221 for the two baselines,
respectively) and 0.421 for advertising (against 0.034 and 0.061 for the
two baselines, respectively). A similar trend is shown for unclear
language, for which SVMHMM achieves an F1 of 0.460. The very low values
of the baselines, as well as the confusion matrices reported in
Table 2, clearly show the large imbalance between the positive and
negative classes: for example, only 3% of sentences are annotated
as either &lt;ad2&gt; or &lt;ad3&gt;. This imbalance makes all the considered
tasks particularly challenging. The F1 values obtained, in
the range 0.42–0.55, can therefore be considered encouraging.</p>
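      <p>To see why class imbalance pushes the trivial baselines so low, the following sketch approximates their F1 from the share of positive sentences alone. It assumes a single pooled set of sentences and a random classifier that predicts the positive class independently with probability equal to that share. The reported baseline values are macro-averages over documents, so this approximation is only expected to land near them, not to reproduce them. The function names are hypothetical.</p>
      <preformat><![CDATA[
```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def always_positive_f1(positive_rate):
    """Every sentence is predicted positive: recall is 1, precision is the positive rate."""
    return f1_score(positive_rate, 1.0)


def random_baseline_f1(positive_rate):
    """Approximate F1 of a classifier predicting positive with probability positive_rate.

    In expectation both precision and recall equal the positive rate,
    so the approximation reduces to the positive rate itself.
    """
    return f1_score(positive_rate, positive_rate)


positive_rate = 0.03  # share of ad2/ad3 sentences quoted in the text
print(round(random_baseline_f1(positive_rate), 3))
print(round(always_positive_f1(positive_rate), 3))
```
]]></preformat>
      <p>With a 3% positive rate the approximation gives about 0.03 for the random classifier and about 0.058 for the always-positive system, close to the 0.034 and 0.061 reported for advertising.</p>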
      <p>We also note that the results are very
heterogeneous across different documents. For example, for the &lt;ad&gt;
tag, the SVMHMM approach achieves F1 equal to 0.86 and 0.89 for the
Dropbox and Couchsurfing policies, respectively, whereas
the CrowdTangle policy is predicted perfectly, with three
positive clauses correctly detected and no false positives. We plan to
analyse and discuss these more fine-grained results in depth
once our final corpus is released.</p>
    </sec>
    <sec id="sec-11">
      <title>6 DISCUSSION AND FUTURE WORK</title>
      <p>Considering the number of independent research projects working
in this area, identifying the current problems can help
establish common ground for fruitful discussion of future work. In
this paper, we have presented work in progress on a methodology
(the Gold Standard) for annotating post-GDPR privacy policies in order to
identify and assess their compliance with the regulation. We have
identified three challenges that should be addressed to make progress
in assessing privacy policies with NLP and ML tools. While
we have made some progress in each of the identified areas, a lot of
work remains to reach the overall objectives of the project.</p>
      <p>The first challenge concerns the fact that privacy policies are
written in language that tends to be open to broad
interpretation, and it is not uncommon to define the meaning of
certain terms early in the document and then use such terms without
direct references back to the original definitions. Such references
can be both internal and external, which makes it harder to
comprehend the consumer’s rights and duties under the
signed agreement. Since our project aims at providing consumers
with a tool that facilitates a better understanding of
privacy policies, it is essential that the automated evaluation of
clauses is able to build context for such an understanding.</p>
      <p>The second challenge concerns the omission of information.
Detecting it requires both knowledge of what information should
be included in the document and a way to identify the absence of
the required information. Such a task requires exploring methods
beyond pure text mining approaches.</p>
      <p>Lastly, we looked at the need for an approach that can
take the results achieved with privacy policies in
English and transfer the annotations to different language versions
without losing accuracy and efficiency.</p>
      <p>In sum, with ever more scientific research going open-access, the
need for clear and transparent annotation guidelines and shared
corpora is increasingly pressing. As part of our future work, we
aim to publish the annotated privacy policy corpora online, as we
have done with the Terms of Service agreements. Future work also
includes moving beyond pure language processing and introducing
a level of reasoning that allows context comprehension by machines.
We maintain our overall objective to design a methodology and
provide a tool for consumers and NGOs that would empower them
through more informed decision making in the digital environment.</p>
    </sec>
    <sec id="sec-12">
      <title>7 ACKNOWLEDGEMENTS</title>
      <p>We would like to thank all the members of the Project Claudette
and our funding authorities at the European University Institute
Research Council, Bureau Européen des Unions de Consommateurs,
and the Zeppelin Universität.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Lisa M</given-names>
            <surname>Austin</surname>
          </string-name>
          , David Lie, Peter Yi Ping Sun, Robin Spillette, Michelle Wong, and
          <string-name>
            <given-names>Mariana</given-names>
            <surname>D'Angelo</surname>
          </string-name>
          .
          <article-title>Towards dynamic transparency: The apptrans (transparency for android applications) project</article-title>
          . http://dx.doi.org/10.2139/ssrn.3203601,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Dzmitry</given-names>
            <surname>Bahdanau</surname>
          </string-name>
          , Kyunghyun Cho, and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <article-title>Neural machine translation by jointly learning to align and translate</article-title>
          .
          <source>arXiv preprint arXiv:1409.0473</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Bakos</surname>
          </string-name>
          , Florencia Marotta-Wurgler, and David R Trossen.
          <article-title>Does anyone read the fine print? consumer attention to standard-form contracts</article-title>
          .
          <source>The Journal of Legal Studies</source>
          ,
          <volume>43</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>35</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Cesare</given-names>
            <surname>Bartolini</surname>
          </string-name>
          , Gabriele Lenzini, and
          <string-name>
            <given-names>Cristiana</given-names>
            <surname>Santos</surname>
          </string-name>
          .
          <article-title>A legal validation of a formal representation of gdpr articles</article-title>
          .
          <source>In CEUR Workshop Proceedings</source>
          , http://ceur-ws.org/Vol-2309/10.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Shmuel I</given-names>
            <surname>Becher</surname>
          </string-name>
          and
          <string-name>
            <given-names>Uri</given-names>
            <surname>Benoliel</surname>
          </string-name>
          .
          <article-title>Law in books and law in action: The readability of privacy policies and the gdpr</article-title>
          .
          <source>Consumer Law &amp; Economics</source>
          , Klaus Mathis &amp; Avishalom Tor, eds., Springer (forthcoming),
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Giuseppe</given-names>
            <surname>Contissa</surname>
          </string-name>
          , Koen Docter, Francesca Lagioia, Marco Lippi, Hans-W Micklitz, Przemysław Pałka, Giovanni Sartor, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Torroni</surname>
          </string-name>
          .
          <article-title>Claudette meets gdpr: Automating the evaluation of privacy policies using artificial intelligence</article-title>
          . https://ssrn.com/abstract=3208596,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Elisa</given-names>
            <surname>Costante</surname>
          </string-name>
          , Yuanhao Sun, Milan Petković, and Jerry den Hartog.
          <article-title>A machine learning solution to assess privacy policy completeness: (short paper)</article-title>
          .
          <source>In Proceedings of the 2012 ACM workshop on Privacy in the electronic society</source>
          , pages
          <fpage>91</fpage>
          -
          <lpage>96</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Mauro</given-names>
            <surname>Dragoni</surname>
          </string-name>
          , Serena Villata,
          <string-name>
            <given-names>Williams</given-names>
            <surname>Rizzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Guido</given-names>
            <surname>Governatori</surname>
          </string-name>
          .
          <article-title>Combining nlp approaches for rule extraction from legal documents</article-title>
          .
          <source>In 1st Workshop on MIning and REasoning with Legal texts (MIREL 2016)</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Artur</given-names>
            <surname>d'Avila Garcez</surname>
          </string-name>
          , Tarek R Besold, Luc De Raedt, Peter Földiak, Pascal Hitzler, Thomas Icard,
          <string-name>
            <given-names>Kai-Uwe</given-names>
            <surname>Kühnberger</surname>
          </string-name>
          , Luis C Lamb,
          <string-name>
            <given-names>Risto</given-names>
            <surname>Miikkulainen</surname>
          </string-name>
          , and Daniel L Silver.
          <article-title>Neural-symbolic learning and reasoning: contributions and challenges</article-title>
          .
          <source>In 2015 AAAI Spring Symposium Series</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Mustafa</given-names>
            <surname>Hashmi</surname>
          </string-name>
          .
          <article-title>A methodology for extracting legal norms from regulatory documents</article-title>
          .
          <source>In 2015 IEEE 19th International Enterprise Distributed Object Computing Workshop</source>
          , pages
          <fpage>41</fpage>
          -
          <lpage>50</lpage>
          . IEEE,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Herbert</given-names>
            <surname>Jaeger</surname>
          </string-name>
          .
          <article-title>Artificial intelligence: Deep neural reasoning</article-title>
          .
          <source>Nature</source>
          ,
          <volume>538</volume>
          (
          <issue>7626</issue>
          ):
          <fpage>467</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Marco</given-names>
            <surname>Lippi</surname>
          </string-name>
          , Przemysław Pałka, Giuseppe Contissa, Francesca Lagioia, Hans-Wolfgang Micklitz, Giovanni Sartor, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Torroni</surname>
          </string-name>
          .
          <article-title>Claudette: an automated detector of potentially unfair clauses in online terms of service</article-title>
          .
          <source>Artificial Intelligence and Law</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Aleecia M</given-names>
            <surname>McDonald</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lorrie Faith</given-names>
            <surname>Cranor</surname>
          </string-name>
          .
          <article-title>The cost of reading privacy policies</article-title>
          .
          <source>ISJLP</source>
          ,
          <volume>4</volume>
          :
          <fpage>543</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>George R</given-names>
            <surname>Milne</surname>
          </string-name>
          , Mary J Culnan, and
          <string-name>
            <given-names>Henry</given-names>
            <surname>Greene</surname>
          </string-name>
          .
          <article-title>A longitudinal assessment of online privacy notice readability</article-title>
          .
          <source>Journal of Public Policy &amp; Marketing</source>
          ,
          <volume>25</volume>
          (
          <issue>2</issue>
          ):
          <fpage>238</fpage>
          -
          <lpage>249</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Navigli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Simone Paolo</given-names>
            <surname>Ponzetto</surname>
          </string-name>
          .
          <article-title>Babelnet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>193</volume>
          :
          <fpage>217</fpage>
          -
          <lpage>250</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Jonathan A</given-names>
            <surname>Obar</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anne</given-names>
            <surname>Oeldorf-Hirsch</surname>
          </string-name>
          .
          <article-title>The biggest lie on the internet: Ignoring the privacy policies and terms of service policies of social networking services</article-title>
          .
          <source>Information, Communication &amp; Society</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Monica</given-names>
            <surname>Palmirani</surname>
          </string-name>
          , Michele Martoni, Arianna Rossi, Cesare Bartolini, and
          <string-name>
            <given-names>Livio</given-names>
            <surname>Robaldo</surname>
          </string-name>
          .
          <article-title>Pronto: Privacy ontology for legal reasoning</article-title>
          .
          <source>In International Conference on Electronic Government and the Information Systems Perspective</source>
          , pages
          <fpage>139</fpage>
          -
          <lpage>152</lpage>
          . Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Joel R</given-names>
            <surname>Reidenberg</surname>
          </string-name>
          , Jaspreet Bhatia,
          <string-name>
            <given-names>Travis D</given-names>
            <surname>Breaux</surname>
          </string-name>
          , and Thomas B Norton.
          <article-title>Ambiguity in privacy policies and the impact of regulation</article-title>
          .
          <source>The Journal of Legal Studies</source>
          ,
          <volume>45</volume>
          (
          <issue>S2</issue>
          ):
          <fpage>S163</fpage>
          -
          <lpage>S190</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Welderufael B</given-names>
            <surname>Tesfay</surname>
          </string-name>
          , Peter Hofmann, Toru Nakamura, Shinsaku Kiyomoto, and
          <string-name>
            <given-names>Jetzabel</given-names>
            <surname>Serna</surname>
          </string-name>
          .
          <article-title>I read but don't agree: Privacy policy benchmarking using machine learning and the eu gdpr</article-title>
          .
          <source>In Companion of the The Web Conference 2018 on The Web Conference 2018</source>
          , pages
          <fpage>163</fpage>
          -
          <lpage>166</lpage>
          . International World Wide Web Conferences Steering Committee,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Ioannis</given-names>
            <surname>Tsochantaridis</surname>
          </string-name>
          , Thomas Hofmann, Thorsten Joachims, and
          <string-name>
            <given-names>Yasemin</given-names>
            <surname>Altun</surname>
          </string-name>
          .
          <article-title>Support vector machine learning for interdependent and structured output spaces</article-title>
          .
          <source>In Proceedings of the twenty-first international conference on Machine learning</source>
          , page
          <fpage>104</fpage>
          . ACM
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>