Explaining potentially unfair clauses to the consumer with the CLAUDETTE tool

Rūta Liepiņa, Faculty of Law, Maastricht University and Law Department, EUI
Federico Ruggeri, DISI, University of Bologna
Francesca Lagioia, CIRSFID, University of Bologna and Law Department, EUI
Marco Lippi, DISMI, University of Modena and Reggio Emilia
Kasper Drazewski, BEUC
Paolo Torroni, DISI, University of Bologna

ABSTRACT
This paper presents the latest developments in the use of memory network models for detecting and explaining unfair terms in online consumer contracts. We extend the CLAUDETTE tool for the detection of potentially unfair clauses in online Terms of Service by providing users with explanations of unfairness (legal rationales) for five different categories: arbitration, unilateral change, content removal, unilateral termination, and limitation of liability.

KEYWORDS
Memory Networks, Terms of Service, NLP

ACM Reference Format:
Rūta Liepiņa, Federico Ruggeri, Francesca Lagioia, Marco Lippi, Kasper Drazewski, and Paolo Torroni. 2020. Explaining potentially unfair clauses to the consumer with the CLAUDETTE tool. In Proceedings of the 2020 Natural Legal Language Processing (NLLP) Workshop, 24 August 2020, San Diego, US. 4 pages.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
NLLP @ KDD 2020, August 24th, San Diego, US

1 INTRODUCTION
Online market practices continuously display power asymmetry towards consumers [12, 23]. Several technical solutions have emerged [5, 17, 18], but the focus has largely been on identifying clauses that might be of interest to consumers, thereby helping the reader navigate very long agreements [16]. However, the lack of context and explanation of such clauses, as well as limited enforcement possibilities, have hindered the desired goals in consumer protection.

While there seems to be agreement that most Terms of Service (ToS) agreements contain clearly or potentially unfair clauses [13, 23], it may be insufficient to know which clauses are unfair without providing context for the consumer [9]. Moreover, for such explanations to eventually lead to effective protection, they must be grounded in the current legal framework in the European Union, i.e. the Unfair Contract Terms Directive 93/13/EEC (the Directive).

In this paper we present one possible solution to increase consumer empowerment through technology based on memory networks. Following earlier studies [8, 11], we have introduced the use of legal rationales as explanations of clause unfairness within the updated CLAUDETTE tool. Section 2 explores the need for explanations and illustrates ways in which such explanations can be automatically generated. Sections 3 and 4 present the extended knowledge base of legal rationales and the methods behind such integration. Section 5 demonstrates the new features of the tool and gives examples of the information provided to consumers when they inquire about the fairness of their contractual terms.

2 THE NEED FOR EXPLANATIONS
The need for explainable results from AI systems has been a prominent topic in the regulatory territory [2] and has provoked the interest of many scholars [1, 3, 7, 9, 15, 19]. Main themes of this research include the interpretability of results produced by AI systems, the transparency of the workings of such systems, and the relationship between explainability and the trust of end-users. In the context of consumer contracts, the lack of clear explanations of user rights in the terms and conditions has resulted in uninformed consent and truth obstruction by the companies [20]. To remedy the information imbalance, we designed CLAUDETTE, a tool for the automatic detection of potentially unfair clauses in contracts [11]. However, further explanations of detected clauses were not available to the users.

One method to integrate domain knowledge in machine learning classifiers that has been explored in the AI community is the end-to-end memory network model [21, 22], which performs classification by exploiting an additional, external memory of knowledge. Within this memory we stored a collection of legal rationales provided by legal experts. In consumer contracts, in fact, unfair clauses are linked with legal rationales. Providing the user with rationales for why a particular clause can be considered unfair is seen as an important development of the tool for the effective empowerment of consumers [9, 10, 14, 15].

3 KNOWLEDGE BASE: LEGAL RATIONALES OF UNFAIRNESS
The original training set for the classification tasks included 100 ToS agreements from the most popular online companies that were double-labelled by legal experts, according to the criteria described in [11]. In addition to comprehensive annotation guidelines based on the Directive, its annex with a list of sample clauses which can be described as unfair, and Court of Justice of the European Union decisions, the project also relied on the individual legal expertise and previous experience of the annotators, e.g., in understanding and applying the relevant legal instruments. Given the legal framework, the project focuses on unfair terms as defined in the European Union. Encoding this expert knowledge in a way that benefits a consumer is a challenging task. In the previous version of the CLAUDETTE tool, users could copy and paste their service agreements into a text-box and the system automatically detected potentially unfair clauses based on nine unfairness categories.¹

Creating a knowledge base for the detected clauses is a slightly different task. At this stage, we have chosen five unfairness categories: limitation of liability (), unilateral change (), unilateral termination (), content removal (), and arbitration (). The knowledge base consists of the rationales and their unique identifiers that are linked to the unfairness categories. In particular, the following distribution of rationales was created based on the information patterns in the online contracts: (18), (17), (28), (8), (8). Note that a single potentially unfair clause can be linked with different explanations.

Consider the following clause taken from the Goodreads ToS and classified as (potentially) unfair under unilateral termination:

    “Goodreads may permanently or temporarily terminate, suspend, or otherwise refuse to permit your access to the Service without notice and liability for any reason, including if in Goodreads’ sole determination you violate any provision of this Agreement, or for no reason.”

It has been associated with the following three rationales:

[any_reason]: since the clause generally states the contract or access may be terminated for any reason, without cause or leaves room for other reasons which are not specified.

[breach]: since the contract or access can be terminated where the user fails to adhere to its terms, or community standards, or the spirit of the ToS or community terms, including inappropriate behaviour, using cheats or other disallowed practices to improve their situation in the service, deriving disallowed profits from the service, or interfering with other users’ enjoyment of the service or otherwise puts them at risk, or is investigated under any suspicion of misconduct.

[no_notice]: since the clause states that the contract or access may be terminated without notice or simply posting it on the website and/or the trader is not required to observe a reasonable period for termination.

Each of the rationales provides an explanation of a different aspect of the given clause. The ‘any reason’ rationale is the most common type of ‘explanation’ and is present in all unfairness categories, albeit in slightly different shapes. Blanket phrases such as ‘any reason’, ‘no reason’ or ‘full discretion’ are unlikely to pass the contractual term fairness test under the Directive. The same applies to the ‘no notice’ rationale, which covers situations where the consumer is expected to regularly check the service’s online pages to update their knowledge about the changing rights and obligations. It can also be argued that a full termination of services based on an alleged breach of contract is unfair under the Directive, especially in the absence of review mechanisms and/or explanations given to the consumers.

For further illustration, consider the clause from the Oculus ToS, which has been detected as (potentially) unfair for the unilateral change category:

    “We may update or revise these warnings and instructions, so please review them periodically.”

Detection of unfairness in this context can be explained by two rationales:

[anyreason]: since the clause states that the provider has the right for unilateral change of the contract/services/goods/features for any reason at its full discretion, at any time

[justposted]: since the clause states that the provider has the right for unilateral change of the contract/services/goods/features where the notification of changes is left at a full discretion of the provider, i.e. by simply posting the new terms on their website, with or without a direct notification to the consumer

Similar to the previous example, this company has used a general statement to claim full discretion in updating their terms and conditions. Additionally, they have limited the notification procedure to only posting the updates online, with no further clarification on whether and how the consumer would be informed. Future work of this project includes investigating how these types of legal rationales are linked to different market sectors.

4 METHOD
The task of unfair clause detection in consumer contracts is formulated as a binary classification problem, in which the model also has access to an external knowledge base containing legal rationales depicting the possible motivations behind a certain type of unfairness. Formally, an architecture coupling a model with an external supporting memory is known as a memory-augmented neural network (MANN) [4, 21, 22]. Such a memory brings two important benefits to the model's representational capabilities: (1) the memory can act as an auxiliary tool to handle complex reasoning such as capturing long-term dependencies; (2) the memory can be employed to inject external domain knowledge directly into the model for different purposes, mainly interpretability, transfer learning and context conditioning. Our approach is centred on the latter advantage and extends the first experimental setup of MANNs for unfairness detection [8] by considering several categories of legal violations.

From a technical point of view, the model takes the clause to classify as input, referred to as the query $q$, and compares it with each element $m_i$ stored in the memory $M$ via a (parametric) similarity operation $s(q, m_i)$. As a result, a set of (normalized) similarity scores $w_i$ is retrieved and used to aggregate memory content into a single summary vector $c = \sum_{i=1}^{|M|} w_i \cdot m_i$. Intuitively, this aggregated result can be thought of as a fuzzy representation of the memory $M$ conditioned on the given input query $q$. Indeed, we are only interested in retrieving memory content that is useful to correctly classify the input clause. Lastly, the retrieved memory content is used to enrich (update) the query in order to ease the classification process. Note that the MANN architecture also allows an iterative interaction with the memory, each time employing the previously updated query, which is suitable for complex reasoning tasks such as reading comprehension [6]. However, the task of unfairness detection allows us to limit ourselves to a single-iteration approach, since it is sufficient to link a single legal rationale to a clause to motivate its unfairness.

5 DEMO
The CLAUDETTE web service built on the aforementioned MANN-based methodology provides an output such as the one depicted in Figure 1.² In particular, the tool offers the user the possibility to enter some text to analyse; the input text is then separated into sentences, and each of them is classified as either unfair or not. In the first case, the system also predicts the unfairness category. For each detected unfair sentence presented in the results web page, CLAUDETTE thus reports the unfairness category and, if any, also the list of legal rationales that were employed by the underlying MANN model during classification, each with a corresponding confidence score. In this way the user is not only informed about the unfairness categories and reasons for unfairness, but is also given an indicator of how relevant these reasons are for the input text.

Figure 1: Example of classification performed by the CLAUDETTE tool. Unfair sentences are highlighted in bold and tagged with the predicted unfairness label, i.e., category. Additionally, if the memory has been used during classification, the list of exploited legal rationales along with the model confidence score (ranging from 0 to 1) is reported.

Another noteworthy benefit of the use of MANN is the improved detection rates, especially for unfairness categories that have proved harder to identify. An example on limitation of liability clauses explored in [8] showed how the memory network improves upon the state-of-the-art support vector machine approach:

Model                   Precision   Recall   F1
State of the art SVM    52.52       81.57    63.7
Memory Network          68.36       84.31    64.33

6 CONCLUSIONS
This paper presents an extension of the automated detection of unfair terms in consumer contracts by adding explanations through memory network models. It directly addresses the call for more explainable and transparent AI results and furthers the goal of empowering consumers by providing legal rationales on why certain clauses have been detected as potentially unfair, as well as showing the confidence scores of such explanations. In the future, we aim to test different variants of the MANN model to improve the capability of the network to exploit the knowledge, as well as to improve the user experience of the current extension.

We also plan to extend the methodology to privacy policies, which are much more complex documents, for which not only potential unfairness should be checked, but also comprehensiveness and compliance with the existing regulations.³

¹ These include (i) jurisdiction, (ii) choice of law, (iii) limitation of liability, (iv) unilateral change, (v) unilateral termination, (vi) arbitration, (vii) contract by using, (viii) content removal, and (ix) privacy included.
² http://claudette.eui.eu/demo/answers/vYgfZetiN2.html
³ One of the authors, Francesca Lagioia, has been supported by project “CompuLaw”, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 833647).

REFERENCES
[1] Or Biran and Courtenay Cotton. Explanation and justification in machine learning: A survey. In IJCAI-17 workshop on explainable AI (XAI), volume 8, 2017.
[2] Finale Doshi-Velez, Mason Kortz, Ryan Budish, Chris Bavitz, Sam Gershman, David O’Brien, Stuart Schieber, James Waldo, David Weinberger, and Alexandra Wood. Accountability of ai under the law: The role of explanation. arXiv preprint arXiv:1711.01134, 2017.
[3] Leilani H Gilpin, David Bau, Ben Z Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. Explaining explanations: An overview of interpretability of machine learning. In 2018 IEEE 5th International Conference on data science and advanced analytics (DSAA), pages 80–89. IEEE, 2018.
[4] Alex Graves, Greg Wayne, and Ivo Danihelka. Neural Turing machines. arXiv preprint arXiv:1410.5401, 2014.
[5] Hamza Harkous, Kassem Fawaz, Rémi Lebret, Florian Schaub, Kang G Shin, and Karl Aberer. Polisis: Automated analysis and presentation of privacy policies using deep learning. In 27th USENIX Security Symposium (USENIX Security 18), pages 531–548, 2018.
[6] Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston. The Goldilocks principle: Reading children’s books with explicit memory representations. arXiv preprint arXiv:1511.02301, 2015.
[7] Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher. Ask me anything: Dynamic memory networks for natural language processing. In International conference on machine learning, pages 1378–1387, 2016.
[8] Francesca Lagioia, Federico Ruggeri, Kasper Drazewski, Marco Lippi, Hans-Wolfgang Micklitz, Paolo Torroni, and Giovanni Sartor. Deep learning for detecting and explaining unfairness in consumer contracts. In Legal Knowledge and Information Systems: JURIX 2019: The Thirty-second Annual Conference, volume 322, page 43. IOS Press, 2019.
[9] Brian Y Lim, Anind K Dey, and Daniel Avrahami. Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2119–2128, 2009.
[10] Marco Lippi, Giuseppe Contissa, Agnieszka Jablonowska, Francesca Lagioia, Hans-Wolfgang Micklitz, Przemyslaw Palka, Giovanni Sartor, and Paolo Torroni. The force awakens: Artificial intelligence for consumer law. J. Artif. Intell. Res., 67:169–190, 2020.
[11] Marco Lippi, Przemysław Pałka, Giuseppe Contissa, Francesca Lagioia, Hans-Wolfgang Micklitz, Giovanni Sartor, and Paolo Torroni. Claudette: an automated detector of potentially unfair clauses in online terms of service. Artificial Intelligence and Law, 27(2):117–139, 2019.
[12] Marco Loos and Joasia Luzak. Wanted: a bigger stick. on unfair terms in consumer contracts with online service providers. Journal of consumer policy, 39(1):63–90, 2016.
[13] Hans-W Micklitz. The proposal on consumer rights and the opportunity for a reform of european unfair terms legislation in consumer contracts. 2010.
[14] Bonnie M Muir. Trust in automation: Part i. theoretical issues in the study of trust and human intervention in automated systems. Ergonomics, 37(11):1905–1922, 1994.
[15] Menaka Narayanan, Emily Chen, Jeffrey He, Been Kim, Sam Gershman, and Finale Doshi-Velez. How do humans understand explanations from machine learning systems? an evaluation of the human-interpretability of explanation. arXiv preprint arXiv:1802.00682, 2018.
[16] Jonathan A Obar and Anne Oeldorf-Hirsch. The biggest lie on the internet: Ignoring the privacy policies and terms of service policies of social networking services. Information, Communication & Society, 23(1):128–147, 2020.
[17] Przemysław Pałka and Marco Lippi. Big data analytics, online terms of service and privacy policies. Research Handbook on Big Data Law edited by Roland Vogl, 2020.
[18] Hugo Roy, JC Borchardt, I McGowan, J Stout, and S Azmayesh. Terms of service; didn’t read. Web Page, June. URL https://tosdr.org, 2012.
[19] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296, 2017.
[20] Edith G Smit, Guda Van Noort, and Hilde AM Voorveld. Understanding online behavioural advertising: User knowledge, privacy concerns and online coping behaviour in europe. Computers in Human Behavior, 32:15–22, 2014.
[21] Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. End-to-end memory networks. In Advances in neural information processing systems, pages 2440–2448, 2015.
[22] Jason Weston, Sumit Chopra, and Antoine Bordes. Memory networks. arXiv preprint arXiv:1410.3916, 2014.
[23] Chris Willett. Fairness in consumer contracts: The case of unfair terms. Ashgate Publishing, Ltd., 2007.
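
ILLUSTRATIVE SKETCH: SINGLE-ITERATION MEMORY LOOKUP
The single-iteration memory lookup described in Section 4 (query, similarity scores, weighted summary vector, query update) can be illustrated with a minimal sketch. This is not the CLAUDETTE implementation. The function names, the toy three-dimensional embeddings, the use of a plain dot product in place of the parametric similarity, the softmax normalization and the additive query update are all assumptions made for illustration. The rationale identifiers are borrowed from the Goodreads example in Section 3.

```python
import math

def dot(u, v):
    """Dot product, used here as a stand-in for the parametric similarity s(q, m_i)."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Normalize raw similarity scores into weights that sum to 1 (assumed normalization)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_lookup(query, memory):
    """Single-iteration lookup: returns the weights w_i, the summary c and the updated query."""
    weights = softmax([dot(query, m) for m in memory])
    # c = sum_i w_i * m_i, computed one coordinate at a time
    summary = [sum(w * m[k] for w, m in zip(weights, memory))
               for k in range(len(query))]
    # Additive update of the query with the retrieved content (an assumption)
    updated = [q + c for q, c in zip(query, summary)]
    return weights, summary, updated

# Hypothetical toy embeddings for three rationales and one clause (the query)
rationales = {
    "any_reason": [1.0, 0.0, 0.0],
    "breach":     [0.0, 1.0, 0.0],
    "no_notice":  [0.0, 0.0, 1.0],
}
query = [2.0, 0.5, 1.0]

weights, summary, updated = memory_lookup(query, list(rationales.values()))
for name, w in zip(rationales, weights):
    print(f"{name}: weight {w:.2f}")
```

In this sketch the weights play the role of the per-rationale confidence scores shown in the demo: the rationale whose embedding is most similar to the clause receives the largest weight, and the updated query would then be passed to the classifier.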