International Workshop on Legal Data Analytics and Mining (LeDAM 2018): Preface to the Proceedings

Arindam Pal∗, Arnab Bhattacharya§, Indrajit Bhattacharya∗, Kripabandhu Ghosh§, Lipika Dey∗, Marie-Francine Moens†, Saptarshi Ghosh‡

∗ Tata Consultancy Services Research, India
§ Indian Institute of Technology Kanpur, India
† Katholieke Universiteit Leuven, Belgium
‡ Indian Institute of Technology Kharagpur, India

1 INTRODUCTION

Legal data mining is the subarea of data mining applied to legal texts, such as legislation, case law, patents, and scholarly works. Legal data mining systems are important for providing easier access to, and insights about, the law for both laypersons and legal professionals. This area is becoming increasingly important because of the rapidly growing volume of legal cases and documents available in digital formats. For this reason, we organized the First International Workshop on Legal Data Analytics and Mining (LeDAM 2018), co-located with ACM CIKM 2018. The website of LeDAM 2018 is https://sites.google.com/site/legaldam2018/. The objectives of the LeDAM 2018 workshop are to: (1) provide a venue for academic and industrial/governmental researchers and professionals to come together, present and discuss research results, use cases, innovative ideas, challenges, and opportunities that arise from applications of data mining in the legal domain, and (2) foster collaborations between the Legal and the Artificial Intelligence, Data Mining, Information Retrieval, and Machine Learning communities.

The workshop program included invited talks by the following reputed researchers (see Section 2 for details):

• Giovanni Sartor, Professor of Legal Informatics and Legal Theory, European University Institute, Italy
• Luigi Di Caro, Assistant Professor, Department of Computer Science, University of Turin, Italy
• Jack G. Conrad, Lead Research Scientist, Center for AI and Cognitive Computing, Thomson Reuters Labs, USA

The program also included presentations of the papers accepted through the peer-reviewed track (see Section 3), and a panel discussion on emerging problems in legal data mining. We specifically attempted to ensure the presence of both academicians from the data mining/IR/ML communities and practitioners from the legal industry among our invited speakers and the members of our Program Committee (listed in Section 3). For further details, refer to the LeDAM 2018 website https://sites.google.com/site/legaldam2018/.

2 DETAILS OF INVITED TALKS

The LeDAM 2018 workshop included the following invited talks.

• Speaker: Giovanni Sartor, Professor of Legal Informatics and Legal Theory, European University Institute, Italy
  Title: Using Machine Learning to Support Law Enforcement to the Benefit of Consumers and Data Subject: CLAUDETTE Project
  Abstract: The project CLAUDETTE aims to support the detection of potentially unfair and unlawful clauses, both in consumer contracts and in privacy policies, through automated tools based on computational linguistics and artificial intelligence. The purpose is to enable consumer protection bodies and data protection authorities to engage more proactively and effectively in monitoring compliance and in enforcing the law. With regard to both contract terms and privacy policies, we have collected a corpus of contract terms, identified different kinds of unlawful and unfair terms through legal analysis, and annotated the documents accordingly. We have then applied and tested different computational approaches, including various machine learning algorithms, to detect such terms. The better-performing algorithms have been implemented in an application available to the public through the project's web site. The system is complemented by a crawler that detects changes in the contracts and policies already submitted to the system.

• Speaker: Luigi Di Caro, Assistant Professor, Department of Computer Science, University of Turin, Italy
  Title: Natural Language Processing and Ontology Learning in the Legal Domain
  Abstract: Legal ontologies aim to provide a structured representation of legal concepts and their interconnections. These ontologies are then exploited to support tasks such as information extraction and question answering in the legal domain. Given the increasing importance of the Web of Data in public administration and in companies, being able to provide machine-readable legal information is becoming a valuable and desired contribution. However, concepts and relations within existing ontologies usually represent limited, subjective, and application-oriented views of specific sub-domains of interest. The talk will discuss recent research on natural language technologies and text mining approaches towards the creation, the reuse, and the enrichment of legal ontologies.

• Speaker: Jack G. Conrad, Lead Research Scientist, Center for AI and Cognitive Computing, Thomson Reuters, USA
  Title: 30 Years of AI and Law: Legal Data Analytics in the Long View – Looking Back, Looking Forward
  Abstract: This talk will begin by examining the roots of Artificial Intelligence and Law (including applications involving NLP, data mining, machine learning, and, more broadly, data analytics), noting that it has been around for much longer than the recent buzz would suggest. We will explore the field of AI and Law in terms of its development and expansion starting in the 1980s and study how seminal research was conducted and reported on in conference proceedings such as ICAIL and publications such as the AI and Law journal. Having established the foundations of today's field of AI and Law, we will look to the future and sketch some of the practical application scenarios that the capabilities from the field promise to deliver. These include next-generation tools for legal professionals that can augment their skill sets by providing analytical abilities to help in the crafting of legal strategies. We will illustrate such instruments through the visualization of expected outcomes, while varying key parameters such as trial length, expected costs, and likely award or settlement figures. Lastly, we will investigate the prospective role that prediction tools can play in AI and Law application spaces, while looking still further into the future.

3 PEER-REVIEWED PAPER TRACK

Eight papers were submitted to the peer-review track, from diverse countries all over the world. Each submitted paper was reviewed by at least three members of the following Program Committee:

• Adam Wyner, Swansea University, Swansea, UK
• Charles K. Nicholas, University of Maryland Baltimore County, USA
• Dave Lewis, Brainspace - A Cyxtera Business, USA
• Girish Keshav Palshikar, Tata Consultancy Services, India
• Haozhen Zhao, Legal Technology Solution Practice, Navigant
• Jack G. Conrad, Thomson Reuters, USA
• Jeroen Keppens, King's College London, UK
• Karl Branting, MITRE Corporation, USA
• Katie Atkinson, University of Liverpool, UK
• Ken Satoh, National Institute of Informatics, Japan
• Kevin Ashley, University of Pittsburgh, USA
• Matthias Grabmair, Carnegie Mellon University, USA
• Maura Grossman, University of Waterloo, Canada
• Mi-Young Kim, University of Alberta, Canada
• Mossab Bagdouri, Walmart Labs, USA
• Paulo Quaresma, Universidade de Evora, Portugal
• Prasenjit Majumder, DAIICT, India
• William Webber, William Webber Consulting, Australia

Five papers were accepted through the peer-review process. The papers covered various topics, including contract renewals, concept hierarchy extraction, patent clustering, argumentation-driven information extraction, and deep ensemble learning. The list of papers accepted in LeDAM 2018 is as follows.

• Title: Structural Analysis of Contract Renewals
  Authors: Frieda Josi and Christian Wartena

• Title: Concept Hierarchy Extraction from Legal Literature
  Authors: Sabine Wehnert, David Broneske, Stefan Langer and Gunter Saake

• Title: Use of Pseudo Relevance Feedback for Patent Clustering with Fuzzy C-means
  Authors: Noushin Fadaei and Thomas Mandl

• Title: Argumentation-driven information extraction for online crime reports
  Authors: Marijn Schraagen, Bas Testerink, Daphne Odekerken and Floris Bex

• Title: Deep Ensemble Learning for Legal Query Understanding
  Authors: Arunprasath Shankar and Venkata Nagaraju Buddarapu

4 ACKNOWLEDGEMENTS

We are grateful to the CIKM 2018 workshop chairs Francesco Bonchi and Dimitris Gunopulos for their help and support. We are thankful to all the authors for submitting their papers to our workshop. We thank the PC members for carefully reviewing the papers. Last but not least, we are grateful to Paheli Bhattacharya for being the web chair (along with Kripabandhu Ghosh) and keeping the website running and up-to-date.

Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).