=Paper= {{Paper |id=Vol-2844/ethics6 |storemode=property |title=From Legal Documents to Legal Document Management Systems; The Case of LegiCrowd (short paper) |pdfUrl=https://ceur-ws.org/Vol-2844/ethics6.pdf |volume=Vol-2844 |authors=Alexandros Nousias,Alain Couillault,Sofia Almpani,Theodoros Mitsikas,Petros Stefaneas |dblpUrl=https://dblp.org/rec/conf/setn/NousiasCAMS20 }} ==From Legal Documents to Legal Document Management Systems; The Case of LegiCrowd (short paper)== https://ceur-ws.org/Vol-2844/ethics6.pdf
From Legal Documents to Legal Document Management Systems; The Case of LegiCrowd

Alexandros Nousias
Future Now Business Consultants & Training / MyData Greece
Athens, Greece
alexandros.nousias@gmail.com

Alain Couillault
Association des Professionnels des Industries de la langue (APIL)
Montreuil, France
alain.couillault@apoliade.com

Sofia Almpani
National Technical University of Athens
Zografou, Greece
salmpani@mail.ntua.gr

Theodoros Mitsikas
National Technical University of Athens
Zografou, Greece
mitsikas@central.ntua.gr

Petros Stefaneas
National Technical University of Athens
Zografou, Greece
petros@math.ntua.gr

WAIEL2020, September 3, 2020, Athens, Greece
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).

ABSTRACT
In this position paper, we argue that users’ online consent to terms of service and privacy
notices is inherently impaired by the imbalance of power between online service providers and
their users. We argue that a full-fledged legal document management system relying on semantic
representation is key to resolving this conflict and facilitating transparency of Online Legal
Documents, and we give a quick overview of the LegiCrowd project, a crowdsourced approach to
legal document annotation, which paves the way towards such a solution.

1    INTRODUCTION
As AI technology and automation permeate society horizontally, the law and the subsequent
enforcement mechanism prove incapable of keeping pace. Concepts originating from the past, like
consent, tend to maintain their static properties in an increasingly complex and dynamic space,
thus resulting in a state of obsolescence. The law and its design and implementation properties
are in need of a radical update. The present paper argues that such an update requires a
transition from plain legal text to a full-fledged Legal Document Management System.

The World Wide Web today is the outcome of a three-stage evolution. Web 1.0 refers to the
so-called static Web of documents in a unidirectional broadcasting format. Web 2.0 introduced
the Web of people, by allowing the sharing of user-generated content and further social
networking. Web 3.0, or the Web of data, is currently evolving under the idea of defining and
linking structured data [1] in order to produce formal semantic representations, thus
introducing massive automation via algorithmically informed decisions. Web 3.0 comes, however,
with one major loophole: the lack of legal knowledge modelling and representation, which gives
rise to systemic inadequacies in the digital design, as the always hungry-for-data service
supply side conducts a “permissionless invasion” [8]. However, in a complex dynamic system like
the Web of data, algorithms require huge amounts of high-quality and relevant data. We start
from the basics, revisiting the concept, role, and specs of terms of service and privacy
policies as agents of information provision towards systemic, human-centric, and human-friendly
automation. Terms of service and privacy policies are deemed raw data for automated meaning
extraction via relevant information retrieval, question answering, dialogue systems, and other
Natural Language Processing applications.

The rest of the paper is organised as follows: In Section 2 we provide a brief description of
the information technology advancements to date and key characteristics thereof. Section 3
discusses inconsistencies and loopholes of the modern legal design. On that ground, this
Section also argues that legal representation and modelling could be the solution for a radical
update of the modern legal properties and enforcement mechanism, if put in the appropriate
ethical context. Section 4 introduces the LegiCrowd platform, a crowdsourced legal document
annotation system. Finally, Section 5 concludes the paper and provides some thoughts for future
work.

2    FROM LEGAL TEXT TO LEGAL INFORMATICS
Ubiquitous automation does not support the static format of online legal documents and the
linked consent models. Terms of service and privacy policies in their present form constitute
an iconic proof of the inadequacy of the digital design. Complicated legal and technical
documents that no one reads, no one understands, and no one cares about govern the emerging
data lifecycles for the benefit of data-driven business operations by extending their unhealthy
operational patterns. A piece of information of such magnitude turns into an irrelevant node in
the data value chain, hindering the unfolding and systemic assertion of the evolving
human-centric patterns. On top of that, modern businesses increasingly use consent as a de
facto standard for demonstrating privacy commitments and wider legal compliance, claiming
consent provisions as proxies of informed choice. This evolution has given rise to a situation
where many technology giants, on the pretext of providing improved services, have begun to
track every action of every user with little or no transparency [6]. The result has been that
clicking the ‘Agree’ button for consent was dubbed “the Biggest lie on the Internet” [7]
and incidents of data misuse such as unsolicited calls, spam and deliberate manipulation have
resulted in a massive trust deficit. And all of that has been formalised by the court’s
validation of the ‘I Agree’ button, maximising the power asymmetries and the trust deficit.

3    THE ETHICS OF LEGAL REPRESENTATION AND MODELLING
Such a formatting reality and the imposed data dispossession from the technology and digital
service supply side bring to the surface the need for dynamic, data-driven, and data-relevant
legal and ethical enforcement. In the environment of Web 3.0, such enforcement requires a
data-driven solution shaped with mathematical reasoning. It requires the transition to a
ubiquitous legal representation and modelling apparatus: an extended Legal Document Management
System comprising structured legal data, methods, and tools for sufficient syntactic and
semantic representation, capable of generating documented, machine-readable legal knowledge,
using very different logic, norms, and languages.

The ethical starting point lies in the axiom expressed by [2] that “The common misconception is
that language has to do with words and what they mean. It doesn’t. It has to do with people and
what they mean”. It is not about simple language data linking and annotation, but rather about
providing accurate meaning in the appropriate context. The aim is a virtuous cycle of legal
data structuring, modelling, representation and context in order to: (i) provide end users with
spot-on, clear and ascertained information on data processes and circulation; (ii) provide the
supply side with proof of concept for technical and legal compliance throughout the data
lifecycle, thus mitigating compliance inconsistencies and related risks; (iii) turn into a
standard design building block; (iv) enhance platform transparency and user confidence and
trust; (v) embed ethical requirements, like human agency and oversight, technical robustness
and safety, privacy and data governance, fairness, accountability, etc., into the increasing
B2B, B2C, C2C as well as Device to Device (D2D) data flows.

4    THE LEGICROWD APPROACH
The LegiCrowd project could be an answer to this need for transparency, as it aims at creating
a platform to render Online Legal Documents (OLDs), namely Privacy Notices and Terms of
Service, in a quick and easy-to-read format, such as icons, dataviz or simplified language,
through a crowdsourced approach. This first requires designing a semantically sound annotation
tag set, as an ontology of descriptors. This is the goal of the current LegiCrowd Onto project,
which relies on a number of competencies, particularly related to natural knowledge modelling,
law and corresponding visualisations thereof, gathered in an international consortium. Such a
platform aims at truly putting end users in the driver’s seat, as it a) provides an ethical
building block in the overall design, b) empowers end users to extract accurate legal
information in context and to assess the levels of legal compliance and the ethics standards in
place, and c) enables them to provide or reject consent on a truly informed basis.

5    CONCLUSION
No doubt, the practice and assertion of law in the Web 3.0 era is a combination of numerous
language data inputs and outputs from multiple workflows. In the said legal workflows, the
extraction, formulation, and exploitation of related metadata and provenance constitute a basic
processing component towards Machine Learning models or Natural Language Processing
applications capable of more efficient legal enforcement. With high awareness of its potential
societal impact, any decisions about legal data, methods, and tools tend to tie up with their
impact on people and society in a practical way, thus bringing ethics to the foreground of
automation.

ACKNOWLEDGMENTS
The LegiCrowd Onto consortium is led by the French non-profit organisation Association des
Professionnels des Industries de la Langue (APIL), and includes the National Technical
University of Athens (NTUA) and the Research, Consultant & Training firm ‘Future Now’, backed
by MyData Greece, the Greek node of MyData Global. It has received funding from the European
Union’s Horizon 2020 research and innovation programme under the NGI TRUST grant agreement no
825618. This project has been made possible thanks to Short Term Scientific Missions conducted
within the framework of the enetCollect COST Action ([3], [4], [5]).

REFERENCES
[1] Nupur Choudhury. 2014. World Wide Web and Its Journey from Web 1.0 to Web 4.0.
[2] Herbert H. Clark and Michael F. Schober. 1992. Questions about Questions: Inquiries into
    the Cognitive Bases of Surveys. Russell Sage Foundation, New York, NY, USA, Chapter Asking
    questions and influencing answers, 15–48.
[3] Alain Couillault. 18/5/2018. SHORT TERM SCIENTIFIC MISSION (STSM) SCIENTIFIC REPORT.
    Technical Report. Apoliade.
    http://www.enetcollect.net/ilias/goto.php?target=file_530_download
[4] Alain Couillault. 3/3/2019. SHORT TERM SCIENTIFIC MISSION (STSM) SCIENTIFIC REPORT.
    Technical Report. Apoliade.
    http://www.enetcollect.net/ilias/goto.php?target=file_908_download
[5] Alain Couillault. 8/3/2020. SHORT TERM SCIENTIFIC MISSION (STSM) SCIENTIFIC REPORT.
    Technical Report. Apoliade.
    http://www.enetcollect.net/ilias/goto.php?target=file_1053_download
[6] Joss Langford, Antti Jogi Poikola, Wil Janssen, Viivi Lähteenoja, and Marlies Rikken. 2019.
    Understanding MyData Operators. Technical Report. MyData.org.
[7] Jonathan A. Obar and Anne Oeldorf-Hirsch. 2020. The biggest lie on the Internet: ignoring
    the privacy policies and terms of service policies of social networking services.
    Information, Communication & Society 23, 1 (2020), 128–147.
[8] Tom Wheeler. 2018. Time to Fix It: Developing Rules for Internet Capitalism. Fellows
    Research Paper Series. Shorenstein Center on Media, Politics and Public Policy (2018).