=Paper=
{{Paper
|id=Vol-3632/ISWC2023_paper_495
|storemode=property
|title=Real-time Collaboration in Linked Data Systems
|pdfUrl=https://ceur-ws.org/Vol-3632/ISWC2023_paper_495.pdf
|volume=Vol-3632
|authors=Jonathan Gruss,Andrei Ciortea,Guido Salvaneschi,Simon Mayer
|dblpUrl=https://dblp.org/rec/conf/semweb/GrussCSM23
}}
==Real-time Collaboration in Linked Data Systems==
Jonathan Gruss∗, Andrei Ciortea, Guido Salvaneschi and Simon Mayer

Institute of Computer Science, University of St.Gallen, St. Gallen, Switzerland

ISWC 2023 Posters and Demos: 22nd International Semantic Web Conference, November 6–10, 2023, Athens, Greece. CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

∗ Corresponding author. Email: jonathan.gruss@student.unisg.ch (J. Gruss); andrei.ciortea@unisg.ch (A. Ciortea); guido.salvaneschi@unisg.ch (G. Salvaneschi); simon.mayer@unisg.ch (S. Mayer). ORCID: 0009-0004-2927-0298 (J. Gruss); 0000-0003-0721-4135 (A. Ciortea); 0000-0002-9324-8894 (G. Salvaneschi); 0000-0001-6367-3454 (S. Mayer).

© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===Abstract===
Real-time collaboration has become commonplace in centralized Web applications, but decentralized Linked Data systems still lack readily accessible mechanisms. This demo paper proposes a novel approach that provides a viable solution to implement collaborative Linked Data in the Solid ecosystem using Conflict-free Replicated Data Types (CRDTs) and hypermedia-driven interaction. Specifically, we introduce a dedicated vocabulary for describing interactions with CRDT-based resources hosted in Solid Pods, empowering software clients to dynamically discover means for collaborative editing at run time. In contrast to current solutions for collaborative RDF, our approach works in combination with industry-standard CRDTs to offer a seamless co-editing experience in decentralized Linked Data systems. To demonstrate the practicality of our approach, we showcase a Solid-hosted website that utilizes the vocabulary to expose hypermedia controls and a browser extension that consumes these descriptions to enable real-time collaborative editing through CRDTs. By shifting intelligence to the client side, our approach significantly lowers the entry barrier for publishing real-time collaborative resources on the (Semantic) Web.

Keywords: Real-Time Collaboration, Linked Data, CRDT, Solid, Ontology, RDF

===1. Introduction===
Many online co-editors, such as Google Docs or Overleaf, enable collaboration on Web resources in real time through Operational Transformation (OT), a technique for transforming and applying concurrent operations on shared data without conflicts [1]. However, OT algorithms rely on centralized architectures, which introduce scalability issues, limit interoperability, and require end users to give up control over their data.

In decentralized Linked Data systems, such as those based on Solid [2], there are no straightforward solutions for real-time collaborative editing. Related work on real-time collaborative Linked Data uses Conflict-free Replicated Data Types (CRDTs) [3] to implement strong eventual consistency for RDF graphs [4, 5, 6]. While these approaches focus on the implementation of a shared RDF data type, they do not consider the discoverability of the collaboration interface, the integration of popular shared data formats, or the hosting and access control of the Linked Data. Additionally, current CRDT implementations for RDF use RDF triples as the smallest directly manageable piece of knowledge, while our approach allows for more fine-grained collaborative editing. Instead of providing a single shared data type for RDF, our approach allows for a choice of CRDT implementations such as Yjs [7] and Automerge [8] and a combination of arbitrary data types. This makes it possible to have more fine-grained collaborative editing of Linked Data, e.g., using text-based CRDTs with single-character updates for an RDF node, which creates a collaborative editing experience similar to that of popular co-editors.
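
The strong eventual consistency mentioned above can be illustrated with a deliberately small example. The following Python sketch is not taken from the paper's implementation and does not reflect how Yjs or Automerge work internally. It assumes a toy last-writer-wins map (the class LWWMap, its methods, and the sample property names are hypothetical) and shows that replicas which accept concurrent, uncoordinated updates reach the same state once they have exchanged their states.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# A (logical timestamp, replica id) pair orders concurrent writes deterministically.
Stamp = Tuple[int, str]


@dataclass
class LWWMap:
    """Toy state-based CRDT: a last-writer-wins map (illustrative assumption)."""

    replica_id: str
    clock: int = 0
    entries: Dict[str, Tuple[Stamp, str]] = field(default_factory=dict)

    def put(self, key: str, value: str) -> None:
        # Local update: no coordination with other replicas is needed.
        self.clock += 1
        self.entries[key] = ((self.clock, self.replica_id), value)

    def merge(self, other: "LWWMap") -> None:
        # Keep, per key, the entry with the larger stamp.
        for key, (stamp, value) in other.entries.items():
            if key not in self.entries or self.entries[key][0] < stamp:
                self.entries[key] = (stamp, value)
        self.clock = max(self.clock, other.clock)

    def value(self) -> Dict[str, str]:
        # Strip the stamps and return the user-visible state.
        return {key: value for key, (_, value) in self.entries.items()}


alice, bob = LWWMap("alice"), LWWMap("bob")
alice.put("schema:headline", "Hello Solid")
bob.put("schema:headline", "Hello CRDTs")  # concurrent write to the same key
bob.put("schema:author", "Bob")

# Exchange states in both directions; the order of merges does not matter.
alice.merge(bob)
bob.merge(alice)
assert alice.value() == bob.value()
```

Because the merge keeps the entry with the larger (timestamp, replica id) pair for each key, it is commutative, associative, and idempotent, so the order in which states are exchanged does not affect the result. A whole-value register like this one is coarser than the character-level text CRDTs discussed above; it only demonstrates the convergence property.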

In this paper, we propose a novel approach based on CRDTs for enabling collaborative Linked Data in the Solid ecosystem. We introduce a vocabulary for describing interactions with CRDT-based resources hosted in Solid Pods, which allows software clients to discover at run time the means required for collaboratively editing the resources. To demonstrate our approach, we showcase a Web blog stored in a Solid Pod that uses our vocabulary to expose hypermedia controls for its collaboration interface. A Chrome extension then consumes these descriptions and makes the Web blog's content and its Linked Data annotations collaboratively editable through CRDTs. In contrast to relying on OT servers, our approach shifts the intelligence to the client and lowers the entry barrier for making Web resources collaborative in real time. The full code and ontology are publicly available in a GitHub repository at https://tinyurl.com/55epp24d, and a demonstrator video is available online at https://clipchamp.com/watch/tZdpD5ONm6t.

===2. Building Real-time Collaborative Linked Data Systems===
Our approach to collaborative Linked Data is inspired by the principles of local-first software [9] to enable user-controlled and privacy-preserving collaboration using CRDTs and intelligent clients. We use the Solid protocol to provide decentralized and portable data hosting with access control, and introduce a vocabulary that complements the Hypermedia Controls Ontology (HCTL, https://www.w3.org/2019/wot/hypermedia#) to create hypermedia controls for real-time collaboration on CRDT-based resources.

====CRDT-based Representation of Linked Data Resources====
An alternative to the OT approach used by traditional collaborative editors is the use of Conflict-free Replicated Data Types (CRDTs), which allow replicas to be updated concurrently and without coordination: inconsistencies are automatically resolved and the system is guaranteed to eventually converge. CRDTs therefore make it possible to represent and manage collaborative resources without relying on a central coordinator, which makes them well-suited for decentralized Linked Data systems, since no expensive origin server is required to integrate the changes. While some efforts have been made to implement CRDTs for RDF, they do not offer the update granularity of popular text-based co-editors. To address this shortcoming, we propose a solution that can be used with various CRDT implementations and specifically test it on the widely used Yjs and Automerge libraries. Both of these libraries are based on documents containing a combination of various CRDTs such as maps, arrays, or text, and the entire document state can be represented in a JSON format with the specific CRDTs as properties. As a result, by adding specific keywords such as @context to the document, it becomes feasible to create CRDTs that can be represented in the JSON-LD format using popular CRDT libraries. This allows us to provide more targeted and fine-grained collaboration on RDF nodes by giving users more choice over the employed CRDT, e.g., using text-based CRDTs for nodes with longer text. This open approach also allows us to potentially support a large range of widely used CRDT frameworks.

Additionally, CRDT documents are often stored and shared in binary, implementation-specific formats for conciseness and performance reasons, so the specific CRDT framework is needed to interpret the data. To address this, we introduce a textual representation of the binary CRDT, allowing any Web client to access the most recent state of the CRDT; only clients actively collaborating require specialized software (e.g., a Web browser plugin) to resolve the CRDTs.
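
The idea of a JSON view that becomes JSON-LD once it carries an @context can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the document state is a hand-written dictionary standing in for the JSON export of a CRDT document, the pod URL and the schema.org terms are hypothetical sample values, and the function naive_statements is a simplified reading of flat contexts rather than a JSON-LD processor.

```python
import json
from typing import Any, Dict, List, Tuple


def textual_representation(doc_state: Dict[str, Any]) -> str:
    """Serialize the JSON view of a CRDT document as JSON-LD text."""
    if "@context" not in doc_state:
        # Without a context the export is plain JSON, not Linked Data.
        raise ValueError("document state lacks an @context entry")
    return json.dumps(doc_state, indent=2, sort_keys=True)


def naive_statements(doc_state: Dict[str, Any]) -> List[Tuple[str, str, Any]]:
    """Map top-level terms to IRIs via @context.

    Simplification: this only handles flat term-to-IRI contexts and is not
    a JSON-LD processor.
    """
    context = doc_state["@context"]
    subject = doc_state.get("@id", "_:document")
    return [
        (subject, context[term], value)
        for term, value in doc_state.items()
        if term in context
    ]


# Hypothetical JSON view of a CRDT document; in a real system each property
# would be backed by a CRDT (e.g., a text type for the article body).
doc_state = {
    "@context": {
        "headline": "https://schema.org/headline",
        "articleBody": "https://schema.org/articleBody",
    },
    "@id": "https://pod.example/blog/post-1",
    "headline": "Hello Solid",
    "articleBody": "Collaboratively edited text.",
}

text = textual_representation(doc_state)
statements = naive_statements(doc_state)
```

Any Web client can parse the resulting text as ordinary JSON-LD without knowing which CRDT library produced it, which matches the division of labour described above: only clients that take part in editing need the CRDT framework itself.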
The server stores both the binary CRDT document, including the complete change history, and a textual representation of the latest state. Updating the textual representation operates similarly to a Git commit-and-push, enabling users to contribute their local state.

====Collaborative Resource Description Ontology (CRDO)====
Traditional collaborative editors rely on fixed contracts imposed by static APIs, which must be hard-coded into clients and limit interoperability among different systems. In contrast, the Semantic Web already provides means to support more open interactions on the Web through hypermedia controls, using vocabularies such as Hydra and HCTL. We propose CRDO, a vocabulary that complements existing ones to create hypermedia controls for collaborative Web resources: it describes the collaborative document and advertises possible client actions for reading, manipulating, and synchronizing with peers.

The crdo:CollaborativeResourceDescription, which declares the entry point of a collaborative Web resource, is central to the CRDO vocabulary. Such descriptions may contain a crdo:DocumentDescription and a crdo:TextualRepresentaionDescription. The former lists possible actions for the CRDT document and metadata such as the utilized CRDT framework, which is relevant because mainstream CRDT frameworks use implementation-specific formats and synchronization mechanisms. The latter specifies possible actions for clients to interact with the current public version of the CRDT state in textual form.

We refer to the possible actions that are advertised to clients by both descriptions as CollaborativeOperations (CoOp); these are the key concept of our collaborative resource vocabulary. To foster reuse and interoperability, CoOp are based on the Form concept from HCTL. Our vocabulary additionally differentiates between Web CoOp and Framework CoOp. Web CoOp represent classic HTTP-based interactions and are aligned with HCTL.
Framework CoOp are used for interactions that are mainly handled or executed by the CRDT framework of the resource. Framework CoOp can be necessary, for example, to establish real-time synchronization with a server or peers using WebSocket or WebRTC.

Beyond facilitating collaborative resource descriptions and hypermedia controls, CRDO also introduces terms to describe textual CRDT representations in a Linked Data format. The CRDO vocabulary provides classes, properties, and data types to declare the Web resource as a textual representation, link to the collaborative resource description, and annotate resource values with the utilized CRDT data type.

====Solid Hosting and Access Control====
To support open collaboration on Linked Data, our approach uses Solid Pods to host CRDT documents, their textual representations, and their semantic descriptions. By hosting these resources on a Solid Pod, we can utilize its fine-grained, decentralized access control mechanism, which makes it possible to, e.g., create a publicly discoverable version of the resource representation in RDF while restricting access to the hypermedia controls and shared resources for collaboration. Another important aspect is that the control of data is given back to the end user, which is consistent with the use of CRDTs for collaborative editing, since CRDTs do not require a central coordinator. Thus, the user is in full control not only of the local CRDT document, but also of its online counterpart and RDF representation.

===3. Demonstration===
To showcase the practical implementation of our approach, we demonstrate real-time collaboration on a static website hosted in a Solid Pod, allowing collaborative editing of the website's content directly within a Web browser. In addition, we integrate a GUI-based Linked Data editor, enabling users to update the Linked Data context of the website's content. We accomplish this by providing a collaborative resource description of the content using our vocabulary.
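
How a client might use such a collaborative resource description can be sketched as follows. This is a hedged illustration, not the CRDO vocabulary itself: only the class names crdo:DocumentDescription and crdo:TextualRepresentaionDescription and the Web/Framework distinction come from the text above, while the Python classes, field names, operation labels, and URLs are hypothetical. The sketch shows a client selecting the operations it can execute: every client can follow Web CoOp, whereas Framework CoOp are only usable by clients that ship the CRDT framework named in the document description.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class CoOp:
    name: str      # hypothetical operation label
    kind: str      # "web" (plain HTTP) or "framework" (run by the CRDT library)
    target: str    # hypothetical target URL
    protocol: str = "http"


@dataclass(frozen=True)
class Description:
    rdf_type: str
    operations: List[CoOp]
    framework: Optional[str] = None  # only set on the document description


def usable_operations(
    descriptions: List[Description], supported_frameworks: Set[str]
) -> List[CoOp]:
    """Return the advertised operations that this client is able to execute."""
    usable = []
    for description in descriptions:
        for op in description.operations:
            # Web CoOp need only HTTP; Framework CoOp need the matching library.
            if op.kind == "web" or description.framework in supported_frameworks:
                usable.append(op)
    return usable


document = Description(
    rdf_type="crdo:DocumentDescription",
    framework="yjs",
    operations=[
        CoOp("fetch-document", "web", "https://pod.example/blog/post-1.bin"),
        CoOp("sync", "framework", "https://signaling.example/post-1", "webrtc"),
    ],
)
textual = Description(
    rdf_type="crdo:TextualRepresentaionDescription",
    operations=[
        CoOp("read-text", "web", "https://pod.example/blog/post-1"),
        CoOp("commit-text", "web", "https://pod.example/blog/post-1"),
    ],
)

generic_client = usable_operations([document, textual], set())
yjs_client = usable_operations([document, textual], {"yjs"})
```

In this sketch a generic client still sees the read and commit operations on the textual representation, while a client that bundles the named framework additionally sees the real-time synchronization operation. Because the operations are discovered from the description at run time, nothing about the endpoints has to be hard-coded into the client.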
The collaborative resource description can be consumed by a software client; in our case, we developed a Chrome browser extension for this purpose. The extension uses the discovered CRDT state or creates a new CRDT state of the content using the Yjs CRDT framework (https://github.com/yjs/yjs). Moreover, it leverages the CoOp specified in the description to facilitate user collaboration through three main functionalities: (i) presenting a pop-up with actions to initiate the collaboration, synchronize changes in real time with other peers using WebRTC, and commit the current state to the Solid Pod; (ii) converting the website elements that use the collaborative content into editable elements and synchronizing their content with the state of the CRDT document; and (iii) introducing a Linked Data editor panel offering utility functions to read and update the term definitions of the content's context.

===4. Conclusion===
Our approach simplifies the implementation of real-time collaboration in Linked Data systems by providing a vocabulary to describe the collaborative interface. Moreover, it ensures access control, interoperability, and discoverability of shared data types by leveraging the Solid protocol, hypermedia controls, and CRDTs with textual representations. Ongoing and future work includes testing and expanding our system's support for various CRDT implementation frameworks and formats, as well as validating the approach in other realistic use cases to assess its versatility and effectiveness. We furthermore plan to publish a specification and stable implementation for broader adoption and feedback, and to explore the potential extension of our approach beyond the Solid ecosystem as well as its application in existing Linked Data systems.

===References===
[1] C. Sun, C. Ellis, Operational transformation in real-time group editors: Issues, algorithms, and achievements, in: Proceedings of the 1998 ACM Conference on Computer Supported Cooperative Work, CSCW '98, Association for Computing Machinery, New York, NY, USA, 1998, pp. 59–68. URL: https://doi.org/10.1145/289444.289469.

[2] E. Mansour, A. V. Sambra, S. Hawke, M. Zereba, S. Capadisli, A. Ghanem, A. Aboulnaga, T. Berners-Lee, A demonstration of the Solid platform for social Web applications, in: Proc. of WWW '16 Companion, WWW Conferences Steering Committee, 2016, pp. 223–226. URL: https://doi.org/10.1145/2872518.2890529.

[3] M. Shapiro, N. Preguiça, C. Baquero, M. Zawirski, Conflict-free replicated data types, in: X. Défago, F. Petit, V. Villain (Eds.), Stabilization, Safety, and Security of Distributed Systems, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 386–400.

[4] L. D. Ibáñez, H. Skaf-Molli, P. Molli, O. Corby, Synchronizing semantic stores with commutative replicated data types, in: Proc. of WWW '12 Companion, ACM, 2012, pp. 1091–1096. URL: https://doi.org/10.1145/2187980.2188246.

[5] M. D. Mechaoui, N. Guetmi, A. Imine, Towards real-time co-authoring of linked-data on the web, in: Computer Science and Its Applications: 5th IFIP TC 5 International Conference, CIIA 2015, Saida, Algeria, May 20-21, 2015, Proceedings 5, Springer, 2015, pp. 538–548.

[6] H. Zarzour, M. Sellami, srCE: A Collaborative Editing of Scalable Semantic Stores on P2P Networks, Int. J. Comput. Appl. Technol. 48 (2013) 1–13. URL: https://doi.org/10.1504/IJCAT.2013.055562.

[7] P. Nicolaescu, K. Jahns, M. Derntl, R. Klamma, Yjs: A Framework for Near Real-Time P2P Shared Editing on Arbitrary Data Types, in: Proceedings of the 15th International Conference on Engineering the Web in the Big Data Era - Volume 9114, ICWE 2015, Springer-Verlag, Berlin, Heidelberg, 2015, pp. 675–678. URL: https://doi.org/10.1007/978-3-319-19890-3_55.

[8] M. Kleppmann, A. R. Beresford, Automerge: Real-time data sync between edge devices, in: 1st UK Mobile, Wearable and Ubiquitous Systems Research Symposium (MobiUK 2018), 2018, pp. 101–105.

[9] M. Kleppmann, A. Wiggins, P. van Hardenberg, M. McGranaghan, Local-First Software: You Own Your Data, in Spite of the Cloud, in: Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, Onward! 2019, ACM, 2019, pp. 154–178. URL: https://doi.org/10.1145/3359591.3359737.