Publishing Semantic Personal Notes as Linked Data

Laura Drăgan (laura.dragan@deri.org)
Alexandre Passant (alexandre.passant@deri.org)
Tudor Groza (tudor.groza@deri.org)
Siegfried Handschuh (siegfried.handschuh@deri.org)
DERI, National University of Ireland, Galway

ABSTRACT
There is an obvious shift of focus towards the Web, as people spend more of their time and store more of their data online. However, the desktop still handles a large amount of personal data. The Semantic Desktop brings semantics to the desktop, with better interlinking and organization of the data, thus allowing better management. However, in spite of common representation formats, personal and online data is still difficult to interlink, notably because of the different vocabularies used to describe it as well as the lack of common identifiers between desktop applications and Web-based services. We describe here a solution for easily publishing and sharing personal notes as Linked Data. We provide a two-phased publishing and interlinking process. It can be used to publish any kind of information from the desktop to the Web, enabling integration of small chunks of personal knowledge into the Web of Data and focusing on a user-driven approach to knowledge management.

1. INTRODUCTION
The Linked Data initiative and the Semantic Desktop are two relevant subdomains of the Semantic Web vision. The first focuses on globally interlinking resources on the Web, while the second enables information integration and interlinking on the desktop. Although the two share the representation models (RDF(S)/OWL), there is still a gap between data from the Web and the desktop, due to the different sets of vocabularies that are used and the generally distinct identifiers.

We describe here an approach for integrating data from these two environments, specifically for publishing personal notes from the desktop (using Semantic Desktop technologies) to the Web (using the Linked Data principles). The main requirement is to publish data online without losing the personal context established on the desktop. Our approach consists of two main steps: (i) preparing the desktop data for sharing, and (ii) publishing it online. In addition, it requires two prerequisite steps, provided by existing applications: (i) the note-taking process and annotation of the note (adding the context), and (ii) the identification of Web URIs which represent the same real-world thing as the desktop resources that belong to the context of a note.

Our contributions include: (i) mappings between the relatively small number of desktop vocabularies and the most popular Web vocabularies. The mappings are used in the transformation of the desktop data, represented with the desktop ontologies, to data represented with the Web vocabularies, ready to be published online; (ii) a process for the publication of desktop information on the Web using the Linked Data principles, while at the same time protecting sensitive private data from being shared unwillingly; (iii) a system implementation that allows sharing of semantic personal notes as semantic blog posts, interlinked with existing information within the Linked Data cloud.

2. USE CASE - SEMANTIC BLOGGING
Two important characteristics of blog posts are: (i) their topics are of interest to the author and thus are very likely to have references to things present on the desktop; (ii) they belong to a context consisting of the references made in their content. Writing a blog post in a desktop application can offer several benefits if the application is a semantic one, running on a Semantic Desktop where desktop resources are interconnected [1]. Semantic note-taking tools like SemNotes¹ automatically generate relations between the notes and the desktop things mentioned in their content. Such annotations give context to the note and should be preserved when the note is published as a blog post on the Web, since they enable serendipitous browsing and information discovery by providing relevant additional links to other entities.

Currently, personal notes, even the ones semantically enriched using Semantic Desktop applications, must be published as blog posts by being manually copied into a blogging tool. In this way, any additional semantic information available on the desktop is lost or, if copied, leads to broken references, as they point to local resources which are not accessible outside of the desktop. The note-taking to publishing process is sometimes shortcut by using the drafting functionality that some systems like WordPress or Blogger offer, so that users can directly take the notes in the blogging tool, usually online, thus replacing the desktop note-taking application. Using online tools deprives the user of having the personal context automatically added to the blog post, since desktop information cannot be easily integrated in Web-based interfaces.

3. SYSTEM OVERVIEW
We propose an approach that enables the publishing and sharing of personal notes by extending the functionality provided by SemNotes. The process has two steps: (i) transformation and (ii) publication. In the first step, the note is transformed locally for publication, and private local data is replaced with public server references. In the second step, the transformed note is published online on a dedicated "SemNotes server", where the resources referenced and the tags assigned are shared between the notes of all users. As we mentioned above, there are also two prerequisite steps: (i) the note-taking process and annotation of the note, which is the usual note-taking approach, and (ii) the identification of Web aliases for the desktop resources related to a note, where URIs are mined from the Web for locally defined resources, such as people, events or projects. The annotation is done semi-automatically and is an existing feature in SemNotes. The second prerequisite step consists of finding a Web resource for each desktop entity linked to the note that is about to be published. This step is executed by a desktop service that relies on the Semantic Web index Sindice to retrieve results.

The publishing system is divided between its local part and its remote part. The local part handles local private data, while the remote one handles online public data. The separation between them extends over three layers: ontology, data and application.

On the ontology level, the NEPOMUK desktop ontologies are used locally while popular Web vocabularies like SIOC are used on the server side. These ontologies are used to describe the data exchanged between the applications. On the data level, personal desktop data is stored in the local repository, which exists on any NEPOMUK Semantic Desktop, while Web data is distributed in the Linked Data cloud. Finally, on the application level, the local component is an extension to SemNotes that provides publishing functionality for notes, and the remote component is a server that hosts the notes and publishes them online.

The first step of the process is executed on the local side, by an extension of the SemNotes application. It consists of replacing the links to the local resources mentioned in the note with their Web aliases, and enriching the content of the note with RDFa. Then, the publication step is done by the server, which receives information from the desktop and publishes the note according to the Linked Data publishing principles. On the server, the notes, linked resources and tags have dereferenceable URIs. The resources and tags are shared among notes and users, thus providing object-centered sociality. The dataset also links to external resources that are found to be sameAs the local ones.

4. RELATED WORK
Semantic blogging was introduced by Cayzer and Shabajee in [3]. Karger and Quan describe semantic blogging in the context of the Semantic Web browser Haystack [4]. The existing systems for semantic blogging fall into two categories: (i) desktop applications that involve publishing the actual local resource information together with the blog post, or (ii) online applications that do not have access to desktop data relevant to the user. The first category, represented by tools like SemiBlog [5] or SemBlog [6], gives the user better access to the relevant data from the desktop. However, both tools require that the resources that contain private information are published together with the blog post, which might lead to privacy issues. They are used for the exchange of personal information in the blog posts, which differs from our approach of using already published Web data so as to protect the privacy of the personal information. The process described implies manually adding the metadata, while our approach relies on automatic export. Online services like BlogAccord [2] for music information or the Zemanta² blogging assistant belong to the second category. They enhance the blogging experience by providing access to various online resources to create the context of a blog post, but not to the personal context of the user.

5. CONCLUSION
We presented an approach for publishing personal notes as Linked Data on the Web. The aim of our work was to provide a way of publishing and sharing complete information by preserving the personal context of the notes without compromising privacy. Our solution makes a step towards bridging the gap between local semantic data and Linked Data. We defined a publishing process that comprises two steps: (i) preparation: the note is transformed into a SIOC-based Web representation; and (ii) publication / sharing: the note is published online following the Linked Data principles.

6. ACKNOWLEDGMENTS
The work was supported by the Líon-2 project funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380.

7. REFERENCES
[1] A. Bernardi, S. Decker, L. van Elst, G. Grimnes, T. Groza, S. H. M. Jazayeri, C. Mesnage, K. Moeller, G. Reif, and M. Sintek. The Social Semantic Desktop: A New Paradigm Towards Deploying the Semantic Web on the Desktop. 2008.
[2] S. Cayzer. What next for semantic blogging? In Proc. of the SEMANTICS 2006 conference, pages 71–81, Vienna, Austria, November 2006.
[3] S. Cayzer and P. Shabajee. Semantic blogging and bibliography management. In BlogTalk Proc., 2003.
[4] D. R. Karger and D. Quan. What would it mean to blog on the semantic web? In Proc. of the 3rd ISWC, Hiroshima, Japan, pages 214–228. Springer, 2004.
[5] K. Möller and S. Decker. Harvesting desktop data for semantic blogging. In Proc. of Semantic Desktop Workshop, ISWC, Galway, Ireland, 2005.
[6] H. Takeda and I. Ohmukai. Semblog project. In Activities on Semantic Web Technologies in Japan, A WWW2005 Workshop, 2005.

¹ http://smile.deri.ie/projects/semn
² http://www.zemanta.com
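
As an illustration of the transformation step described in the system overview (replacing the links to local resources with their Web aliases and enriching the note with RDFa), a minimal Python sketch might look as follows. It is not the SemNotes implementation. The `nepomuk:` URI scheme for local resources, the HTML shape of the note, the choice of `sioc:topic` as the RDFa property, and the alias table are all assumptions made for the example. The paper also does not say what happens to a resource for which no Web alias was found; the sketch assumes such a link is reduced to plain text so that no local URI is published.

```python
import re
from html import escape

# Hypothetical alias table: local desktop URI -> public Web URI.
# In the described system such aliases would come from the alias-finding
# prerequisite step; the entries here are invented example data.
WEB_ALIASES = {
    "nepomuk:/res/alice": "http://example.org/people/alice#me",
}

# Assumed note markup: plain anchors of the form <a href="URI">label</a>.
LINK = re.compile(r'<a href="(?P<uri>[^"]+)">(?P<label>[^<]*)</a>')


def transform_note(content: str, aliases: dict[str, str]) -> str:
    """Replace links to local resources with Web aliases, annotated in RDFa."""

    def rewrite(match: re.Match) -> str:
        uri, label = match.group("uri"), match.group("label")
        if not uri.startswith("nepomuk:"):
            # Links that are already public are left untouched.
            return match.group(0)
        alias = aliases.get(uri)
        if alias is None:
            # Assumption: without a public alias the link is dropped and only
            # its label is kept, so the local URI never leaves the desktop.
            return label
        # Assumption: sioc:topic relates the post to the resource it mentions.
        return f'<a rel="sioc:topic" href="{escape(alias, quote=True)}">{label}</a>'

    return LINK.sub(rewrite, content)


note = (
    'Met <a href="nepomuk:/res/alice">Alice</a> about '
    '<a href="nepomuk:/res/secret-project">Project X</a>.'
)
published = transform_note(note, WEB_ALIASES)
print(published)
```

In this sketch the privacy decision is made entirely on the local side: only URIs present in the alias table can appear in the transformed note, which mirrors the paper's split between a local part handling private data and a remote part handling public data.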