=Paper= {{Paper |id=Vol-273/paper-17 |storemode=property |title=Collaboratively building structured knowledge with DBin: from del.icio.us tags to an “RDFS Folksonomy" |pdfUrl=https://ceur-ws.org/Vol-273/paper_99.pdf |volume=Vol-273 |dblpUrl=https://dblp.org/rec/conf/www/TummarelloM07 }} ==Collaboratively building structured knowledge with DBin: from del.icio.us tags to an “RDFS Folksonomy"== https://ceur-ws.org/Vol-273/paper_99.pdf
 Collaboratively building structured knowledge with DBin:
      from del.icio.us tags to an “RDFS Folksonomy"
                  Giovanni Tummarello                                                Christian Morbidoni
                        DERI Galway                                                       SEMEDIA
                 National University of Ireland                       Università Politecnica delle Marche, Ancona, Italy
                     +(353) 091 495285                                                +(39) 071 2204841
               g.tummarello@gmail.com                                          c.morbidoni@deit.univpm.it

ABSTRACT
DBin is a Semantic Web application that enables groups of users with a common interest to cooperatively create semantically structured knowledge bases. These user groups, which we call “Semantic Web Communities”, are made possible by creating customized user environments called “Brainlets”. Brainlets provide user interfaces and domain specific tools (e.g. querying, viewing and editing facilities) which enable community participants to interact with the data of interest. Brainlets are directly created by domain experts using an XML description language. DBin clients communicate and exchange annotations using a P2P infrastructure. Access control and digital signatures placed by DBin inside the authored RDF enable trust and information filtering. In this paper we show a specific use case where a “Semantic Web Community” is created to enable a group of users to share their del.icio.us tags and organize them into a cooperatively built RDFS ontology.

Keywords
Semantic Web, Tags, Ontology creation, DBin, peer-to-peer.

1. DBIN PLATFORM OVERVIEW
The DBin project is an integrated, end-user oriented Semantic Web platform. In more detail, it is a Semantic Personal Knowledge Manager (Semantic PKM) with the following main features:
• Based on the Semantic Web languages stack;
• Topic independent, yet customizable to be domain specific;
• Ontology based reasoning is used whenever possible for assisting the user (e.g. automatic rich user interface creation) in visualizing, editing and browsing data;
• Works as a personal information manager and is run in a local desktop environment;
• Using a P2P algorithm, it can synchronize aspects of the local knowledge with that of other online DBin users;
• It is not a programmer toolkit. Most customizations can be done using XML scripting languages and ontologies;
• Rich client, multiplatform software. Being based on the Eclipse RCP, it benefits from its plug-in system.

2. SEMANTIC WEB COMMUNITIES: THE USER EXPERIENCE
In this section we present the overall user interaction model as implemented by the DBin platform. Users might simply want to participate in Semantic Web communities (from here on referred to as “regular” users) or might want to start up and/or maintain them (power users). To participate means to be able to cooperatively build the community's shared semantic knowledge. The power user starts up a new community by first creating a customized user environment for the editing and exploitation of semantically structured annotations. These environments are called Brainlets.

2.1 Brainlets
Brainlets [1] are plug-ins in the DBin platform (therefore, technically, Eclipse plug-ins) and can be thought of as “configuration packages” preparing the client application to operate on a specific domain (e.g. wine lovers, Italian opera fans, etc.). From the user perspective, the relationship between Brainlets and the DBin platform is similar to that between HTML and a Web browser. Much like HTML web sites, Brainlets are created in XML and RDF and do not require any programming skills. They customize aspects such as:
• The ontologies to be used for supporting knowledge creation and presentation of data;
• GUI layout and coordination. Widgets are first “instantiated” from a rich set of predefined ones and then configured for the domain of interest, e.g., an ontology navigator might be configured to show certain classes or instances and to hide others. The components are then interlinked with each other; this means that chains of reactions to actions such as a focus change can be defined;
• Templates for domain specific annotations (e.g. a “Movie Brainlet” might have a “Review” template, with associated slots, that users can fill);
• Templates for readily available “pre-cooked” domain queries, which are structurally complex domain queries with only a few simple free parameters (e.g. “give me the name of the cinemas where a movie of genre X is being shown tonight”);
• A trust model and information filtering rules for the domain (e.g. public keys of well known “founding members” or authorities, preset “browsing levels”);
• Scripts for guiding the user in creating new URIs for domain resources (e.g. adding a new "paper" to the knowledge base);
• Scripts connected to Brainlet specific menus or buttons that implement domain specific functions;
• Support material, customized icons, help files etc.;
• Optionally, Brainlets might contain support for Java code and libraries for add-on capabilities beyond those provided by the standard Brainlet widgets;
• A basic RDF knowledge package.

Figure 2. A Brainlet as experienced by an end user. The Semantic aware widgets are positioned and made to interoperate by the Brainlet configuration.

To the end user, most of the above aspects are simply hidden behind the integrated Brainlet UI which presents itself, for example, as shown in Figure 1 (ESWC Budva Brainlet).

It is important to notice that the Brainlet UI is not simply a mash-up of visualizers. As the components are coordinated with each other, a Brainlet guides the user into a meaningful and domain specific workflow interaction with the structured data. At any time, the domain ontologies are used as much as possible for assisting users in editing and browsing knowledge, for example to suggest which kinds of annotations are possible for a given resource.

2.2 The overall scenario
Once Brainlets have been created by power users, they are installed by the regular users into their local DBin client. Brainlets are downloadable files and as such they can be made available at a Web site by their creator. DBin itself, however, provides a mechanism for discovering new Brainlets as the user is browsing the P2P channels; as a user joins a channel which was created for the users of a specific Brainlet, DBin will optionally guide the user to the Brainlet download and installation.

The overall scenario is depicted in Figure 3. On top of what has been illustrated in the previous section, Brainlets also have a role in how a user can connect to the others. In particular, a Brainlet contains pointers to P2P channels which are either known to contain information pertaining to the domain of interest or that the power user has previously created for this purpose. Creating a P2P channel for a specific topic is a simple operation that has to be performed on the configuration of an RDFGrowth server. RDFGrowth servers act as “meeting points” for the DBin clients but do not themselves carry metadata or binary attachments. Binary attachments are stored by DBin automatically in a web accessible space. This is done by DBin interfacing with a web publishing system, much like WebDAV (footnote 1), which we call “Data Publishing Service” (DPS). Unlike WebDAV, a DBin publishing service is a simple PHP script and, as such, it can be deployed with ease in most low cost commercial web hosting environments. For the end user's convenience, the DBin platform comes with a default DPS setting (footnote 2). The same Data Publishing mechanism provides the DBin users with the ability to create and publish RSS feeds and RDF dumps derived from the internal knowledge.

Figure 3. DBin and its relationship with different actors in the "Semantic Web Communities"

Footnote 1: http://www.webdav.org/
Footnote 2: Which uses our installation of the Data Publishing Service located at http://public.dbin.org

The Brainlet provides a domain specific user interface, as it instantiates and positions RDF aware widgets which are connected together to create an application workflow. It is important to notice, however, that Brainlets do not “take over” the individual installations; many Brainlets can coexist as needed.
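
To make the “pre-cooked” domain queries listed among the Brainlet aspects in Section 2.1 more concrete, the following is a minimal sketch in Python, not DBin's actual query mechanism (which this paper does not show). It assumes an in-memory set of (subject, predicate, object) triples; all `ex:` property names and the sample data are hypothetical. The point it illustrates is that the join structure of the query is fixed by the Brainlet author, while the user supplies only one free parameter (the genre X).

```python
# Hypothetical vocabulary, invented for this illustration only.
HAS_GENRE = "ex:hasGenre"
SHOWS_MOVIE = "ex:showsMovie"
WHEN = "ex:when"
AT_CINEMA = "ex:atCinema"
NAME = "ex:name"


def cinemas_showing_genre_tonight(triples, genre):
    """Pre-cooked query: the joins are fixed, only `genre` is a free parameter."""
    movies = {s for s, p, o in triples if p == HAS_GENRE and o == genre}
    screenings = {s for s, p, o in triples if p == SHOWS_MOVIE and o in movies}
    tonight = {s for s, p, o in triples
               if p == WHEN and o == "tonight" and s in screenings}
    cinemas = {o for s, p, o in triples if p == AT_CINEMA and s in tonight}
    return sorted(o for s, p, o in triples if p == NAME and s in cinemas)


# Small sample knowledge base (hypothetical data).
knowledge = {
    ("ex:movie1", HAS_GENRE, "comedy"),
    ("ex:movie2", HAS_GENRE, "horror"),
    ("ex:screening1", SHOWS_MOVIE, "ex:movie1"),
    ("ex:screening1", WHEN, "tonight"),
    ("ex:screening1", AT_CINEMA, "ex:cinemaA"),
    ("ex:screening2", SHOWS_MOVIE, "ex:movie2"),
    ("ex:screening2", WHEN, "tonight"),
    ("ex:screening2", AT_CINEMA, "ex:cinemaB"),
    ("ex:screening3", SHOWS_MOVIE, "ex:movie1"),
    ("ex:screening3", WHEN, "tomorrow"),
    ("ex:screening3", AT_CINEMA, "ex:cinemaB"),
    ("ex:cinemaA", NAME, "Odeon"),
    ("ex:cinemaB", NAME, "Rex"),
}

comedy_cinemas = cinemas_showing_genre_tonight(knowledge, "comedy")
horror_cinemas = cinemas_showing_genre_tonight(knowledge, "horror")
print(comedy_cinemas, horror_cinemas)
```

In a real Brainlet such a template would presumably be expressed declaratively rather than as Python code; the sketch only shows why a single parameter is enough for the end user.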
2.3 The RDFGrowth P2P algorithm
In this section we quickly overview the basic ideas and principles behind the RDFGrowth P2P metadata exchange algorithm; refer to [2] for a complete description of the algorithm.

Unlike previous approaches, which have explored P2P interactions among peers based on distributed queries, collecting and returning results, as in works like [3], [4], [5] and [6], RDFGrowth operates in a “greedy” and uncommitted scenario where cooperation between peers is minimal. It operates by direct queries that are in general of fixed computational cost. Without going into details, the algorithm provides synchronization of RDF knowledge among the users' DBin installations. Such synchronization is not performed in full, but along “aspects” of knowledge; it is restricted to those RDF triples which are very closely connected with a set of URIs defined as “interesting” by a community “banner”. The P2P community creator, usually the same person who created the Brainlet, defines a “URI interest banner”, which we call Group URIs Exposing Definition (GUED), usually as queries which return a list of URIs. An example of a GUED can be “select all resources of type Papers which have topic Semantic Web”. Upon joining a community, a peer runs such queries to select the local set of resources about which knowledge will be synchronized with that of the other participants.

At the user interaction level, DBin shows an interface that is somewhat similar to that of popular file-sharing software. A list of servers is presented and, upon selecting one, the list of semantic P2P channels is displayed for the user to join. Furthermore, an access control mechanism allows for restricted P2P groups.

3. INTERACTION AMONG COMMUNITIES
It is interesting to see how multiple Semantic Web Communities relate both to each other and to the individual user.

Figure 4 shows a possible use case where each user participates in one or more communities with different topics of interest.

Figure 4. An example of users participating in multiple communities

Users in groups such as that of Alice (marked A in the illustration) are Web developers. Within their community the resources of interest are, for example, available web technologies and tools (such as PHP, Ajax, JSP, etc.). Participants in the community annotate such resources, for example expressing opinions about the tools, or pointing at web tutorials or at web sites that use specific technologies. On top of pure metadata annotations, they can also point at rich media posted on the web (e.g. pictures, documents, long texts, etc.). Other users who receive such annotations in the group can then reply or further annotate each of these for their personal use or into public knowledge.

As mentioned earlier, the operator that selects which resources a client shares with the others is the GUED. A GUED for the Web development community might contain queries such as “all the resources of type WebTechnology”, with respect to a specific ontology, chosen or developed by the community’s creator, where the class WebTechnology is defined. Only the metadata involving resources that fit this definition of 'common interest' are made available by a peer to the others in the community. In this case such metadata would be, for example, statements like “Web site X uses web technology Y” or “Web page X deals with issues in using technology Y”.

Users like Bob (marked B) have interests which go beyond those of a single community. In this example Bob is interested in developing a collaborative tagging application, so he joins both the ‘Web development’ community and the ‘collaborative systems’ one, thus being able to import into his own DBin metadata coming from the two sources. At this point Bob is able to make joint queries across the two domains, e.g. “which are the technologies on which existing collaborative systems are based”. Finally, Carole (marked C) is a Semantic Web researcher, so she might decide to join all the communities, as they all contain information which might be useful for her research activity.

The interconnection between Semantic Web Communities can also be seen from a second, very novel point of view. If Communities share identifiers (e.g. their own URLs for available web applications, URLs of their specification for web technologies) then an annotation (e.g. web site X is based on technology Y) originally posted in one community is automatically cross-posted to the other community, since the URI is of interest to both (it belongs to the GUED of both groups). This aspect, in our opinion, represents a particularly novel feature of Semantic Web Communities as a means of communication. Information flows across group boundaries when it is in fact relevant to the users participating in the different communities. This is opposed to what happens with traditional means such as mailing lists, web forums or newsgroups where information, arguably, has to be manually cross-posted.

4. THE DEL.ICIO.US BRAINLET
The tagging paradigm is increasingly being adopted by people for organizing the web resources they visit. Systems like del.icio.us (footnote 3) allow users to associate simple keywords with web resources while navigating the Web. However, such applications only allow annotations to be a flat list of terms, while it would be useful to organize them in taxonomies or establish relations among them and possibly with existing ontologies. In this section we illustrate the del.icio.us Brainlet, which deals with this issue.

Footnote 3: http://del.icio.us

To think of a specific use case, let us consider a group of colleagues, each one using del.icio.us to tag web articles and resources of interest for their work. They also use a knowledge
management application (such as DBin) for cooperating and organizing internal documents. It is likely that a subset of the tags they created in their del.icio.us accounts will be conceptually related to or equivalent to some concepts present in the domain ontology. Using the DBin del.icio.us Brainlet it is possible to import lists of tags into the local RDF store, transform them into ontology classes and insert them in the class hierarchy.

The screenshots show this Brainlet in action. In Figure 5, the ontology view visualizes the taxonomy of the classes and provides functionalities to add new classes and subclasses as items of the tree, while the tags view shows the flat list of tags and gives the capability to update such a list from a del.icio.us account. Once a tag has been selected, Web pages which have been marked with that tag are listed in the related bookmarks view and their content can be displayed in the browser view.

Upon selecting a tag (e.g. J2EE), a “transform into a sub-class” action is available to state that a tag is a sub-concept of a class in the ontology (e.g. Technology). This results in a new class being added to the ontology. As shown in Figure 6, when the user selects the class Technology, the web pages tagged with ‘J2EE’ are displayed in the right view, as such a tag has been stated to be a specialization of the concept of technology.

The tags, as well as the pages and the other ontological terms, can then be annotated as any RDF resource in DBin. This enables annotations with comments, binary attachments, votes and any kind of structured annotation as defined by the ontologies.
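
The “transform into a sub-class” step described in this section can be sketched as follows. This is a minimal illustration in Python, not DBin code: it assumes the local RDF store can be pictured as a set of (subject, predicate, object) triples, and the `ex:taggedWith` property, the `ex:` names and the sample URLs are hypothetical. Only `rdf:type`, `rdfs:Class` and `rdfs:subClassOf` are standard RDFS terms. The sketch shows why selecting Technology lists the pages tagged with J2EE once the tag has been declared a subclass.

```python
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"
TAGGED_WITH = "ex:taggedWith"  # hypothetical property linking a page to a tag


def transform_into_subclass(triples, tag, parent_class):
    """Return a new triple set in which `tag` is an RDFS class under `parent_class`."""
    return triples | {
        (tag, RDF_TYPE, RDFS_CLASS),
        (tag, RDFS_SUBCLASS_OF, parent_class),
    }


def subclasses_of(triples, cls):
    """Return `cls` plus every class reachable from it via rdfs:subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == RDFS_SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def pages_for_class(triples, cls):
    """List pages tagged with `cls` or with any of its (transitive) subclasses."""
    classes = subclasses_of(triples, cls)
    return sorted(s for s, p, o in triples if p == TAGGED_WITH and o in classes)


# Tags as imported from a del.icio.us account (hypothetical sample data).
store = {
    ("http://example.org/j2ee-tutorial", TAGGED_WITH, "ex:J2EE"),
    ("http://example.org/servlet-guide", TAGGED_WITH, "ex:J2EE"),
    ("http://example.org/pasta-recipe", TAGGED_WITH, "ex:cooking"),
    ("ex:Technology", RDF_TYPE, RDFS_CLASS),
}

pages_before = pages_for_class(store, "ex:Technology")
store = transform_into_subclass(store, "ex:J2EE", "ex:Technology")
pages_after = pages_for_class(store, "ex:Technology")
print(pages_before, pages_after)
```

Before the transformation no page is associated with Technology; afterwards the two J2EE bookmarks appear under it, while the unrelated `ex:cooking` tag stays a flat tag.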

Figure 5. Upon selecting a tag, the related bookmarks are listed and each of them can be visualized in the embedded browser.

Figure 6. The J2EE tag has been identified as a sub-class of the Technology class, which automatically inherits the relation with the web resources tagged with J2EE.

By using the DBin P2P capabilities, such a process is cooperative across the team. If necessary, the DBin digital signature infrastructure would enable each team member to apply filters to see only contributions from certain members.

5. REFERENCES
[1] Tummarello, G., Morbidoni, C., Nucci, M. and Panzarino, O. Brainlets: "instant" Semantic Web applications. In Proceedings of the 2nd Workshop on Scripting for the Semantic Web at the European Semantic Web Conference (Budva, Montenegro, 2006).
[2] Tummarello, G., Morbidoni, C., Petersson, J., Puliti, P. and Piazza, F. RDFGrowth, a P2P annotation exchange algorithm for scalable Semantic Web applications. In 1st International Workshop on Peer-to-Peer and Knowledge Management (Boston, USA, 2004).
[3] Nejdl, W., Wolf, B., Qu, C., Decker, S., Sintek, M., Naeve, A., Nilsson, M., Palmer, M. and Risch, T. EDUTELLA: A P2P Networking Infrastructure Based on RDF. In Proceedings of the International World Wide Web Conference (Honolulu, Hawaii, 2002).
[4] Cai, M. and Frank, M. RDFPeers: A Scalable Distributed RDF Repository based on A Structured Peer-to-Peer Network. In Proceedings of the 13th International World Wide Web Conference (New York, USA, 2004).
[5] Nejdl, W., Siberski, W., Wolpers, M., Löser, A. and Brunkhorst, I. SuperPeer Based Routing and Clustering Strategies for RDF Based Peer-To-Peer Networks. In Proceedings of the 12th International World Wide Web Conference (Budapest, Hungary, 2003).
[6] Chirita, P. A., Idreos, S., Koubarakis, M. and Nejdl, W. Publish/Subscribe for RDF-based P2P Networks. In Proceedings of the 1st European Semantic Web Symposium (Heraklion, Greece, 2004).