=Paper= {{Paper |id=Vol-175/paper-11 |storemode=property |title=Nabu - A Semantic Archive for XMPP Instant Messaging |pdfUrl=https://ceur-ws.org/Vol-175/40_kiesel_nabu_final.pdf |volume=Vol-175 |dblpUrl=https://dblp.org/rec/conf/semweb/OsterfeldKS05 }} ==Nabu - A Semantic Archive for XMPP Instant Messaging== https://ceur-ws.org/Vol-175/40_kiesel_nabu_final.pdf
                    Nabu – A Semantic Archive for XMPP Instant Messaging
                                   Frank Osterfeld, Malte Kiesel, Sven Schwarz
                                    DFKI GmbH - Knowledge Management Dept.
                                         Erwin-Schrödinger-Straße, Bldg. 57
                                          D-67663 Kaiserslautern, Germany
                                {frank.osterfeld, malte.kiesel, sven.schwarz}@dfki.de

Abstract

Instant messaging (IM) has become more and more common these days, and is complementing e-mail and other means of electronic communication. However, due to its heavily context-dependent nature, searching archives of instant messages using only full text search is a tedious task. Also, in contrast to e-mails, files, and other electronic media, instant messages typically do not feature a unique identifier or location, making it difficult to reference a particular instant messaging conversation. Nabu is a semantic archive for XMPP instant messaging designed to address these problems by implementing a semantic message store, using RDF(S) as its storage format. It is implemented as a server module and will log messages, manage access control to the archives on a per-user basis, and allow other components to observe and annotate messages.

1   Introduction

The importance of instant messaging (IM) for private and organizational communication has increased over the last years. IM, the instant sending and receiving of (mostly short) text messages between two or more users, complemented by a list of peer contacts along with their online status, has become one of the most used communication channels on the internet, and more and more valuable information is exchanged via instant messages, especially among colleagues at work.

Despite the increasing amount of information exchanged, IM client support for archiving and searching the messages exchanged is poor. This is understandable: on the one hand, most IM client applications are intended for private users for whom other features are more important. On the other hand, IM messages are typically very short and heavily tied to their particular context, which makes organizing the archive of exchanged messages a lot more difficult than is the case with other means of communication, such as e-mails, where the text is essentially self-contained. Moreover, e-mails come with a variety of additional information, such as a subject or thread references, which is usually missing in instant messages.

While e-mail can be archived long-term on the server side using the IMAP standard, there is no standard for archiving IM conversations. Chat logs are mostly stored locally on the client machine, using proprietary file formats. This has several disadvantages: storing the archive locally on the client computers is inconvenient when using more than one computer, archives are spread over different installations, and they quickly become out of sync. In addition, information gets lost easily. Using proprietary, client- and protocol-specific formats to store the information complicates managing and searching the stored information through interfaces other than the client UI.

In this paper we present Nabu¹, an open-source system providing server-side logging of instant messages. Nabu is implemented for the XML-based Jabber/XMPP protocol². Unlike the proprietary IM protocols of major providers such as Yahoo!, MSN or AOL, Jabber/XMPP is an open standard. Most server and client software is available under open source licenses, which makes it possible to add Nabu's features as a plugin for an existing server implementation. The Jive Messenger XMPP server³ was chosen due to its well-designed and well-documented code base and easy extensibility.

Nabu tries to integrate instant messaging into the efforts made in the Semantic Web [Berners-Lee et al., 2001] community to store and retrieve information in a unified way. It uses the Semantic Web standard RDF⁴ to describe the stored information on the server. For retrieving the stored information, it supports the SPARQL query language [Eric Prud'hommeaux, 2005], which is currently going through the standardization process at the W3C.

Using XMPP as transport protocol for SPARQL queries and commands has several benefits. For example, XMPP takes care of authentication and encryption; also, XMPP uses a persistent connection, delivering higher performance than protocols that use non-persistent connections such as HTTP, which is used as transport protocol by XML-RPC and SOAP.

¹ http://nabu.opendfki.de/
² http://www.jabber.org/
³ http://www.jivesoftware.org/messenger/
⁴ http://www.w3.org/RDF/

In addition to the logging of chat messages, further features of Nabu are:
    • Users can add further metadata to the logged messages by adding their own RDF statements. That way information can be categorized and structured, making retrieval of relevant information easier.
    • Nabu supports sharing of logged messages between users, e.g., making a conference log available to the other group members. Privacy is ensured by a strict privacy model, restricting access to explicitly authorized users.
    • Nabu integrates instant messaging into the context elicitation framework of EPOS [Schwarz, 2005], sending message notifications to the EPOS user observation (when enabled by the user). Other applications can also receive these events by registering with the Nabu component.

The rest of this paper discusses related work, Nabu's architecture, the RDF schemes used, RDF access control mechanisms, Nabu's observation feature, and possible applications, followed by conclusions.

Related Work

[Karneges and Paterson, 2004] proposed a storage format and protocol for server-side message archives as a Jabber protocol enhancement. The proposal suggests a simple protocol and storage format for message archiving. It defines its own format and does not use existing standards for message storage and retrieval apart from XML.

The Haystack project [Dennis Quan and Karger, 2003] builds a client for information management by integrating various information sources into one frontend, using an infrastructure based on RDF. A messaging model was developed [Quan et al., 2003] to represent conversations from various communication channels, such as e-mail, news groups and instant messaging, in a unified way. In contrast to Nabu, Haystack is client-based.

The BuddySpace research project [Eisenstadt and Dzbor, 2002] extends the presence concept in Jabber (simple offline/online/busy states), and adds information such as geographical location, current work focus etc. Furthermore, it investigates how such additional semantics can be used to facilitate collaboration over networks. The BuddySpace Jabber client⁵ demonstrates the concepts.

2   Architecture

Nabu is implemented as a plugin for the Jive Messenger XMPP server⁶. By implementing Nabu as a server-side component, every user of the Nabu-enabled server can use its services without requiring the installation of a client-side plugin. Since there are dozens of XMPP clients⁷, this was clearly the optimal solution. Also, sharing annotations would be much more difficult with a client-side implementation since clients cannot be expected to be online at all times. Finally, moving complexity to the server side nicely fits into the XMPP philosophy.

Figure 1: Nabu Components.

The graph shown in Figure 1 describes the top-level components of Nabu.

The central component of Nabu is the Archive. The Archive contains the RDF model, consisting of the public model that stores the conversation logs which can be accessed from outside, and an internal model, managing internal configuration data and privacy policies. The Nabu implementation uses Jena [Andy Seaborne et al., 2005] for RDF handling. Models are stored persistently in a database. The database backend is fully encapsulated by Jena.

The Archive is accessed in two ways:
    • Logging: the plugin intercepts XMPP messages, converts them to RDF and stores them in the Archive. This is done by the Logger component. The Logger component simply takes the message, checks whether logging is enabled and, if so, adds the RDF message to the RDF graph. Message URIs are created using the address of the XMPP server, ensuring uniqueness.
    • User requests: the RequestExecutor interface allows the user to, for example, search the RDF graph. It takes parsed requests in the form of request objects, executes them, and returns a response object.

The Nabu Bot component is the interface between the users and the plugin. It parses user requests, creates request objects and passes them to the RequestExecutor. It takes the returned responses, encodes them in a string and sends them back to the requestors.

The Nabu Bot uses the Nabu protocol for transferring user queries and answers. The commands of the Nabu protocol are encapsulated as the body of chat messages. Other bindings to the XMPP protocol are possible⁸, and implementing such a binding is quite straightforward.

In the following section, we take a look at the RDF schemas used by Nabu.

⁵ http://buddyspace.sourceforge.net/
⁶ http://www.jivesoftware.com/
⁷ In our department, at least four different clients are in use.
⁸ For example, Dan Brickley's foaftown proposes an XMPP binding for SPARQL.

3   The Nabu Ontology

This section explains the most important parts of the ontology, i.e., the RDF schema Nabu uses for logging. It covers the most important classes and properties for representing messages, accounts, and presence changes in RDF. Nabu handles two types of data:
    • The actual RDF data that can be queried externally, i.e. logged messages, presence changes and annotations.
    • Internal data, like account settings (e.g., logging enabled/disabled), and privacy policies. This data cannot be queried from the outside, and the users will never see the RDF representations. It is only indirectly accessible through the requests defined in the Nabu protocol.

In the following sections, the RDF schema of the data that can be queried externally will be discussed. The schemas of the data that cannot be directly queried will be presented in section 5.

3.1   Message

The most important class in the Nabu ontology is Message, shown in Figure 2.

Figure 2: RDF representing an Instant Message.

    • nabu:body is a literal containing the message text.
    • nabu:datetime is the time stamp added by Nabu when logging the message. The format is xsd:dateTime⁹. Note that if the sender and receiver are on different servers and each server has Nabu installed, the time stamps will differ, so identical messages cannot be matched using the time stamp.
    • nabu:sender: Links to the account that sent the message.
    • nabu:receivers: The accounts that received the message. In a one-to-one chat, this is a single account, the chat partner. In multi-user chat (MUC), these are all accounts that received the message, i.e. the accounts that were in the MUC room when the message was sent. Note that the temporary nick names users have in a MUC room are ignored; anonymous MUC rooms are not supported. Nick names are resolved to the corresponding accounts. Also note that the resource part of the participant's Jabber ID is omitted in both one-to-one and MUC logs.
    • nabu:inRoom: The room the message was sent in. For details on how rooms are defined, see below.
    • nabu:previousMessageInRoom: Links to the previous message in the room. This is useful for tracking conversations and for exploring logged conversations with a specific chat partner over time.

Unfortunately, in multi-user chat rooms it is difficult to track what message(s) a user is replying to. In practice, most users prefix their messages with a string denoting the receiver in chatrooms (e.g., "Frank: Please refrain from doing this."). However, Nabu does not address this issue, as other components can add annotations as needed using a heuristic.

File transfers and other activities such as video or voice chat are currently not supported. Extending Nabu to implement this functionality and subclassing the Message class accordingly should be trivial.

3.2   Accounts

The Account class represents a user account, as shown in Figure 3. Every user account is uniquely represented by a Jabber ID, like "alice@jabber.foo.org". For privacy reasons, Nabu does not store person-account associations. If such a mapping is required, one may store such information using FOAF¹⁰, which already includes a foaf:jabberID property that allows linking a foaf:Person to a Jabber ID.

Figure 3: RDF representing an Account.

Every account has a Jabber ID representing the account. However, not every Jabber ID represents an account (see MUC rooms), so Jabber IDs and accounts are not identical.

⁹ http://www.w3.org/TR/xmlschema-2/datatypes.html#dateTime
¹⁰ http://www.foaf-project.org/

3.3   Rooms

A room is a virtual place where two or more users meet and chat with each other. Every message is associated with one room, and messages in a room are linked to make conversation tracking easier. Two types of rooms exist, depending on the chat type:

In one-to-one chats, the room is defined by the two persons chatting: If Alice chats with Bob, all messages sent by Alice to Bob and vice versa are in the "Alice-Bob-Room". The nabu:previousMessageInRoom property links all messages sent between Alice and Bob, making it easy for Alice to navigate through all logged messages she sent to or received from Bob. In the Nabu ontology, this kind of room is called P2PRoom (Point-to-Point-Room). In RDF, the "Alice-Bob-Room" might look like this:

[RDF listing not preserved in this copy]

In multi-user chat, the semantics of a room are slightly different. While the P2PRoom is a Nabu concept and does not exist in Jabber, the MUC protocol as defined in JEP-0045 [Saint-Andre, 2002] introduces the concept of rooms. A room has its own Jabber ID, just like accounts, e.g., support@conf.foo.org, so unlike P2PRooms, these "MUC-Rooms" are not defined by their members, but by the room name and topic.
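The conventions described in sections 3.1 and 3.3 can be illustrated with a short Python sketch: resource parts of Jabber IDs are dropped, a one-to-one room is determined by the two accounts involved, and each message points to its predecessor in the same room. This is not Nabu's code. The names bare_jid, p2p_room_key and RoomLog, the room-key format, and the integer message ids are assumptions made for illustration; the paper does not specify how room identifiers are constructed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def bare_jid(jid: str) -> str:
    """Drop the resource part of a Jabber ID (everything after the first '/')."""
    return jid.split("/", 1)[0]


def p2p_room_key(jid_a: str, jid_b: str) -> str:
    """Key for a one-to-one room; identical for both directions of a chat.

    Hypothetical format: the two bare JIDs, sorted and joined with '|'.
    """
    return "|".join(sorted((bare_jid(jid_a), bare_jid(jid_b))))


@dataclass
class LoggedMessage:
    msg_id: int
    sender: str
    receiver: str
    body: str
    room: str
    # Plays the role of nabu:previousMessageInRoom (None for the first message).
    previous_in_room: Optional[int]


class RoomLog:
    """In-memory stand-in for the archive that remembers the last message per room."""

    def __init__(self) -> None:
        self.messages: List[LoggedMessage] = []
        self._last_in_room: Dict[str, int] = {}

    def log(self, sender: str, receiver: str, body: str) -> LoggedMessage:
        room = p2p_room_key(sender, receiver)
        msg = LoggedMessage(
            msg_id=len(self.messages) + 1,
            sender=bare_jid(sender),
            receiver=bare_jid(receiver),
            body=body,
            room=room,
            previous_in_room=self._last_in_room.get(room),
        )
        self.messages.append(msg)
        self._last_in_room[room] = msg.msg_id
        return msg
```

Sorting the two bare JIDs makes the key independent of who sent the message, which mirrors the idea that messages from Alice to Bob and from Bob to Alice belong to the same room.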
[RDF listing for the MUC room; markup not preserved in this copy. Surviving values: support, support@conf.foo.org, conf.foo.org]

3.4   Presence Change

If presence logging is enabled, every presence change, e.g., from Offline to Online or from Online to Away, is stored in a PresenceChange instance, as shown in Figure 4.

Figure 4: RDF representing Online Presence.

    • nabu:status: The new presence status.
    • nabu:statusMessage: The status message the user set.
    • nabu:account: The account that changed its presence status.
    • nabu:previousPresenceChange: The last logged presence change of the account nabu:account. All logged presences of a user are chronologically linked via the nabu:previousPresenceChange property.

4   Annotations

Nabu enables users to add their own statements to the RDF store. This makes it possible for users to add metadata to logged messages and share this metadata with their peers. For example, a user could set up a set of categories to file his conversations to facilitate later searching. He could do this manually or could use a text classifier and categorize automatically. Since instant messages are heavily dependent on their context (for example, imagine a user receiving an e-mail with a question and answering via instant messaging), one may also decide to use annotations to link the messages to their context. We will discuss this in sections 7 and 8.

The user can add any statement he likes (except for statements from the Nabu schema, see below), but it is good practice to reuse commonly used ontologies. Widespread vocabularies for metadata and categorization are Dublin Core¹¹ and SKOS¹². We have to stress that Nabu does not restrict the user to annotating messages with concepts; it is perfectly possible to link arbitrary RDF constructs to chat messages. This way Nabu is flexible enough to allow tagging messages with context information such as "The user was looking at the DFKI website when sending this message". Using Nabu's observation features (see section 7), software running on the user's machine can automatically annotate new messages when they arrive with information the annotation software has access to. Also, it is possible to create semantic links between messages: for example, it is possible to implement a more complex heuristic for determining the reply chain of a message (see section 3.1 for an explanation of why this is not trivial). Adding this information to messages does not require extending Nabu; one can also write a client-side component that uses message annotations for adding this information instead.

¹¹ http://dublincore.org/
¹² http://www.w3.org/2004/02/skos/

As an example of a simple manual annotation, let us classify a (fictional) message: "Hi Frank, I have yet another great feature for Nabu you could implement". To make searching easier, we want to specify the project the message is related to, in this case Nabu.

The CREATESTATEMENT request adds annotations to a message:

CREATESTATEMENT RESOURCE
http://&foo;Message-101
http://purl.org/dc/terms/subject
http://foo/Categories/Projects/Nabu

The first argument must be one of RESOURCE or LITERAL and indicates whether the object of the statement should be handled as resource URI or as literal string. The following tokens are the (subject, predicate, object) triple representing the statement shown in Figure 5.

Figure 5: RDF representing an Annotation.

The annotations are reified, which means that each statement itself becomes a resource that is linked to subject and object. That makes it possible to add properties to the statement. In Nabu, every user-added statement has a property statedBy, linking to the creator of the statement. In our example, this is frank@jabber.foo.org. The statement is "owned" by the linked account. Only this account can delete the statement. Also, users can read the statedBy property and decide whether they trust the statement or not. Alice might decide that annotations made by Charlie are useful and take them into consideration, but ignore Bob's statements.

Nearly every kind of RDF statement can be added. The only restriction is that properties from the Nabu ontology are
not allowed for user statements. E.g., dc:subject (dc = Dublin Core¹³) is valid, but nabu:isInRoom is not, because the predicate nabu:isInRoom is part of the Nabu ontology. This prevents users from corrupting (deliberately or not) the Nabu archive or compromising privacy settings. Properties from the Nabu ontology are managed by the server and can only be modified indirectly by commands of the Nabu protocol.

5   Log Sharing and Privacy Settings

To gain acceptance for Nabu and server-side logging in general, it is important to ensure the user's privacy. This means that Nabu must

    1. Leave the user in full control over what is logged.
    2. Allow users to delete sensitive information at any time.
    3. Allow access to the archive only through a clearly defined interface that handles authentication and respects the privacy settings.
    4. Implement conservative default settings (i.e., disable logging, use restrictive privacy settings)

On the other hand, one of Nabu's goals is to encourage sharing between peers to make valuable information available [...] needed that both ensures privacy and allows sharing of conversation logs.

In Nabu, every user is the owner of the messages he has sent, and he can control who can read his messages or delete them later if he wants. This means that a message is under control of the message sender only. If two users have a conversation, each user is responsible for his own messages and has no control over the messages he received from his dialog partner.

Access control is managed via privacy policies. A privacy policy contains a set of rules that control read permissions by allowing or denying access to certain accounts or groups of accounts. Every message logged has a link to a privacy policy that controls the access to the message, and every user has a list of policies he can assign to logged messages. There is always exactly one policy active at any one time. Whenever the user writes a message, Nabu logs the message and links it to the currently active policy. Policies are linked, not copied: For instance, if the user adds a new account to his policy "friendsOnly", the added account gains access to all archived messages that already use the "friendsOnly" policy.

What does it actually mean that a resource is not accessible? If a message (or any resource in general) is not accessible, this means the resource itself and its concise bounded description¹⁴ are completely hidden from the user: The resource itself and all links to or from the resource are hidden. When querying the model, the resource does not show up in the results. For messages this means that neither the message content nor any links to the message are visible. This includes annotations: If an annotation was added to link the message to a category, this statement is not visible. This is important, because we do not want other users to read the topics we were talking about, even if they cannot read the actual message content.

¹³ http://dublincore.org/documents/dces/
¹⁴ http://sw.nokia.com/uriqa/CBD.html

5.1   Privacy Policies in detail

Every privacy policy has
    • a name
    • an owner
    • a set of rules allowing or denying access to an account or a group of accounts

The name is an arbitrary string without spaces, e.g., default, friends, workGroup. In the requests for policy management the name is used to identify the policy. Thus the policy name must be unique for a user (but of course two users can use the same name without conflicts).

The owner is the account that owns the policy. The owner can edit the policy and add or remove rules. The policy always implicitly grants access to the owner, so the owner can access his own messages even if the rules would deny it. It is only possible for a user to change the policy for a message when he owns the currently linked policy.

The rules: A policy can contain any number of rules of the form "allowAccount", "denyAccount", "allowGroup" or "denyGroup", each followed by the account or group it applies to.

The rules are applied in (deny, allow) order. If access is not explicitly allowed, it is denied. That is, a policy without any rules denies all accesses (except to the policy owner).

If rules exist that both allow and deny access to an account, the deny rule takes precedence and the access is denied.

5.2   Groups

As mentioned before, access permissions can be set not only per user but also per group. A group is a plain set of accounts, set up by the user to make privacy management easier. For example, a user could set up a group "friends", and add the accounts of his friends to this group. Instead of allowing access per account, he can do a simple "allowGroup friends" and all accounts in the group gain access.

5.3   Examples

Here we present some examples demonstrating how privacy policies can be applied.

There are five accounts, Alice, Bob, Charlie, Daniel and Emily. Alice is the owner of the policies, and she created a group friends with Bob and Emily in it. Note that in the implementation, full account URIs are used, but we use Alice instead of http://foo/Accounts/jabber.foo.org/alice for clarity here.

The following policy allows access to Alice (as she is the owner), Daniel and Bob.

policyOwner Alice
allowAccount Daniel
allowAccount Bob

The following policy allows access to Alice as she is the owner, to the friends group, which is Bob and Emily, and to Charlie. So just poor Daniel may not read the resource (nor can the rest of the world).
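The evaluation order described in section 5.1 can be sketched in Python as follows. This is an illustrative reading of the rules, not Nabu's implementation: the Policy class, the can_read function, and the policy names example1 and example2 are hypothetical, and groups are modelled as a plain dictionary of account sets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# A rule is (directive, target), e.g. ("allowGroup", "friends").
Rule = Tuple[str, str]


@dataclass
class Policy:
    name: str
    owner: str
    rules: List[Rule] = field(default_factory=list)


def can_read(policy: Policy, account: str, groups: Dict[str, Set[str]]) -> bool:
    """Deny rules win over allow rules; without a matching allow rule, access is denied."""
    if account == policy.owner:
        return True  # the owner is always granted access implicitly

    allowed = False
    for directive, target in policy.rules:
        if directive in ("allowAccount", "denyAccount"):
            hit = account == target
        elif directive in ("allowGroup", "denyGroup"):
            hit = account in groups.get(target, set())
        else:
            raise ValueError(f"unknown directive: {directive}")
        if not hit:
            continue
        if directive.startswith("deny"):
            return False  # a matching deny rule takes precedence
        allowed = True
    return allowed


# Example data following section 5.3 (policy names example1/example2 are made up).
GROUPS = {"friends": {"Bob", "Emily"}}
EXAMPLE1 = Policy("example1", "Alice",
                  [("allowAccount", "Daniel"), ("allowAccount", "Bob")])
EXAMPLE2 = Policy("example2", "Alice",
                  [("allowGroup", "friends"), ("allowAccount", "Charlie")])
FRIENDS_WITHOUT_BOB = Policy("friendsWithoutBob", "Alice",
                             [("allowGroup", "friends"), ("denyAccount", "Bob")])
```

Because a matching deny rule returns immediately and a matching allow rule only sets a flag, the result does not depend on the order in which the rules were written, which corresponds to applying all deny rules before all allow rules.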
       and presence logging are enabled or not. Both default to false.
     • internal:policies links to the privacy policies owned by the account.
     • internal:activePolicy links to the currently active policy. This policy is attached when a presence change or message is logged.
     • internal:lastPresence links to the last logged presence change of the respective account. This makes it fast and easy for the logger to find the last presence and link new presences to it via the nabu:previousPresenceChange property.

Figure 6: A Privacy Policy.

Figure 7: RDF representing Account Settings.

6     Querying the Archive

Once Nabu logs a user's conversations, the user probably wants to search them at some point. For querying the archive, Nabu uses the SPARQL query language [Eric Prud'hommeaux, 2005]. SPARQL is a language for querying RDF stores, similar to SQL. Being powerful and versatile, it allows arbitrarily complex queries. Unfortunately it's also quite complex for everyday use, so a GUI for the most common queries would be desirable.
                                                                    Example: One wants to search for all messages containing
policyOwner Alice                                                ”Nabu“. The following command performs this search:
allowGroup friends                                               QUERY SPARQL
allowAccount Charlie                                             DESCRIBE ?msg
                                                                 WHERE { ?msg nabu:body ?body .
   In the following example, the first directive allows access     FILTER REGEX(?body, "Nabu", "i") }
to the friends group, i.e. Bob and Emily, but as the second
directive denies access to Bob explicitely, only Emily has ac-      The query returns all messages ?msg that have a body
cess (and Alice of course).                                      ?body matching the regular expression ”Nabu” (”i” makes the
                                                                 search case-insensitive). For simple string searches, there is
policyName friendsWithoutBob
policyOwner Alice
                                                                 also a shortcut available in Nabu in the form of the ”QUERY
allowGroup friends                                               SEARCHMSG“ command.
denyAccount Bob                                                     Nabu will return the messages matching the query as
                                                                 RDF/XML. For instance, Nabu might return one message,
   Internally, privacy policies are realized using the the RDF   containing ”Me thinks, Nabu rocks big time“:
shown in Figure 6. The parts of the RDF shown in black can
be queried using Nabu’s query features. The other parts can      210 
   This graph shows the last policy example. The policy             Me thinks, Nabu rocks big time!
friendsWithoutBob is owned by Alice. It allows access to her          
to a message someMessage which was sent by Alice. While               
                                                                       
the privacy policy itself and all properties like allowGroup,          chat
denyAccount and hasPolicy are stored in the internal model.            
                                                                       
                                                                       
nal model contains a corresponding AccountSettings instance            2005-07-14T...
saving settings related to this account.                           
   This example graph in Figure 7 shows the AccountSettings      
instance of alice@jabber.foo.org. It has the following proper-
                                                                    Note: Nabu returns only RDF data that has been declared
ties:
                                                                 as accessible. By default, this includes all messages that have
  • internal:messageLoggingEnabled     and      inter-           been sent or received by the user who issues the query. Nor-
    nal:presenceLoggingEnabled: Store whether message            mally he won’t see messages exchanged between other users.
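   As a rough illustration of the search example in this section, the following Python sketch assembles the text of such a QUERY SPARQL command for an arbitrary search term. The function name build_search_command and the set of escaped characters are assumptions made for this sketch and are not part of Nabu. The sketch only builds the command text; sending it to the server over XMPP is left out.

```python
# Hypothetical helper, not part of Nabu: builds the command text shown in
# the example above for a literal (non-regex) search term.

# Characters treated as regex metacharacters (assumed conservative set).
_REGEX_META = set("\\.^$*+?()[]{}|")


def build_search_command(term: str, case_insensitive: bool = True) -> str:
    """Return a 'QUERY SPARQL' command matching messages whose body contains term."""
    # Step 1: escape regex metacharacters so the term is matched literally.
    pattern = "".join("\\" + ch if ch in _REGEX_META else ch for ch in term)
    # Step 2: escape backslashes and double quotes for the SPARQL string literal.
    literal = pattern.replace("\\", "\\\\").replace('"', '\\"')
    # The "i" flag makes the regular expression case-insensitive.
    flags = ', "i"' if case_insensitive else ""
    return (
        "QUERY SPARQL\n"
        "DESCRIBE ?msg\n"
        "WHERE { ?msg nabu:body ?body .\n"
        f'  FILTER REGEX(?body, "{literal}"{flags}) }}'
    )


if __name__ == "__main__":
    print(build_search_command("Nabu"))
```

   Escaping happens in two layers because the term passes first through the SPARQL string literal and then through the regular expression engine. Without it, a search term containing a dot or a quotation mark would change the meaning of the query.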
   If a user decides to, he can grant others access to a conversation log (for example, co-workers might decide to share the log of an online meeting with other team members).

7      User Observation
One topic addressed in the research project EPOS [Dengel et al., 2002] is user observation: By observing the user’s actions, EPOS tries to identify the context of the desktop in order to support the user in his work [Schwarz, 2005]. Depending on the current context of a user, different contacts, files or other resources are relevant. For instance, the context information can be used to present currently relevant contacts from the address book to the user. EPOS implements this using an assistant bar, which is a desktop panel listing relevant contacts, resources, and projects.

   Observation data is collected by plugins for the user’s applications, e.g., word processors, WWW browsers or mail clients. Each plugin observes the user’s actions in the respective application and sends them to a central context elicitation component. Nabu offers this functionality for instant messaging, reporting messages sent or received by the observed user to EPOS.

   It is important to note that in Nabu, user observation is fully controlled by the observed user. It must be activated by the user and can be stopped at any time. Observing applications need the password of the observed account in order to register at the server.

   Usually observation will be integrated into the context framework by using the client API that comes with Nabu.

   The observation works as follows: To observe messages from and to Alice (alice@myserver.org), the observer program logs in at the server as alice@myserver.org, like the user does with her graphical client. The observer program must use its own resource, e.g., ’observation’. To start the observation, the program sends

OBSERVEMESSAGES on observation

to the server (to test observation, this can also be sent manually to the bot). This registers the resource ’observation’ as observer. From now on, all messages Alice sends or receives are reported to the ’observation’ resource. A notification message consists of a subject, containing the URI of the reported message, and the message body, containing the message CBD (footnote 15).

    15 http://sw.nokia.com/uriqa/CBD.html#definition

8      Applications
Numerous applications can be realized with the techniques presented. Let us enumerate some of these.

    • A message archive – This is Nabu’s most obvious application. Nabu is a server-side component, similar to IMAP, so users may access their messages in the central archive from anywhere. Also, because there is only one archive, no inconsistencies can occur.
    • A semantic store – As items stored in Nabu can be annotated, messages bear not only syntax but also semantics.
    • A powerful message search platform – Using SPARQL, the archive supports both full-text search and semantic search exploiting the relations specified in the message’s annotations.
    • An exchange platform for information – As a user’s Nabu repository features fine-grained access control, other users may be granted access to a user’s messages and message annotations, extending the other user’s knowledge repository.
    • A gateway for integrating instant messages into your personal information model – Nabu enables any RDF-capable software to access the user’s instant messages. This way, instant messages can be integrated into the user’s personal information model in frameworks such as Gnowsis [Sauermann and Schwarz, 2004].
    • A personal semantic knowledge base – Nabu is not only about instant messages. It can store anything that may be represented in RDF. Together with Nabu’s access control and the access mechanisms provided by the XMPP protocol, a simple but powerful personal shareable semantic knowledge base arises. As Nabu is intended to run on a server that is continuously available, this solves problems with intermittent availability of data in case the repository is implemented on the user’s machine. Also, most technical problems related to reachability due to firewalls or network address translation scenarios do not occur in this approach.

9      Conclusion and Further Work
The Nabu project is an attempt to bring the Semantic Web and instant messaging together, making the increasing amount of information exchanged via instant messaging accessible using Semantic Web technology.

   An ontology was developed to describe instant messaging conversations. Using the RDF standard to represent the data and the promising SPARQL query language for user queries, Nabu integrates well into existing Semantic Web infrastructures. To make better use of the stored information, users can attach metadata to their logs.

   A privacy model was developed to control the accessibility of RDF data, an area where no proven implementations or standards yet exist. The model works at the resource level, so accessibility can be controlled per resource. Although it has limitations when one needs more fine-grained control, like hiding only certain properties, it works well for Nabu.

   The concepts were implemented as an extension for the Jive XMPP server. This proof-of-concept implementation is available (footnote 16) and can be used by interested people to integrate instant messaging and the Semantic Web. In the DFKI KM working group, it is already used in the EPOS [Dengel et al., 2002] project. The user observation functionality has been successfully integrated into the context elicitation.

    16 http://nabu.opendfki.de/

   Nabu is still a prototype. To make it suitable for widespread use, more effort and feedback are needed. The main issues are:
  • Nabu’s user interface is currently text-based. This is flexible because it can be used on any platform and with any client, but is neither convenient nor user-friendly. Graphical frontends would be desirable, preferably integrated in client software (via plugins). Other options are a web frontend or integration in frameworks for desktop search.
  • Evaluation is needed to find out whether the chosen privacy model meets the user requirements. This needs experience from daily use by “real users”, as different usage patterns need different privacy models. At the moment, there is always one policy active at a time. The advantage is that it is easy to manage and clear which policy is used for the current chat. It would also be possible to specify a policy for specific chats, e.g., “everything I write in MUC room #workgroup should be readable by the whole workgroup”. While this is more powerful, it has the disadvantage that the user could forget about the channel-specific setting and share information with more people than intended. A third option would be to always use restrictive privacy settings when logging (i.e., only participants can read messages). Users would manually share the log afterwards by marking the conversation in their client plugin and assigning a less restrictive policy. This usage pattern is already supported: the user just has to leave the default policy active and assign other custom policies to logged messages.
  • Currently Nabu is only accessible via the XMPP protocol. In order to make the repository available to software without requiring an XMPP library, it should be made possible to query the archive using HTTP(S)/XML–RPC/SOAP protocols.

Acknowledgments
This work has been supported by a grant from The Federal Ministry of Education, Science, Research, and Technology (FKZ ITW–01 IW C01).

References
[Andy Seaborne et al., 2005] Andy Seaborne et al. Jena Semantic Web Framework, 2005.
[Berners-Lee et al., 2001] Tim Berners-Lee, James Hendler, and Ora Lassila. The Semantic Web. Scientific American, 284(5):34–43, 2001.
[Dengel et al., 2002] Andreas Dengel, Andreas Abecker, Jan-Thies Bähr, Ansgar Bernardi, Peter Dannenmann, Ludger van Elst, Stefan Klink, Heiko Maus, Sven Schwarz, and Michael Sintek. Evolving Personal to Organizational Knowledge Spaces. Project Proposal, DFKI GmbH Kaiserslautern, 2002.
[Dennis Quan and Karger, 2003] Dennis Quan, David Huynh, and David R. Karger. Haystack: A platform for authoring end user semantic web applications. In International Semantic Web Conference, pages 738–753, 2003.
[Eisenstadt and Dzbor, 2002] Marc Eisenstadt and Martin Dzbor. BuddySpace: Enhanced Presence Management for Collaborative Learning, Working, Gaming and Beyond. Submission to JabberConf Europe 2002, 2002.
[Eric Prud’hommeaux, 2005] Eric Prud’hommeaux and Andy Seaborne (eds.). SPARQL Query Language for RDF. W3C Working Draft, W3C, 2005.
[Karneges and Paterson, 2004] Justin Karneges and Ian Paterson. JEP-0136: Message Archiving. Jabber Enhancement Proposal, 2004. URL http://www.jabber.org/jeps/jep-0136.html.
[Quan et al., 2003] Dennis Quan, Karun Bakshi, and David R. Karger. A unified abstraction for messaging on the semantic web. In WWW (Posters), 2003.
[Saint-Andre, 2002] Peter Saint-Andre. JEP-0045: Multi-User Chat. Jabber Enhancement Proposal, 2002. URL http://www.jabber.org/jeps/jep-0045.html.
[Sauermann and Schwarz, 2004] Leo Sauermann and Sven Schwarz. Introducing the gnowsis semantic desktop. In Proceedings of the International Semantic Web Conference 2004, 2004.
[Schwarz, 2005] Sven Schwarz. A Context Model for Personal Knowledge Management. In Proceedings of the IJCAI’05 Workshop on Modeling and Retrieval of Context, Edinburgh, 2005.