<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Nabu - A Semantic Archive for XMPP Instant Messaging</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Frank Osterfeld</string-name>
          <email>frank.osterfeld@dfki.de</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Malte Kiesel</string-name>
          <email>malte.kiesel@dfki.de</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sven Schwarz</string-name>
          <email>sven.schwarz@dfki.de</email>
        </contrib>
        <aff>Kaiserslautern, Germany</aff>
      </contrib-group>
      <abstract>
        <p>Instant messaging (IM) has become more and more common these days, and is complementing email and other means of electronic communication. However, due to its heavily context-dependent nature, searching archives of instant messages using only full text search is a tedious task. Also, in contrast to mails, files, and other electronic media, instant messages typically do not feature a unique identifier or location, making it difficult to reference a particular instant messaging conversation. Nabu is a semantic archive for XMPP instant messaging designed to address these problems by implementing a semantic message store, using RDF(S) as its storage format. It is implemented as a server module and will log messages, manage access control to the archives on a per-user basis, and allow other components to observe and annotate messages.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The importance of instant messaging (IM) for private and
organizational communication has increased over the last years.
IM, the instant sending and receiving of (mostly short) text
messages between two or more users, complemented by a list
of peer contacts along with their online status, has become
one of the most used communication channels on the
internet, and more and more valuable information is exchanged
via instant messages, especially among colleagues at work.</p>
      <p>Despite the increasing amount of information
exchanged, IM client support for archiving and searching
messages is poor. This is understandable: on
the one hand, most IM client applications are intended for
private users for whom other features are more important.
On the other hand, IM messages are typically very short and
heavily tied to their particular context, which makes
organizing the archive of exchanged messages much more
difficult than is the case with other means of communication,
such as e-mail, where the text is essentially self-contained.
Moreover, e-mails carry additional
information, such as a subject line or thread references, that is usually
missing in instant messages.</p>
      <p>While e-mail can be archived long-term on
the server side using the IMAP standard, there is no standard for
archiving IM conversations. Chat logs are mostly stored
locally on the client machine, using proprietary file formats.
This has several disadvantages: a local archive
is inconvenient when using more than
one computer, because archives are spread over different installations
and quickly get out of sync. In addition,
information is easily lost. Using proprietary, client- and
protocol-specific formats to store the information complicates
managing and searching it through
interfaces other than the client UI.</p>
      <p>In this paper we present Nabu<sup>1</sup>, an open-source system
providing server-side logging of instant messages. Nabu is
implemented for the XML-based Jabber/XMPP protocol<sup>2</sup>.
Unlike the proprietary IM protocols of major providers
such as Yahoo!, MSN or AOL, Jabber/XMPP is an open
standard. Most server and client software is available under open
source licenses, which makes it possible to add Nabu’s
features as a plugin for an existing server implementation. The
Jive Messenger XMPP server<sup>3</sup> was chosen for its
well-designed and well-documented code base and easy
extensibility.</p>
      <p>Nabu tries to integrate instant messaging into the efforts
of the Semantic Web community [Berners-Lee et al., 2001]
to store and retrieve information in a unified way. It uses
the Semantic Web standard RDF<sup>4</sup> to describe the stored
information on the server. For retrieving the stored information, it
supports the SPARQL query language [Eric Prud’hommeaux,
2005], which is currently going through the standardization
process at the W3C.</p>
      <p>Using XMPP as the transport protocol for SPARQL queries
and commands has several benefits. For example, XMPP
takes care of authentication and encryption. XMPP also uses
a persistent connection, which avoids the per-request connection setup of
protocols that use non-persistent connections such as HTTP,
the transport protocol used by XML-RPC and SOAP.</p>
      <p>In addition to the logging of chat messages, further features
of Nabu are:
• Users can add further metadata to the logged messages
by adding their own RDF statements. That way,
information can be categorized and structured, making retrieval
of relevant information easier.
• Nabu supports sharing of logged messages between
users, e.g., making a conference log available to the
other group members. Privacy is ensured by a strict
privacy model that restricts access to explicitly authorized
users.
• Nabu integrates instant messaging into the context
elicitation framework of EPOS [Schwarz, 2005] by
sending message notifications to the EPOS user observation
(when enabled by the user). Other applications can also
receive these events by registering with the Nabu
component.
<fn id="fn1"><label>1</label><p>http://nabu.opendfki.de/</p></fn>
<fn id="fn2"><label>2</label><p>http://www.jabber.org/</p></fn>
<fn id="fn3"><label>3</label><p>http://www.jivesoftware.org/messenger/</p></fn>
<fn id="fn4"><label>4</label><p>http://www.w3.org/RDF/</p></fn></p>
      <p>The rest of this paper discusses related work, Nabu’s
architecture, the RDF schemas used, RDF access control
mechanisms, Nabu’s observation feature, and possible applications,
followed by conclusions.</p>
      <sec id="sec-1-1">
        <title>Related Work</title>
        <p>[Karneges and Paterson, 2004] proposed a simple storage format and
protocol for server-side message archives as a Jabber
protocol enhancement. The proposal defines its own
format and, apart from XML, does not use existing standards for message
storage and retrieval.</p>
        <p>The Haystack project [Dennis Quan and Karger, 2003]
builds a client for information management by integrating
various information sources into one frontend, using an
infrastructure based on RDF. A messaging model was
developed [Quan et al., 2003] to represent conversations from
various communication channels, such as e-mail, newsgroups
and instant messaging, in a unified way. In contrast to Nabu,
Haystack is client-based.</p>
        <p>The BuddySpace research project [Eisenstadt and Dzbor,
2002] extends the presence concept in Jabber (simple
offline/online/busy states) and adds information such as
geographical location, current work focus, etc. Furthermore, it
investigates how such additional semantics can be used to
facilitate collaboration over networks. The BuddySpace Jabber
client<sup>5</sup> demonstrates the concepts.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2 Architecture</title>
      <p>Nabu is implemented as a plugin for the Jive Messenger
XMPP server<sup>6</sup>. Because Nabu is a server-side
component, every user of a Nabu-enabled server can use its
services without installing a client-side plugin.
Since there are dozens of XMPP clients<sup>7</sup>, a server-side
plugin was the most practical solution. Also, sharing annotations would be much
more difficult with a client-side implementation since clients
cannot be expected to be online at all times. Finally,
moving complexity to the server side fits the XMPP
philosophy.
<fn id="fn5"><label>5</label><p>http://buddyspace.sourceforge.net/</p></fn>
<fn id="fn6"><label>6</label><p>http://www.jivesoftware.com/</p></fn>
<fn id="fn7"><label>7</label><p>In our department, at least four different clients are in use.</p></fn></p>
      <sec id="sec-2-1">
        <p>The graph shown in Figure 1 describes the top-level
components of Nabu.</p>
        <p>The central component of Nabu is the Archive. The Archive
contains the RDF model, consisting of the public model that
stores the conversation logs which can be accessed from
outside, and an internal model, managing internal configuration
data and privacy policies. The Nabu implementation uses
Jena [Andy Seaborne et al., 2005] for RDF handling. Models
are stored persistently in a database. The database backend is
fully encapsulated by Jena.</p>
        <p>The Archive is accessed in two ways:
• Logging: the plugin intercepts XMPP messages,
converts them to RDF and stores them in the Archive. This
is done by the Logger component, which
takes the message, checks whether logging
is enabled and, if so, adds the RDF message to the RDF
graph. Message URIs are created using the address of
the XMPP server, ensuring uniqueness.
• User requests: the RequestExecutor interface allows the
user to, for example, search the RDF graph. It takes
parsed requests in the form of request objects, executes
them, and returns a response object.</p>
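        <p>The logging path described above can be sketched as follows. This is a minimal illustration, not Nabu’s actual code: the Archive is reduced to an in-memory set of triples, the per-account switch stands in for the logging setting, and the URI pattern http://SERVER/Message-N is an assumption made for this example.</p>
        <preformat><![CDATA[
```python
from datetime import datetime, timezone
from itertools import count

NABU = "nabu:"  # placeholder prefix for the Nabu ontology namespace


class Archive:
    """Toy stand-in for the Archive: a set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))


class Logger:
    """Hypothetical logger: checks the logging switch, then stores the message."""

    def __init__(self, archive, server_address):
        self.archive = archive
        self.server_address = server_address
        self.enabled_accounts = set()  # logging stays off until switched on
        self._ids = count(1)

    def log(self, sender, receiver, body):
        """Return the URI of the stored message, or None if logging is disabled."""
        if sender not in self.enabled_accounts:
            return None
        # The server address is part of the URI, so two servers cannot
        # produce the same message URI.
        uri = f"http://{self.server_address}/Message-{next(self._ids)}"
        self.archive.add(uri, NABU + "sender", sender)
        self.archive.add(uri, NABU + "receivers", receiver)
        self.archive.add(uri, NABU + "body", body)
        self.archive.add(uri, NABU + "datetime",
                         datetime.now(timezone.utc).isoformat())
        return uri


archive = Archive()
logger = Logger(archive, "jabber.foo.org")
skipped = logger.log("alice@jabber.foo.org", "bob@jabber.foo.org", "not logged")
logger.enabled_accounts.add("alice@jabber.foo.org")
uri = logger.log("alice@jabber.foo.org", "bob@jabber.foo.org", "Hi Bob")
```
]]></preformat>
        <p>In this sketch the first call stores nothing because logging is still disabled for the sender; the second call stores four triples under a URI that contains the server address.</p>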
        <p>The Nabu Bot component is the interface between the users
and the plugin. It parses user requests, creates request objects
and passes them to the RequestExecutor. It takes the returned
responses, encodes them in a string and sends them back to
the requestors.</p>
        <p>The Nabu Bot uses the Nabu protocol for transferring user
queries and answers. The commands of the Nabu protocol are
encapsulated as the body of chat messages. Other bindings to
the XMPP protocol are possible<sup>8</sup>, and implementing
such a binding is straightforward.</p>
        <p>In the following section, we take a look at the RDF
schemas used by Nabu.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 The Nabu Ontology</title>
      <p>This section explains the most important parts of the
ontology (i.e., the RDF schema) that Nabu uses for logging. It
covers the most important classes and properties for representing
messages, accounts, and presence changes in RDF. Nabu
handles two types of data:
• The actual RDF data that can be queried externally, i.e.,
logged messages, presence changes and annotations.
• Internal data, like account settings (e.g., logging
enabled/disabled) and privacy policies. This data cannot
be queried from the outside, and users never see
the RDF representations. It is only indirectly accessible
through the requests defined in the Nabu protocol.
<fn id="fn8"><label>8</label><p>For example, Dan Brickley’s foaftown proposes an XMPP binding for
SPARQL.</p></fn></p>
      <p>In the following sections, the RDF schema of the data that
can be queried externally is discussed. The schemas of
the data that cannot be queried directly are presented in
section 5.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Message</title>
      <p>The most important class in the Nabu ontology is Message,
shown in Figure 2.</p>
      <p>• nabu:body is a literal containing the message text.
• nabu:datetime is the time stamp added by Nabu when
logging the message. The format is xsd:dateTime<sup>9</sup>. Note
that if the sender and receiver are on different servers
and each server has Nabu installed, each server adds its own time stamp,
so the two copies of a message cannot be matched
by time stamp.
• nabu:sender: Links to the account that sent the message.
• nabu:receivers: The accounts that received the message.
In a one-to-one chat, this is a single account, the chat
partner. In multi-user chat (MUC), these are all accounts
that received the message, i.e., the accounts that were in
the MUC room when the message was sent. Note that
the temporary nicknames users have in a MUC room
are not stored; they are resolved to the corresponding accounts,
and anonymous MUC rooms are not supported.
Also note that the resource part of the participant’s
Jabber ID is omitted in both one-to-one and MUC logs.
• nabu:inRoom: The room the message was sent in. For
details on how rooms are defined, see below.
• nabu:previousMessageInRoom: Links to the previous
message in the room. This is useful for tracking
conversations and for exploring logged conversations with
a specific chat partner over time.</p>
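      <p>As an illustration of how nabu:previousMessageInRoom supports conversation tracking, the following sketch walks the chain backwards from the newest message of a room. The dictionary layout and the message identifiers are hypothetical; they only stand in for data that a client would retrieve from the archive.</p>
      <preformat><![CDATA[
```python
def conversation_history(messages, last_uri):
    """Return message bodies, oldest first, by following previousMessageInRoom."""
    bodies = []
    seen = set()
    uri = last_uri
    while uri is not None and uri not in seen:
        seen.add(uri)  # guard against accidental cycles in the data
        message = messages[uri]
        bodies.append(message["body"])
        uri = message.get("previousMessageInRoom")
    bodies.reverse()  # the chain was walked from newest to oldest
    return bodies


# Hypothetical archive content: message URI -> properties
messages = {
    "msg-1": {"body": "Hi Bob", "sender": "alice"},
    "msg-2": {"body": "Hi Alice", "sender": "bob",
              "previousMessageInRoom": "msg-1"},
    "msg-3": {"body": "Lunch?", "sender": "alice",
              "previousMessageInRoom": "msg-2"},
}
history = conversation_history(messages, "msg-3")
```
]]></preformat>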
      <p>Unfortunately, in multi-user chat rooms it is difficult to
track which message(s) a user is replying to. In practice, most
users in chat rooms prefix their messages with a string denoting the receiver
(e.g., “Frank: Please refrain from doing this.”).
Nabu itself does not address this issue, as other
components can add annotations as needed using a heuristic.</p>
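      <p>One possible heuristic of this kind could look as follows. The sketch assumes that an addressee prefix has the form “Name:” or “Name,” and that the nicknames present in the room are known; the function name and the matching rule are illustrative and not part of Nabu.</p>
      <preformat><![CDATA[
```python
import re

# A leading "Name:" or "Name," is taken as the addressee of the message.
ADDRESSEE = re.compile(r"^\s*(\w+)\s*[:,]\s+")


def guess_addressee(body, nicknames):
    """Return the nickname a message seems to be addressed to, or None."""
    match = ADDRESSEE.match(body)
    if not match:
        return None
    name = match.group(1).lower()
    # Accept the prefix only if it names somebody who is in the room.
    for nick in nicknames:
        if nick.lower() == name:
            return nick
    return None


room = ["Frank", "Malte", "Sven"]
to_frank = guess_addressee("Frank: Please refrain from doing this.", room)
nobody = guess_addressee("Note: the server restarts at noon.", room)
```
]]></preformat>
      <p>A component using such a function could store its guess as an annotation on the message, leaving the logged message itself unchanged.</p>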
      <p>File transfers and other activities such as video or voice
chat are currently not supported. Extending Nabu to
implement this functionality and subclassing the Message class
accordingly should be straightforward.</p>
      <p>The Account class represents a user account, as shown in
Figure 3. Every user account is uniquely represented by a
Jabber ID, like “alice@jabber.foo.org”. For privacy reasons,
Nabu does not store person-account associations. If such
a mapping is required, it can be stored
using FOAF<sup>10</sup>, which already includes a foaf:jabberID property
for linking a foaf:Person to a Jabber ID.</p>
      <p>Every account has a Jabber ID representing the account.
However, not every Jabber ID represents an account (see
MUC rooms), so Jabber IDs and accounts are not identical.</p>
      <p>A room is a virtual place where two or more users meet and
chat with each other. Every message is associated with one room,
and the messages in a room are linked to make conversation
tracking easier. Two types of rooms exist, depending on the
chat type.</p>
      <p>In one-to-one chats, the room is defined by the two
persons chatting: If Alice chats with Bob, all messages sent by
Alice to Bob and vice versa are in the “Alice-Bob-Room”.
The nabu:previousMessageInRoom property links all
messages sent between Alice and Bob, making it easy for Alice to
navigate through all logged messages she sent to or received
from Bob. In the Nabu ontology, this kind of room is called
P2PRoom (Point-to-Point-Room). In RDF, the
“Alice-Bob-Room” might look like this:</p>
      <preformat>&lt;P2PRoom rdf:about="&amp;foo;P2PRoom-alice-bob"&gt;
  &lt;members rdf:resource="&amp;foo;Account-alice"/&gt;
  &lt;members rdf:resource="&amp;foo;Account-bob"/&gt;
&lt;/P2PRoom&gt;</preformat>
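      <p>Since a P2PRoom is defined by its two members, both directions of a conversation have to map to the same room resource. One way to achieve this, assuming that room URIs are derived from the account names as in the example above, is to sort the names before building the URI. The helper below and its base URI are hypothetical; how Nabu actually derives room URIs is not specified here.</p>
      <preformat><![CDATA[
```python
def p2p_room_uri(base, account_a, account_b):
    """Build one room URI per pair of accounts, regardless of who is the sender."""
    first, second = sorted([account_a, account_b])
    return f"{base}P2PRoom-{first}-{second}"


BASE = "http://foo/"  # stands in for the &foo; entity of the RDF example
room_ab = p2p_room_uri(BASE, "alice", "bob")
room_ba = p2p_room_uri(BASE, "bob", "alice")
```
]]></preformat>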
      <p>In multi-user chat, the semantics of a room are slightly
different. While the P2PRoom is a Nabu concept and does
not exist in Jabber, the MUC protocol as defined in
JEP-0045 [Saint-Andre, 2002] introduces the concept of rooms.
A room has its own Jabber ID, just like an account, e.g.,
support@conf.foo.org. Unlike P2PRooms, these
“MUCRooms” are therefore not defined by their members, but by the room
name and topic.
<fn id="fn9"><label>9</label><p>http://www.w3.org/TR/xmlschema-2/datatypes.html#dateTime</p></fn>
<fn id="fn10"><label>10</label><p>http://www.foaf-project.org/</p></fn></p>
      <p>A logged presence change has the following properties:
• nabu:status: The new presence status.
• nabu:statusMessage: The status message the user set.
• nabu:account: The account that changed its presence
status.
• nabu:previousPresenceChange: The last logged
presence change of the account nabu:account. All logged
presences of a user are chronologically linked via the
nabu:previousPresenceChange property.</p>
    </sec>
    <sec id="sec-5">
      <title>4 Annotations</title>
      <p>Nabu enables users to add their own statements to the RDF
store. This makes it possible for users to add metadata to
logged messages and share this metadata with their peers.
For example, a user could set up a set of categories to file
his conversations to facilitate later searching. He could do
this manually or could use a text classifier and categorize
automatically. Since instant messages are heavily dependent on
their context (for example, imagine a user receiving an e-mail
with a question and answering via instant messaging), one
may also decide to use annotations to link the messages to
their context. We will discuss this in sections 7 and 8.</p>
      <p>The user can add any statement he likes (except for
statements from the Nabu schema, see below), but it is good
practice to reuse commonly used ontologies. Widespread
vocabularies for metadata and categorization are Dublin Core<sup>11</sup>
and SKOS<sup>12</sup>. We have to stress that Nabu does not restrict
the user to annotating messages with concepts: it is
perfectly possible to link arbitrary RDF constructs to chat
messages. This way Nabu is flexible enough to allow tagging
messages with context information such as “The user was
looking at the DFKI website when sending this message”.
Using Nabu’s observation features (see section 7), software
running on the user’s machine can automatically annotate
new messages when they arrive with information the
annotation software has access to. It is also possible to create
semantic links between messages: for example, one can
implement a more complex heuristic for determining
the reply chain of a message (see section 3.1 for an
explanation of why this is not trivial). Adding this information
to messages does not require extending Nabu; a
client-side component can add it through message annotations
instead.
<fn id="fn11"><label>11</label><p>http://dublincore.org/</p></fn>
<fn id="fn12"><label>12</label><p>http://www.w3.org/2004/02/skos/</p></fn></p>
      <p>As an example of a simple manual annotation, let us
classify a (fictional) message: “Hi Frank, I have yet another great
feature for Nabu you could implement”. To make searching
easier, we want to specify the project the message is related
to, in this case Nabu.</p>
      <p>The CREATESTATEMENT request adds annotations to a
message:</p>
      <preformat>CREATESTATEMENT RESOURCE
http://&amp;foo;Message-101
http://purl.org/dc/terms/subject
http://foo/Categories/Projects/Nabu</preformat>
      <p>The first argument must be one of RESOURCE or
LITERAL and indicates whether the object of the statement
should be handled as resource URI or as literal string. The
following tokens are the (subject, predicate, object) triple
representing the statement shown in Figure 5.</p>
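      <p>To make the request format concrete, the following sketch parses a CREATESTATEMENT request into its parts. It assumes that the tokens are separated by whitespace and that, for LITERAL objects, the remainder of the request is the literal text; neither detail is specified above, and the URIs in the usage lines are placeholders.</p>
      <preformat><![CDATA[
```python
from collections import namedtuple

Statement = namedtuple("Statement", "object_kind subject predicate object")


def parse_create_statement(command):
    """Parse 'CREATESTATEMENT RESOURCE|LITERAL subject predicate object'."""
    # maxsplit=4 keeps a literal object that contains spaces in one piece.
    parts = command.split(None, 4)
    if len(parts) != 5 or parts[0] != "CREATESTATEMENT":
        raise ValueError("expected: CREATESTATEMENT RESOURCE|LITERAL s p o")
    _, kind, subject, predicate, obj = parts
    if kind not in ("RESOURCE", "LITERAL"):
        raise ValueError("first argument must be RESOURCE or LITERAL")
    obj = obj.strip()
    if kind == "RESOURCE" and len(obj.split()) != 1:
        raise ValueError("a resource URI must not contain whitespace")
    return Statement(kind, subject, predicate, obj)


stmt = parse_create_statement(
    "CREATESTATEMENT RESOURCE\n"
    "http://foo/Message-101\n"
    "http://purl.org/dc/terms/subject\n"
    "http://foo/Categories/Projects/Nabu"
)
```
]]></preformat>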
      <p>The annotations are reified, which means that each
statement itself becomes a resource that is linked to its subject and
object. This makes it possible to add properties to the
statement. In Nabu, every user-added statement has a property
statedBy that links to the creator of the statement. In our
example, this is frank@jabber.foo.org. The statement is “owned”
by the linked account, and only this account can delete the
statement. Also, users can read the statedBy property and decide
whether they trust the statement or not. Alice might decide
that annotations made by Charlie are useful and take them
into consideration, but ignore Bob’s statements.</p>
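      <p>The effect of reification and of the statedBy property can be illustrated with a small in-memory model. The class and method names below are hypothetical and only mirror the rules stated above: a statement records its creator, only the creator may delete it, and readers may filter statements by the accounts they trust.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReifiedStatement:
    """A statement that is itself a resource and can carry its own properties."""
    uri: str
    subject: str
    predicate: str
    obj: str
    stated_by: str  # account that created, and therefore owns, the statement


class AnnotationStore:
    def __init__(self):
        self._statements = {}

    def add(self, statement):
        self._statements[statement.uri] = statement

    def delete(self, uri, requester):
        """Only the account recorded in stated_by may delete a statement."""
        statement = self._statements[uri]
        if statement.stated_by != requester:
            raise PermissionError(f"{requester} does not own {uri}")
        del self._statements[uri]

    def trusted(self, trusted_accounts):
        """Keep only statements made by accounts the reader trusts."""
        return [s for s in self._statements.values()
                if s.stated_by in trusted_accounts]


store = AnnotationStore()
store.add(ReifiedStatement("stmt-1", "Message-101", "dc:subject",
                           "Projects/Nabu", "charlie@jabber.foo.org"))
store.add(ReifiedStatement("stmt-2", "Message-101", "dc:subject",
                           "Projects/Other", "bob@jabber.foo.org"))
alice_view = store.trusted({"charlie@jabber.foo.org"})
```
]]></preformat>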
      <p>Nearly every kind of RDF statement can be added. The
only restriction is that properties from the Nabu ontology are
not allowed in user statements. For example, dc:subject (dc = Dublin
Core<sup>13</sup>) is valid, but nabu:inRoom is not, because the
predicate nabu:inRoom is part of the Nabu ontology. This
prevents users from corrupting (deliberately or not) the Nabu
archive or compromising privacy settings. Properties from
the Nabu ontology are managed by the server and can only be
modified indirectly by commands of the Nabu protocol.</p>
    </sec>
    <sec id="sec-6">
      <title>5 Log Sharing and Privacy Settings</title>
      <p>To gain acceptance for Nabu and server-side logging in
general, it is important to ensure the user’s privacy. This means
that Nabu must
1. Leave the user in full control over what is logged.
2. Allow users to delete sensitive information at any time.
3. Allow access to the archive only through a clearly
defined interface that handles authentication and respects
the privacy settings.
4. Implement conservative default settings (i.e., disable
logging, use restrictive privacy settings)</p>
      <p>On the other hand, one of Nabu’s goals is to encourage
sharing between peers to make valuable information
available to others when wanted. Therefore, a privacy model is
needed that both ensures privacy and allows
sharing of conversation logs.</p>
      <p>In Nabu, every user is the owner of the messages he has
sent, and he can control who can read his messages or delete
them later if he wants. This means that a message is under
control of the message sender only. If two users have a
conversation, each user is responsible for his own messages and
has no control over the messages he received from his dialog
partner.</p>
      <p>Access control is managed via privacy policies. A privacy
policy contains a set of rules that control read permissions by
allowing or denying access to certain accounts or groups of
accounts. Every message logged has a link to a privacy
policy that controls the access to the message, and every user
has a list of policies he can assign to logged messages. There
is always exactly one policy active at any one time.
Whenever the user writes a message, Nabu logs the message and
links it to the currently active policy. Policies are linked, not
copied: for instance, if the user adds a new account to his
policy “friendsOnly”, the added account gains access to all
archived messages that already use the “friendsOnly” policy.</p>
      <p>What does it actually mean that a resource is not
accessible? If a message (or any resource in general) is not
accessible, the resource itself and its concise bounded
description<sup>14</sup> are completely hidden from the user: the resource
itself and all links to or from the resource are hidden. When
querying the model, the resource does not show up in the
results. For messages this means that neither the message
content nor any links to the message are visible. This includes
annotations: if an annotation was added to link the message
to a category, this statement is not visible. This is
important, because we do not want other users to read the topics
we were talking about, even if they cannot read the actual
message content.
<fn id="fn13"><label>13</label><p>http://dublincore.org/documents/dces/</p></fn>
<fn id="fn14"><label>14</label><p>http://sw.nokia.com/uriqa/CBD.html</p></fn></p>
    </sec>
    <sec id="sec-7">
      <title>5.1 Privacy Policies in detail</title>
      <sec id="sec-7-1">
        <p>Every privacy policy has
• a name
• an owner
• a set of rules allowing or denying access to an account
or a group of accounts</p>
        <p>The name is an arbitrary string without spaces, e.g.,
default, friends, workGroup. In the requests for policy
management the name is used to identify the policy. Thus the policy
name must be unique for a user (but of course two users can
use the same name without conflicts).</p>
        <p>The owner is the account that owns the policy. The owner
can edit the policy and add or remove rules. The policy
always implicitly grants access to the owner, so the owner can
access his own messages even if the rules would deny it. A
user can only change the policy of a message
if he owns the currently linked policy.</p>
        <p>The rules: A policy can contain any number of rules of the
form “allowAccount &lt;accountURI&gt;”, “denyAccount
&lt;accountURI&gt;”, “allowGroup &lt;groupName&gt;”, “denyGroup
&lt;groupName&gt;”.</p>
        <p>The rules are applied in (deny, allow) order. If access is not
explicitly allowed, it is denied. That is, a policy without any
rules denies all access (except to the policy owner).</p>
        <p>If rules exist that both allow and deny access to an account,
the deny rule takes precedence and access is denied.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>5.2 Groups</title>
      <p>As mentioned before, access permissions can be set not only
per user but also per group. A group is a plain set of accounts,
set up by the user to make privacy management easier. For
example, a user could set up a group “friends” and add the
accounts of his friends to this group. Instead of allowing
access per account, he can use a single “allowGroup friends” rule,
and all accounts in the group gain access.</p>
    </sec>
    <sec id="sec-9">
      <title>5.3 Examples</title>
      <p>Here we present some examples demonstrating how privacy
policies can be applied.</p>
      <p>There are five accounts, Alice, Bob, Charlie, Daniel and
Emily. Alice is the owner of the policies, and she created
a group friends with Bob and Emily in it. Note that in the
implementation, full account URIs are used, but we use Alice
instead of http://foo/Accounts/jabber.foo.org/alice for clarity
here.</p>
      <p>The following policy allows access to Alice (as she is the
owner), Daniel and Bob.</p>
      <preformat>policyOwner Alice
allowAccount Daniel
allowAccount Bob</preformat>
      <p>A policy owned by Alice with the rules “allowGroup friends”
and “allowAccount Charlie” allows access to Alice (as she is the
owner), to the friends group (Bob and Emily), and to Charlie.
Only Daniel may not read the resource (nor can the
rest of the world).</p>
      <p>In the following example, the first directive allows access
to the friends group, i.e., Bob and Emily, but as the second
directive explicitly denies access to Bob, only Emily has
access (and Alice, of course).</p>
      <preformat>policyName friendsWithoutBob
policyOwner Alice
allowGroup friends
denyAccount Bob</preformat>
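      <p>The evaluation rules for privacy policies (the owner is always allowed, deny rules are applied before allow rules, and access is denied by default) can be checked against this example with a small sketch. The class layout and the function name are assumptions made for illustration and do not reflect Nabu’s internal RDF representation.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    owner: str
    allow_accounts: set = field(default_factory=set)
    deny_accounts: set = field(default_factory=set)
    allow_groups: set = field(default_factory=set)
    deny_groups: set = field(default_factory=set)


def may_read(policy, account, groups):
    """Evaluate a policy; groups maps a group name to its set of accounts."""

    def in_any(group_names):
        return any(account in groups.get(name, set()) for name in group_names)

    if account == policy.owner:
        return True   # the owner is always granted access
    if account in policy.deny_accounts or in_any(policy.deny_groups):
        return False  # deny rules take precedence over allow rules
    # Anything that is not explicitly allowed is denied.
    return account in policy.allow_accounts or in_any(policy.allow_groups)


groups = {"friends": {"Bob", "Emily"}}
friends_without_bob = Policy("friendsWithoutBob", "Alice",
                             allow_groups={"friends"}, deny_accounts={"Bob"})
readers = [account
           for account in ["Alice", "Bob", "Charlie", "Daniel", "Emily"]
           if may_read(friends_without_bob, account, groups)]
```
]]></preformat>
      <p>With the group and policy of the example, only Alice and Emily end up in the list of readers, which matches the description above.</p>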
      <p>Internally, privacy policies are realized using the RDF
shown in Figure 6. The parts of the RDF shown in black can
be queried using Nabu’s query features. The other parts can
only be retrieved and manipulated using Nabu commands.</p>
      <p>This graph shows the last policy example. The policy
friendsWithoutBob is owned by Alice. It allows access to her
friends group, but denies it for Bob. The policy is attached
to a message someMessage which was sent by Alice. While
messages, groups, and accounts are part of the public model,
the privacy policy itself and all properties like allowGroup,
denyAccount and hasPolicy are stored in the internal model.</p>
    </sec>
    <sec id="sec-10">
      <title>5.4 Account Settings</title>
      <p>For every account stored in the public model, the
internal model contains a corresponding AccountSettings instance
saving settings related to this account.</p>
      <p>The example graph in Figure 7 shows the AccountSettings
instance of alice@jabber.foo.org. It has the following
properties:
• internal:messageLoggingEnabled and
internal:presenceLoggingEnabled store whether message
and presence logging are enabled. Both default to
false.
• internal:policies links to the privacy policies owned by
the account.
• internal:activePolicy links to the currently active policy.
This policy is attached when a presence change or
message is logged.
• internal:lastPresence links to the last logged presence
change of the account. This makes it fast and
easy for the logger to find the last presence and link new
presences to it via the nabu:previousPresenceChange
property.</p>
    </sec>
    <sec id="sec-11">
      <title>6 Querying the Archive</title>
      <p>Once Nabu logs a user’s conversations, the user
probably wants to search them at some point. For
querying the archive, Nabu uses the SPARQL query
language [Eric Prud’hommeaux, 2005]. SPARQL is a language
for querying RDF stores, similar to SQL. Being powerful and
versatile, it allows arbitrarily complex queries. Unfortunately,
it is also quite complex for everyday use, so a GUI for the
most common queries would be desirable.</p>
      <p>Example: One wants to search for all messages containing
“Nabu”. The following command performs this search:</p>
      <preformat>QUERY SPARQL
DESCRIBE ?msg
WHERE { ?msg nabu:body ?body .
        FILTER REGEX(?body, "Nabu", "i") }</preformat>
      <p>The query returns all messages ?msg that have a body
?body matching the regular expression “Nabu” (“i” makes the
search case-insensitive). For simple string searches, there is
also a shortcut available in Nabu in the form of the “QUERY
SEARCHMSG” command.</p>
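      <p>Client software can assemble such a command from a search term. The following sketch assumes the command layout shown above and escapes the term so that it is matched literally; the function name is hypothetical.</p>
      <preformat><![CDATA[
```python
import re


def search_command(term):
    """Build a 'QUERY SPARQL' command for messages whose body contains term."""
    pattern = re.escape(term)  # match the term literally, not as a pattern
    # Escape backslashes and quotes so the pattern fits into a SPARQL
    # string literal.
    literal = pattern.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "QUERY SPARQL\n"
        "DESCRIBE ?msg\n"
        "WHERE { ?msg nabu:body ?body .\n"
        f'        FILTER REGEX(?body, "{literal}", "i") }}'
    )


command = search_command("Nabu")
```
]]></preformat>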
      <p>Nabu will return the messages matching the query as
RDF/XML. For instance, Nabu might return one message,
containing “Me thinks, Nabu rocks big time”:</p>
      <preformat>210 &lt;rdf:RDF xmlns:rdf=...
&lt;Message rdf:about="&amp;foo;Message-094210.520"&gt;
  &lt;body&gt;Me thinks, Nabu rocks big time!&lt;/body&gt;
  &lt;previousMessageInRoom rdf:resource="&amp;foo;Message-143802.712"/&gt;
  &lt;inRoom rdf:resource="&amp;foo;P2PRoom-frank2/"/&gt;
  &lt;subject/&gt;
  &lt;messageType&gt;chat&lt;/messageType&gt;
  &lt;streamID/&gt;
  &lt;sender rdf:resource="&amp;foo;Account-frank2"/&gt;
  &lt;receivers rdf:resource="&amp;foo;Account-frank"/&gt;
  &lt;datetime&gt;2005-07-14T...&lt;/datetime&gt;
&lt;/Message&gt;
&lt;/rdf:RDF&gt;</preformat>
      <p>Note: Nabu returns only RDF data that has been declared
as accessible. By default, this includes all messages that have
been sent or received by the user who issues the query.
Normally, he will not see messages exchanged between other users.
If a user decides to, he can grant others access to a
conversation log (for example, co-workers might decide to share the
log of an online meeting with other team members).</p>
    </sec>
    <sec id="sec-12">
      <title>7 User Observation</title>
      <p>One topic addressed in the research project EPOS [Dengel et
al., 2002] is user observation: By observing the user’s
actions, EPOS tries to identify the context of the desktop in
order to support the user in his work [Schwarz, 2005].
Depending on the current context of a user, different contacts,
files or other resources are relevant. For instance, the context
information can be used to present currently relevant contacts
from the addressbook to the user. EPOS implements this
using an assistant bar, which is a desktop panel listing relevant
contacts, resources, and projects.</p>
      <p>Observation data is collected by plugins for the
user’s applications, e.g., word processors, WWW browsers
or mail clients. Each plugin observes the user’s actions in
the respective application and sends them to a central context
elicitation component. Nabu offers this functionality for
instant messaging, notifying EPOS of messages sent from or to the observed
user.</p>
      <p>It is important to note that in Nabu, user observation is fully
controlled by the observed user. It must be activated by the
user and can be stopped at any time. Observing applications
need the password of the observed account in order to register
at the server.</p>
      <p>Usually observation will be integrated into the context
framework by using the client API that comes with Nabu.</p>
      <p>The observation works as follows: To observe messages
from and to Alice (alice@myserver.org), the observer
program logs in at the server as alice@myserver.org, like the
user does with her graphical client. The observer program
must use its own resource, e.g., “observation”. To start the
observation, the program sends</p>
      <preformat>OBSERVEMESSAGES on observation</preformat>
      <p>to the server (to test observation, this can also be sent
manually to the bot). This registers the resource “observation” as
an observer. From now on, all messages Alice sends or receives
are reported to the “observation” resource. A notification
message consists of a subject, containing the URI of the
reported message, and a body, containing the message
CBD<sup>15</sup>.</p>
    </sec>
    <sec id="sec-13">
      <title>8 Applications</title>
      <p>Numerous applications can be realized with the techniques
presented. Let us enumerate some of these.</p>
      <p>• A message archive – This is Nabu’s most obvious
application. As Nabu is a server-side component, it works similarly
to IMAP: users may access their messages in the
central archive from anywhere. Also, because there is a single
central archive, no inconsistencies can occur.</p>
      <p>• A semantic store – As items stored in Nabu can be
annotated, messages bear not only syntax but also semantics.</p>
      <p>• A powerful message search platform – Using SPARQL,
the archive supports both full-text search and semantic
search exploiting the relations specified in the message’s
annotations.</p>
      <p>• An exchange platform for information – As a user’s
Nabu repository features fine-grained access control,
other users may be granted access to a user’s messages
and message annotations, extending those users’
knowledge repositories.</p>
      <p>• A gateway for integrating instant messages into your
personal information model – Nabu enables any
RDF-capable software to access the user’s instant messages.
This way, instant messages can be integrated into the
user’s personal information model in frameworks such
as Gnowsis [Sauermann and Schwarz, 2004].</p>
      <p>• A personal semantic knowledge base – Nabu is not only
about instant messages. It can store anything that may
be represented in RDF. Together with Nabu’s access
control and the access mechanisms provided by the XMPP
protocol, this yields a simple but powerful personal shareable
semantic knowledge base. As Nabu is intended to
run on a server that is continuously available, this solves
problems with sporadic availability of data in case the
repository is implemented on the user’s machine. Also,
most technical problems related to reachability due to
firewalls or network address translation scenarios do not
occur in this approach.</p>
    </sec>
    <sec id="sec-14">
      <title>Conclusion and Further Work</title>
      <p>The Nabu project is an attempt to bring the Semantic Web and
instant messaging together, making the increasing amount of
information exchanged via instant messaging accessible
using Semantic Web technology.</p>
      <p>An ontology was developed to describe instant messaging
conversations. Using the RDF standard to represent the data
and the promising SPARQL query language for user queries,
Nabu integrates well into existing Semantic Web
infrastructures. To make better use of the stored information, users can
attach metadata to their logs.</p>
      <p>A privacy model was developed to control the
accessibility of RDF data, an area where no proven implementations or
standards exist yet. It works at the resource level, so
accessibility is controlled per resource. Although it has
limitations when one needs more fine-grained control, such as hiding
only certain properties, it works well for Nabu.</p>
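      <p>To make the resource-level granularity concrete, the following minimal sketch filters RDF-like triples by the resource they describe. It is an illustration under assumptions, not Nabu’s privacy implementation: the names Triple, visible_triples, and readers_by_resource, the ex: predicates, and the urn:example: URIs are all hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def visible_triples(triples, readers_by_resource, user):
    """Return the triples whose subject resource `user` is allowed to read."""
    return [t for t in triples
            if user in readers_by_resource.get(t.subject, set())]


# Hypothetical data: one logged message and one private note.
triples = [
    Triple("urn:example:msg-1", "ex:body", "Lunch at noon?"),
    Triple("urn:example:msg-1", "ex:annotation", "project kickoff"),
    Triple("urn:example:note-7", "ex:body", "private reminder"),
]
readers_by_resource = {
    "urn:example:msg-1": {"alice@myserver.org", "bob@myserver.org"},
    "urn:example:note-7": {"alice@myserver.org"},
}

bob_view = visible_triples(triples, readers_by_resource, "bob@myserver.org")
carol_view = visible_triples(triples, readers_by_resource, "carol@myserver.org")
print(len(bob_view), len(carol_view))
```
]]></preformat>
      <p>In this sketch a reader gets either every triple of a resource or none of them. Hiding only the ex:annotation triple from Bob cannot be expressed, which is the kind of limitation mentioned above.</p>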
      <p>The concepts were implemented as an extension for the
Jive XMPP server. This proof-of-concept implementation is
available<sup>16</sup> and can be used by interested parties to integrate
instant messaging and the Semantic Web. In the DFKI KM
working group, it is already used in the EPOS [Dengel et al., 2002]
project. The user observation functionality has been
successfully integrated into the context elicitation.</p>
      <p><sup>15</sup> http://sw.nokia.com/uriqa/CBD.html#definition</p>
      <p><sup>16</sup> http://nabu.opendfki.de/</p>
      <p>Nabu is still a prototype. To make it suitable for
widespread use, more effort and feedback are needed. The main
issues are:</p>
      <p>• Nabu’s user interface is currently text-based. This is
flexible because it can be used on any platform and with
any client, but is neither convenient nor user-friendly.
Graphical frontends would be desirable, preferably
integrated in client software (via plugins). Other options are
a web frontend or integration in frameworks for desktop
search.</p>
      <p>• Evaluation is needed to find out whether the chosen
privacy model meets the users’ requirements. This needs
experience from daily use by “real users”, as different
usage patterns need different privacy models. At the
moment, there is always exactly one policy active at a time. The
advantage is that it is easy to manage and clear which
policy is used for the current chat. It would also be possible
to specify a policy for specific chats, e.g., “everything
I write in MUC room #workgroup should be readable
by the whole workgroup”. While this is more
powerful, it has the disadvantage that the user could forget
about the channel-specific setting and share information
with more people than intended. A third option would be
to always use restrictive privacy settings when logging
(i.e., only participants can read messages). Users would
manually share the log afterwards by marking the
conversation in their client plugin and assigning a less
restrictive policy. This usage pattern is already supported:
the user just has to leave the default policy active and
assign other custom policies to logged messages.</p>
      <p>• Currently, Nabu is only accessible via the XMPP
protocol. In order to make the repository available to software
without requiring an XMPP library, it should be made
possible to query the archive using HTTP(S), XML-RPC, or
SOAP.</p>
    </sec>
    <sec id="sec-15">
      <title>Acknowledgments</title>
      <p>This work has been supported by a grant from The Federal
Ministry of Education, Science, Research, and Technology
(FKZ ITW–01 IW C01).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Andy Seaborne et al., 2005]
          <string-name><given-names>Andy</given-names> <surname>Seaborne</surname></string-name> et al.
          <source>Jena Semantic Web Framework</source>,
          <year>2005</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Berners-Lee et al., 2001]
          <string-name><given-names>Tim</given-names> <surname>Berners-Lee</surname></string-name>,
          <string-name><given-names>James</given-names> <surname>Hendler</surname></string-name>, and
          <string-name><given-names>Ora</given-names> <surname>Lassila</surname></string-name>.
          <article-title>The Semantic Web</article-title>.
          <source>Scientific American</source>,
          <volume>284</volume>(<issue>5</issue>):<fpage>34</fpage>-<lpage>43</lpage>,
          <year>2001</year>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
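        <mixed-citation>
          [Dengel et al., 2002]
          <string-name><given-names>Andreas</given-names> <surname>Dengel</surname></string-name>,
          <string-name><given-names>Andreas</given-names> <surname>Abecker</surname></string-name>,
          <string-name><given-names>Jan-Thies</given-names> <surname>Bähr</surname></string-name>,
          <string-name><given-names>Ansgar</given-names> <surname>Bernardi</surname></string-name>,
          <string-name><given-names>Peter</given-names> <surname>Dannenmann</surname></string-name>,
          <string-name><given-names>Ludger</given-names> <surname>van Elst</surname></string-name>,
          <string-name><given-names>Stefan</given-names> <surname>Klink</surname></string-name>,
          <string-name><given-names>Heiko</given-names> <surname>Maus</surname></string-name>,
          <string-name><given-names>Sven</given-names> <surname>Schwarz</surname></string-name>, and
          <string-name><given-names>Michael</given-names> <surname>Sintek</surname></string-name>.
          <article-title>Evolving Personal to Organizational Knowledge Spaces</article-title>.
          <source>Project Proposal, DFKI GmbH Kaiserslautern</source>,
          <year>2002</year>.
        </mixed-citation>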
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Dennis Quan and Karger, 2003]
          <string-name><given-names>Dennis</given-names> <surname>Quan</surname></string-name>,
          <string-name><given-names>David</given-names> <surname>Huynh</surname></string-name>, and
          <string-name><given-names>David R.</given-names> <surname>Karger</surname></string-name>.
          <article-title>Haystack: A platform for authoring end user semantic web applications</article-title>.
          <source>In International Semantic Web Conference</source>,
          pages <fpage>738</fpage>-<lpage>753</lpage>,
          <year>2003</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Eisenstadt and Dzbor, 2002]
          <string-name><given-names>Marc</given-names> <surname>Eisenstadt</surname></string-name> and
          <string-name><given-names>Martin</given-names> <surname>Dzbor</surname></string-name>.
          <article-title>BuddySpace: Enhanced Presence Management for Collaborative Learning, Working, Gaming and Beyond</article-title>.
          <source>Submission to JabberConf Europe 2002</source>,
          <year>2002</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
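        <mixed-citation>
          [Eric Prud'hommeaux, 2005]
          <string-name><given-names>Eric</given-names> <surname>Prud'hommeaux</surname></string-name> and
          <string-name><given-names>Andy</given-names> <surname>Seaborne</surname></string-name> (eds.).
          <article-title>SPARQL Query Language for RDF</article-title>.
          <source>W3C Working Draft, W3C</source>,
          <year>2005</year>.
        </mixed-citation>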
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Karneges and Paterson, 2004]
          <string-name><given-names>Justin</given-names> <surname>Karneges</surname></string-name> and
          <string-name><given-names>Ian</given-names> <surname>Paterson</surname></string-name>.
          <article-title>JEP-0136: Message Archiving</article-title>.
          <source>Jabber Enhancement Proposal</source>,
          <year>2004</year>. URL http://www.jabber.org/jeps/jep-0136.html.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Quan et al., 2003]
          <string-name><given-names>Dennis</given-names> <surname>Quan</surname></string-name>,
          <string-name><given-names>Karun</given-names> <surname>Bakshi</surname></string-name>, and
          <string-name><given-names>David R.</given-names> <surname>Karger</surname></string-name>.
          <article-title>A unified abstraction for messaging on the semantic web</article-title>.
          <source>In WWW (Posters)</source>,
          <year>2003</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Saint-Andre, 2002]
          <string-name><given-names>Peter</given-names> <surname>Saint-Andre</surname></string-name>.
          <article-title>JEP-0045: Multi-User Chat</article-title>.
          <source>Jabber Enhancement Proposal</source>,
          <year>2002</year>. URL http://www.jabber.org/jeps/jep-0045.html.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Sauermann and Schwarz, 2004]
          <string-name><given-names>Leo</given-names> <surname>Sauermann</surname></string-name> and
          <string-name><given-names>Sven</given-names> <surname>Schwarz</surname></string-name>.
          <article-title>Introducing the gnowsis semantic desktop</article-title>.
          <source>In Proceedings of the International Semantic Web Conference 2004</source>,
          <year>2004</year>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Schwarz, 2005]
          <string-name><given-names>Sven</given-names> <surname>Schwarz</surname></string-name>.
          <article-title>A Context Model for Personal Knowledge Management</article-title>.
          <source>In Proceedings of the IJCAI'05 Workshop on Modeling and Retrieval of Context, Edinburgh</source>,
          <year>2005</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>