=Paper= {{Paper |id=Vol-273/paper-15 |storemode=property |title=Organizing Publications and Bookmarks in BibSonomy |pdfUrl=https://ceur-ws.org/Vol-273/paper_25.pdf |volume=Vol-273 |dblpUrl=https://dblp.org/rec/conf/www/JaschkeGHKSS07 }} ==Organizing Publications and Bookmarks in BibSonomy== https://ceur-ws.org/Vol-273/paper_25.pdf
     Organizing Publications and Bookmarks in BibSonomy

Robert Jäschke¹,², Miranda Grahl¹, Andreas Hotho¹, Beate Krause¹,², Christoph Schmitz¹, Gerd Stumme¹,²

¹ Knowledge & Data Engineering Group, Dept. of Mathematics and Computer Science, Univ. of Kassel, Wilhelmshöher Allee 73, D-34121 Kassel, Germany
{lastname}@cs.uni-kassel.de

² Research Center L3S, Univ. of Hannover, Appelstr. 9a, D-30167 Hannover, Germany

ABSTRACT
BibSonomy is a web-based social resource sharing system which allows users to organise and share bookmarks and publications in a collaborative manner.

Apart from standard folksonomy features such as an intuitive user interface, navigation along all dimensions, or browser integration via RSS feeds, BibSonomy provides tag hierarchies, group management and privacy features, and numerous import and export functions.

Categories and Subject Descriptors
H.3.4 [Information Systems]: Systems and Software

Keywords
folksonomies

1. SHARING BOOKMARKS AND PUBLICATIONS WITH BibSonomy
Social resource sharing systems are web-based systems used to manage resources on the web in a collaborative way. Users can describe the resources with arbitrary words, so-called tags. The systems can be distinguished according to what kind of resources are supported. Flickr, for instance, allows the sharing of photos, del.icio.us the sharing of bookmarks, CiteULike¹ and Connotea² the sharing of bibliographic references, and 43Things³ even the sharing of personal goals and resolutions. Our own system, BibSonomy,⁴ can be used for sharing bookmarks and BibTeX entries simultaneously. In their core, these systems are all very similar. Once a user is logged in, he can add a resource to the system and assign arbitrary tags to it.

The collection of all users' tag assignments is called a folksonomy. The word 'folksonomy' (coined by Vander Wal in [10]) is a blend of the words 'taxonomy' and 'folk', and stands for conceptual structures created by the people. Folksonomies are thus a bottom-up complement to more formalized Semantic Web technologies, as they rely on emergent semantics [8, 9] which result from the converging use of the same vocabulary.

A typical user interface allows for exploration of the folksonomy in all dimensions: for a given user one can see all resources he has uploaded, together with the tags he has assigned to them; when clicking on a resource one sees which other users have uploaded this resource and how they tagged it; and when clicking on a tag one sees who assigned it to which resources (see Figure 1).

After an introduction to the user interface and architecture of BibSonomy, we give an overview of some of its advanced features. For the underlying data structure of the system and more details, we refer to [2].

¹ http://www.citeulike.org
² http://www.connotea.org
³ http://www.43things.com
⁴ http://www.bibsonomy.org

Copyright is held by the author/owner(s).
WWW2007, May 8–12, 2007, Banff, Canada.

1.1 User Interface
BibSonomy, with its more than 5,000 registered users, allows users to share bookmarks (i.e., URLs) as well as publication references. The data model of the publication part is based on BibTeX [6], a popular literature management system for LaTeX [5]. A typical list of posts is depicted in Figure 1, which shows bookmark and publication posts in a column layout containing the tag web. The page is divided into four parts: the header (showing information such as the current page and path, navigation links and search boxes), two lists of posts – one for bookmarks and one for publications – each sorted by date in descending order, and a list of tags related to the posts. This scheme holds for all pages showing posts and allows for navigation in all dimensions of the folksonomy.

Figure 1: BibSonomy displays bookmarks and BibTeX based bibliographic references simultaneously.

A detailed view of one bookmark post from the list in Figure 1 can be seen in Figure 2. The first line shows in bold the title of the bookmark, which has the URL of the bookmark as underlying hyperlink. The second line shows an optional description the user can assign to every post. The last two lines belong together and show detailed information: first, all the tags the user has assigned to this post (web, service, tutorial, guidelines and api); second, the user name of that user (hotho), followed by a note on how many users tagged that specific resource. These parts have underlying hyperlinks, leading to the corresponding tag pages of the user (/user/hotho/web,⁵ /user/hotho/service, . . . ), the user's page (/user/hotho) and a page showing all four posts (i.e., the one of user hotho and those of the 3 other people) of this resource (/url/r, where r is a hashed representation of the resource). The last part shows the posting date and time followed by links for actions the user can do with this post – depending on whether this is his own post (edit, delete) or another user's post (copy).

⁵ All paths given in parentheses are relative to http://www.bibsonomy.org.

Figure 2: Detail showing a single bookmark post

The structure of a publication post displayed in BibSonomy is very similar, as shown in Figure 3. The first line shows again the title of the post, which equals the title of the publication in BibTeX. It has an underlying link leading to a page which shows detailed information on that post. This line is followed by the authors or editors of the publication, the journal or book title and the year. The following lines show the tags assigned to this post by the user, whose user name comes next, followed by a note on how many people tagged this publication. As described for bookmark posts, these parts link to the respective pages. After date and time of the posting follow the actions the user can do, which in this case include picking the entry for later download, copying it, accessing the URL of the entry or viewing the BibTeX source code.

Figure 3: Detail showing a single publication post

1.2 Additional Features
This section briefly describes some extensions of BibSonomy which go beyond the basic folksonomy model and have evolved during the practical use of the system.

1.2.1 Tag Hierarchy
Tagging gained so much popularity in the past two years because it is simple and no specific skills are needed for it. Nevertheless, the longer people use systems like BibSonomy, the more often they ask for options to structure their tags. A user-specific binary relation ≺ between tags as described in our model of a folksonomy (cf. [2] for details) is an easy way to arrange tags.

Therefore we included this possibility in BibSonomy and extended it further to use it for conceptual navigation. For instance, it is possible, given a tag, to show all posts with one of the subtags of the given tag.

Figure 4 shows details of the relation management in BibSonomy. Relations can be edited manually, but usually they will be created in the normal tagging process by tagging a resource, e.g. as eclipse->java, expressing that the tags eclipse and java should be assigned to a particular resource, and that the pair (eclipse, java) should be inserted into the user's ≺ relation. When browsing the folksonomy, a user can decide if she wants to make use of the tag hierarchy, e.g. when querying for the tag java, resources that are tagged with eclipse can be included in the result set, even if they do not have the tag java themselves. In order to distinguish between simple tag queries and those involving subtags, we call the latter
one a query for java as a concept.

While it is not enforced upon the user how the ≺ relation is to be used, we expect that most of the time, it will be used in order to express subsumptions in an is-a hierarchy. The actual use we are observing confirms this assumption.

Figure 4: Relation and tag editor, with relations in the sidebar

1.2.2 Duplicate detection.
In particular for literature references there is the problem of detecting duplicate entries, because there are large variations in how users enter fields such as journal name or author. On the one hand it is desirable to allow a user to have several entries which differ only slightly. On the other hand one might want to find other users' entries which refer to the same paper or book even if they are not completely identical. Hence it is necessary to map these entries together to allow such browsing functionality.

To fulfill both goals we implemented two hashes to compare publication entries at different levels of granularity. One is used to warn the user if she posts very similar BibTeX entries twice, possibly creating unwanted redundancy. The other one is used to aggregate BibTeX entries that were posted by more than one user in a common view, providing an opportunity to pick the most complete one or copy over missing data to one's own version of the publication entry. The implemented solution does not allow for a common creation of a single (correct) entry in a wiki style, as we want to allow every user to store BibTeX in the way she likes without changes from other users.

1.2.3 Import of resources.
To encourage users to transition from other systems, we implemented an import functionality. For del.icio.us, this functionality also takes into account the del.icio.us bundles, which are named sets of tags. We map bundles to relations. Furthermore it is possible to import bookmark files of the Firefox⁶ web browser, where the typical folder hierarchy of the bookmarks can be added to the user's ≺ relation.

Existing BibTeX entries can be imported by uploading files, pasting BibTeX snippets into BibSonomy, or just marking a BibTeX entry on a web page and hitting the post button. Furthermore, for numerous digital library services including the ACM Digital Library, SpringerLink, arXiv, and CiteSeer, automatic screen scrapers for publication metadata are provided.

For unstructured publication metadata, such as the publication lists often found on researchers' web pages, semi-automatic extraction using the Mallet⁷ information extraction tool is supported.

1.2.4 Export facilities.
The content of any page in the BibSonomy user interface can be presented in different formats for export. Exporting BibTeX is accomplished by preceding the path of a URL with the string /bib – this returns all publications shown on the respective page in BibTeX format. For example, the page http://www.bibsonomy.org/bib/search/text+clustering returns a BibTeX file containing all literature references which contain the words "text" and "clustering" in their fulltext.

HTML-formatted⁸ publication lists can be exported, which can be easily included into personal homepages. For example, a user schmitz might want to tag his own publications with myown and the year of publication. This enables him to include the page http://www.bibsonomy.org/publ/user/schmitz/myown+2006 into his personal web page to get an automatically updated list of his publications on the web.

Other exports such as XML, RSS and BURST⁹ feeds, RDF according to the SWRC ontology, BibTeX and EndNote work similarly.

Furthermore, links can be provided to an OpenURL¹⁰ resolver. Such a resolver allows every user to find any publication presented in BibSonomy in her own local library.

⁶ http://www.mozilla.com/firefox/
⁷ http://mallet.cs.umass.edu/
⁸ A small heuristic is applied to handle special letters, e.g. with accents, but LaTeX commands are not used to format the output.
⁹ http://www.cs.vu.nl/~pmika/research/burst/BuRST.html
¹⁰ http://www.exlibrisgroup.com/sfx openurl.htm

1.2.5 Group Management and Privacy.
In many situations it is desirable to share resources only among certain people. If the resources can be public, then one could agree to tag them with a special tag and use that tag to find the shared resources. The disadvantage is that this could be undermined by other users (or spammers) by using the same tag. To solve this problem and also to allow resources to be visible only for certain users, we introduced groups in BibSonomy, which give users more options to decide with whom they share their resources.

2. CONCLUSION AND OUTLOOK

2.1 Summary
BibSonomy is, to the best of our knowledge, the only folksonomy system currently online which combines bookmark and publication management in a common user interface.

In addition to standard folksonomy features, BibSonomy provides capabilities for structuring a user's tag cloud, group and privacy management, and various import and export options, including screen scrapers for popular publication services as well as for unstructured publication lists.

2.2 Ongoing and Future Work
There are several important topics which we will address in the near future. As stated in the introduction, folksonomies can be seen as a lightweight knowledge representation. One major goal is therefore the convergence with the semantic web effort, which is also called Web 3.0. To reach this goal a more machine-understandable tagging is needed, which can be reached by using so-called "machine tags"¹¹ but also by developing new methods to extract semantics from folksonomies. Our next steps in this direction, but also to enhance the usability, are as follows:

Ranking: The FolkRank [3] ranking algorithm has been developed, which allows for a topic-specific ranking of folksonomy resources. Incorporating a ranking scheme to enhance the simple reverse-chronological presentation of posts is ongoing work.

Conceptual Clustering and Community Detection: We are currently investigating different approaches [4, 3, 7] for finding coherent clusters within the folksonomy; these clusters could be viewed as communities of users being interested in common topics.

Tag Recommender and Ontology Learning: As users are providing tag-tag relations in BibSonomy, we are currently investigating techniques which enable the semi-automatic learning of the tag-tag relation. A first step in that direction is the learning of subsumption relations using association rules on the folksonomy data set [7]. The same techniques can also be used to generate tag recommendations.

API: Experience has shown that an Application Programming Interface (API) is crucial for a folksonomy system to gain success. It is something which has been requested by many people and which allows for easy interaction of BibSonomy with other systems. Hence we are currently investigating several approaches to add an API to BibSonomy. Most systems use lightweight APIs similar to the idea of REST [1], which can be used and accessed in a very straightforward and easy-to-implement fashion. Nevertheless, with SOAP¹² there exists a standard for web services which should also be taken into account. Since the process of defining an API for BibSonomy has just started, this is still an open task.

¹¹ http://www.flickr.com/groups/api/discuss/72157594497877875/
¹² http://www.w3.org/TR/soap/

3. REFERENCES
[1] Roy T. Fielding. Architectural Styles and the Design of Network-based Software Architectures. PhD thesis, University of California, Irvine, 2000.
[2] Andreas Hotho, Robert Jäschke, Christoph Schmitz, and Gerd Stumme. BibSonomy: A social bookmark and publication sharing system. In Proceedings of the Conceptual Structures Tool Interoperability Workshop at the 14th International Conference on Conceptual Structures, pages 87–102, 2006.
[3] Andreas Hotho, Robert Jäschke, Christoph Schmitz, and Gerd Stumme. Information retrieval in folksonomies: Search and ranking. In Proceedings of the 3rd European Semantic Web Conference, LNCS, pages 411–426, Budva, Montenegro, June 2006. Springer.
[4] Robert Jäschke, Andreas Hotho, Christoph Schmitz, Bernhard Ganter, and Gerd Stumme. Trias - an algorithm for mining iceberg tri-lattices. Hong Kong, December 2006. (to appear).
[5] Leslie Lamport. LaTeX: A Document Preparation System. Addison-Wesley, 1986.
[6] Oren Patashnik. BibTeXing, 1988. (Included in the BibTeX distribution).
[7] Christoph Schmitz, Andreas Hotho, Robert Jäschke, and Gerd Stumme. Mining association rules in folksonomies. In Proc. IFCS 2006 Conference, pages 261–270, Ljubljana, July 2006.
[8] S. Staab, S. Santini, F. Nack, L. Steels, and A. Maedche. Emergent semantics. Intelligent Systems, IEEE [see also IEEE Expert], 17(1):78–86, 2002.
[9] L. Steels. The origins of ontologies and communication conventions in multi-agent systems. Autonomous Agents and Multi-Agent Systems, 1(2):169–194, October 1998.
[10] Thomas Vander Wal. Folksonomy definition and wikipedia. November 2005.
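
As an illustration of the URL-based export scheme described in Section 1.2.4, the following Python sketch builds an export URL by prefixing the path of a page URL. It is a minimal sketch under stated assumptions, not BibSonomy's implementation: the function name export_url and the table EXPORT_PREFIXES are hypothetical, and treating /publ as the HTML counterpart of the /bib prefix is inferred only from the two example URLs given in that section. Prefixes for the other export formats are not spelled out in the text and are therefore left out.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical lookup table. Only "bib" is stated explicitly in the text;
# "publ" is an assumption read off the example URL for HTML lists.
EXPORT_PREFIXES = {"bibtex": "bib", "html": "publ"}


def export_url(page_url: str, fmt: str) -> str:
    """Return page_url with the export prefix for fmt placed before its path."""
    prefix = EXPORT_PREFIXES[fmt]  # raises KeyError for formats not listed above
    parts = urlsplit(page_url)
    # Precede the existing path with "/<prefix>", e.g. /search/x -> /bib/search/x
    new_path = "/" + prefix + parts.path
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(export_url("http://www.bibsonomy.org/search/text+clustering", "bibtex"))
    print(export_url("http://www.bibsonomy.org/user/schmitz/myown+2006", "html"))
```

Because the export format is encoded in the path, any page that lists posts can be turned into an export link without further parameters, which is what makes embedding such a link in a personal homepage straightforward.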