<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nikolas Landia</string-name>
          <email>N.Landia@warwick.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sarabjot Singh Anand</string-name>
          <email>S.S.Anand@warwick.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Warwick</institution>
          ,
          <addr-line>Coventry CV4 7AL</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Personalised tag recommenders are becoming increasingly important since they are useful for many document management applications, including social bookmarking websites. This paper presents a novel approach to suggesting personalised tags for a new document to the user. Document similarity in combination with a user similarity measure is used to recommend personalised tags. If the existing tags in the system do not seem suitable for the user-document pair, new tags are generated from the content of the new document as well as existing documents using document clustering. A first evaluation of the system was carried out on a dataset from the social bookmarking website Bibsonomy. The results of this initial test indicate that adding personalisation to an unsupervised system through our user similarity measure increases the precision of the system.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
<p>General Terms: Algorithms</p>
      <p>Keywords: tag recommendation, tag extraction, social bookmarking</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>With the advancement of Web 2.0 and social bookmarking
websites, the recommendation of tags for these systems is
becoming increasingly important. The major difficulties with
tag recommendation for social bookmarking are that tags
are related to specific users, and that the documents, users and
tags might be unknown to the recommender system.
Suggested tags have to be personalised and the system has to
be able to successfully recommend tags for scenarios where
the document, user and/or appropriate tag are not in the
training set.</p>
      <p>
        Existing approaches to automated tagging include
supervised as well as unsupervised tag recommender systems.
Classifiers [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and clustering [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have been used when a set
of predefined tags is known and the task is to assign these
tags to new documents. However, these supervised systems
do not have a solution to the new tag problem, akin to the
new item problem in recommender systems. Clustering can
be used to generate tags from the vocabulary of the
document set and is thus able to overcome the new tag problem.
However, tags from a vocabulary different from that of the
document’s authors will not be recommended. The
supervised and unsupervised techniques on their own ignore
information about users that may be available to the system.
People differ in their interests (documents/topics),
vocabulary (tags) and context. In order to successfully recommend
tags to users, vocabulary and tagging habits have to be taken
into consideration.
      </p>
      <p>In this paper we suggest a solution to this problem that
consists of clustering the existing documents in order to
identify sets of similar documents which in turn identifies
the set of users whose tags may be propagated to the
current target user-document pair. A list of potential tags for
the new user-document pair is obtained from the tags of the
similar documents. A score is then calculated for each
potential tag by taking a weighted combination of the similarity of
the document that the tag is assigned to and the similarity
of the user who assigned it, averaged over all posts the tag
appears in. If the score of a tag is below a set threshold,
it means that this tag is unsuitable for the user-document
pair. When the number of suitable tags is below a
predefined number t, new tags are generated from the content of
the document.</p>
    </sec>
    <sec id="sec-3">
      <title>DOCUMENT CLUSTERING</title>
      <p>
        There are various techniques that can be used to
cluster documents. The clustering approach presented by Song
et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] takes into consideration document words and
associated tags. Song’s approach clusters documents into a
predefined number of clusters nc. When finding the cluster
for a new document, each of the nc clusters is considered
and a cluster membership vector is generated for the new
document. The labelling of a document thus has a linear
time complexity O(nc). In our unsupervised approach the
number of clusters is defined by the data and indirectly by
the number of tags w considered in document clustering.
When finding the cluster for the new document, the whole
tree of clusters is considered, so that a document dealing
with a specialised topic is assigned to a leaf node while a
document dealing with a more general, broad topic is
assigned to a cluster higher up the tree. Nevertheless, the
labelling of documents in our system has a time complexity
of O(log nc) as explained later. The main advantage of our
approach is that it can find the level of specialisation of a
document in a topic area and the potential set of tags for the
new document reflects its level of specialisation. Moreover,
the system is able to generate new tags.
      </p>
      <p>The algorithm proposed here is a two part unsupervised
document clustering method consisting of a divisive and an
agglomerative clustering part. The aim of the algorithm is
to organise documents into coherent clusters and to assign
a predefined number, w, of unpersonalised tags to the
documents based on the clusters they belong to. Note that
these w tags will be generated from words which appear in
the word corpus of the document dataset.</p>
      <p>
        In the first stage of clustering (the divisive part),
documents are clustered using bisecting K-Means [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], for which
different initialisation techniques were examined. Division
of clusters is performed until the documents in a cluster
satisfy the condition that they share at least s of their w
most important words (defined by their tf-idf-score) within
the cluster. The second clustering stage (the agglomerative
part) takes the final set of clusters generated by the
divisive part and merges them hierarchically to create a tree
structure of clusters.
      </p>
      <p>
The document representation used for clustering is
bag-of-words. Stop-words are removed and the dimensionality
of the data set is reduced to n using document frequency,
which proved to be an effective and cheap technique for
dimensionality reduction in the experiments carried out by
Yang and Pedersen [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
<p>Divisive Part</p>
      <p>1. Define the number of tags w required to be assigned
to each document. This parameter, along with the
parameter s below, indirectly determines the number of
document clusters (leaf nodes in the document
hierarchy).
2. For each document, find the w words with the highest
tf-idf score, tfidf d, as the tags for that document.
3. Put all documents in one cluster.
4. Find w words as the tags on the cluster level for this
cluster (see the section on finding tags at cluster level).
5. Check whether the cluster contains only documents
which share at least s tags with the cluster level,
1 &lt;= s &lt;= w. If yes, the final cluster for this recursion
path has been found, stop; if no, proceed to split the
cluster into two.
6. Split the cluster by finding two new cluster centroids
(as explained in Initialising K-Means below) and
assigning each of the documents in the cluster to one of
the new centroids using cosine similarity.
7. For each of the two new clusters, perform steps 4-6.</p>
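      <p>Assuming documents are represented as sparse tf-idf dictionaries, the divisive steps above might be sketched as follows. The helper names are ours, and cluster-level tags are approximated here by the highest-scoring words of the cluster mean (the first of the two options discussed in the next section); this is an illustrative sketch, not the paper's implementation.</p>
      <preformat>
```python
import math

def cosine(a, b):
    # cosine similarity between two sparse tf-idf vectors (dicts)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(docs):
    # centroid (mean) of a set of sparse vectors
    m = {}
    for d in docs:
        for k, v in d.items():
            m[k] = m.get(k, 0.0) + v
    return {k: v / len(docs) for k, v in m.items()}

def top_words(vec, w):
    # the w words with the highest tf-idf score in a vector
    return set(sorted(vec, key=vec.get, reverse=True)[:w])

def bisect(docs, w, s):
    # recursively split a cluster until every document shares at least
    # s of its top-w words with the cluster-level tags (steps 3-7)
    tags = top_words(mean_vector(docs), w)
    conform = [d for d in docs if len(top_words(d, w).intersection(tags)) >= s]
    if len(conform) == len(docs) or len(docs) == 1:
        return [docs]  # stopping condition satisfied: leaf cluster found
    rest = [d for d in docs if d not in conform]
    # first centroid: mean of the documents that satisfy the end condition
    c1 = mean_vector(conform) if conform else docs[0]
    # second centroid: the non-conforming document farthest from c1
    c2 = min(rest, key=lambda d: cosine(d, c1))
    left, right = [], []
    for d in docs:
        (left if cosine(d, c1) >= cosine(d, c2) else right).append(d)
    if not left or not right:
        return [docs]
    return bisect(left, w, s) + bisect(right, w, s)
```
</preformat>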
      <sec id="sec-3-1">
        <title>Finding Tags at Cluster Level</title>
        <p>Tags at the cluster level can be found by
• taking the w words with the highest average tf-idf score
across all documents in the cluster
• recalculating the tf-idf on cluster level tfidf c, treating
clusters as single documents and the words from all
documents within a cluster as members of this single
document
The second approach of recalculating the tf-idf values at
the cluster level has two advantages. Firstly, it makes sure
that the tags found for the given cluster are good for
distinguishing it from other clusters. Moreover, it is interesting to
observe the change in cluster tags on different levels of the
clustering hierarchy. On the lowest level where clusters are
smallest, the cluster tags will be very specific to the topic
of the documents that are in the cluster. Higher up the
tree, when documents of different specialised topics share
the same cluster, the words that are common between these
documents and at the same time distinguish them from
documents in other clusters will be chosen as cluster tags. The
tags assigned to clusters on different levels in the tree will
thus create a hierarchy of topics from general to specialised.</p>
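        <p>A minimal sketch of the second approach, assuming each document is a list of tokens (the function name is ours): each cluster is collapsed into one pseudo-document and tf-idf is recomputed over the pseudo-document collection, so that words shared by all clusters receive zero weight and cannot become cluster tags.</p>
        <preformat>
```python
import math
from collections import Counter

def cluster_level_tags(clusters, w):
    # recalculate tf-idf at the cluster level (tfidf_c): the documents in
    # each cluster are concatenated into a single pseudo-document and
    # tf-idf is computed over the resulting pseudo-document collection
    pseudo = [Counter(word for doc in cl for word in doc) for cl in clusters]
    n = len(pseudo)
    df = Counter()
    for p in pseudo:
        df.update(p.keys())  # document frequency over pseudo-documents
    tags = []
    for p in pseudo:
        total = sum(p.values())
        tfidf_c = {t: (c / total) * math.log(n / df[t]) for t, c in p.items()}
        tags.append(sorted(tfidf_c, key=tfidf_c.get, reverse=True)[:w])
    return tags
```
</preformat>
        <p>A word such as a stop-word that survives into every cluster has zero idf at the cluster level and is therefore never chosen as a cluster tag, which is the distinguishing property described above.</p>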
      </sec>
      <sec id="sec-3-2">
        <title>Initialising K-Means</title>
        <p>
          The standard K-Means algorithm is initialised with
random points for centroids [
          <xref ref-type="bibr" rid="ref1">1</xref>
] . This is the cheapest approach
computationally; however, the choice of the location of the cluster
centroids is not based on any underlying rationale. A better
method is to find two points which are far apart from each
other as centroids.
        </p>
        <p>Our approach is to select the cluster centroids with the
goal of sooner satisfying the end condition of the clustering
process. The end condition in our case is that all documents
assigned to a cluster share at least s of their top w words
with the tags of the cluster. When faced with a cluster
that has to be split, we find the mean of the documents
in the cluster that do satisfy the end condition and set the
centroid of the first cluster to this mean. The candidates
for the centroid of the second cluster are all the documents
which do not satisfy the end condition. From these, the
document that is farthest away from the first centroid by
cosine similarity is set as the second cluster centroid. An
alternative is to select the document for which the tfidf d
scores for words which are tags of the cluster are lowest.</p>
        <p>
          One problem that arises when clustering is that
documents which do not share any attributes with any of the
two cluster centroids will have a cosine similarity of zero to
both centroids and thus cannot be assigned to one of them.
This is overcome by slightly pulling in the cluster centroids
towards the mean of all documents (described as “shrinking”
in the CURE Algorithm [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]) which eliminates values of zero
for attributes.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Agglomerative Part</title>
      <p>
        During the divisive step, documents are partitioned
greedily and research has shown that a second pass (using an
agglomerative clustering approach) across the resulting
clusters (leaf nodes) can help improve the clustering quality by
reversing sub-optimal splits occurring in the divisive step
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Hence, after the divisive step we use the leaf nodes of
the tree and perform agglomerative clustering on them as
described below.
      </p>
<p>1. Start with the clusters produced by the divisive part.
2. Find the two clusters for which the mean vectors are
closest by cosine similarity and merge them.
3. Repeat step 2 until a desired number of clusters r is
reached or until all clusters are merged into one.
4. Find tags for all clusters by calculating tf-idf on the
cluster level (tfidf c) based on the documents in the
clusters.</p>
      <p>The result is a hierarchy of cluster merges, with each cluster
having w tags which form a topic hierarchy from general to
specialised from the root to the leaf nodes.</p>
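      <p>The merge loop above can be sketched as follows, reusing the sparse-vector representation from the divisive part; the naive all-pairs search is for illustration only.</p>
      <preformat>
```python
import math

def cosine(a, b):
    # cosine similarity between two sparse tf-idf vectors (dicts)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(docs):
    m = {}
    for d in docs:
        for k, v in d.items():
            m[k] = m.get(k, 0.0) + v
    return {k: v / len(docs) for k, v in m.items()}

def agglomerate(clusters, r):
    # repeatedly merge the two clusters whose mean vectors are closest
    # by cosine similarity until r clusters remain (steps 2-3);
    # returns the final clusters and the merge history
    clusters = [list(c) for c in clusters]
    merges = []
    while len(clusters) > max(r, 1):
        means = [mean_vector(c) for c in clusters]
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(means[i], means[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        merges.append((i, j))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges
```
</preformat>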
    </sec>
    <sec id="sec-5">
      <title>User Similarity</title>
<p>The similarity between two users can be calculated in
three different ways.</p>
      <p>The document set similarity considers how many
documents are shared between the two users and is calculated
using the Jaccard coefficient
simD(ua, ub) = |Da ∩ Db| / |Da ∪ Db|
where Da and Db are the document sets of the two users.</p>
      <p>The vocabulary similarity measures the overlap in the two
users’ vocabulary sets and is given by
simV(ua, ub) = |Va ∩ Vb| / |Va ∪ Vb|
where Va and Vb are the sets of tags used by the two users.</p>
      <p>The tagging similarity considers whether the two users
assigned the same tags to the same documents and is
calculated as follows.
simT(ua, ub) = (1 / |Da ∪ Db|) Σi |Tdia ∩ Tdib| / |Tdia ∪ Tdib|
where Tdia and Tdib are the tag sets of the two users for
document i. The tagging similarity measure incorporates
document set similarity through the division by |Da ∪ Db|.
This ensures that the tagging similarity reflects not only the
documents for which the two users have common tags but
also takes into account the number of documents for which
the two users do not have common tags.</p>
    </sec>
    <sec id="sec-6">
      <title>GENERATING NEW TAGS</title>
      <p>To generate new tags for a document, the tf-idf score in
the document is calculated for each of its words and a
user-defined number k of the words with highest scores are
proposed as candidate tags. The target document is also
assigned to a cluster within the cluster tree generated by the
clustering algorithm (see section on finding the document
neighbourhood) and the tags associated with the cluster are
also added to the candidate list of tags for the target
document. Note that some of these words may not appear in
the target document and are assigned to the document on
the basis of its similarity with other members of the same
cluster characterised by these words. The score assigned to
a tag is the average of the tf-idf scores on the document
level (tfidf d) and cluster level (tfidf c).</p>
<p>wscore(word) = (tfidf d(word) + tfidf c(word)) / 2</p>
    </sec>
    <sec id="sec-7">
      <title>FINDING PERSONALISED TAGS</title>
      <p>The initial set of potential tags for the new document-user
pair (ua , docnew ) consists of all tags assigned to documents
in the δ-neighbourhood of the new document by any user
(see section on finding document neighbourhood below). For
each of these tags, a weight is calculated which measures
the suitability of the tag based on the similarity scores of
the documents that are related to the tag and the similarity
scores of the users who assigned that tag. The set of users
that we are interested in consists of the users who assigned
tags to one or several documents in the neighbourhood of
the new document.</p>
<p>The t tags with the highest suitability value are finally
recommended to the active user. The formula for calculating
the suitability of each tag is as follows.
suit(ua, tagi) = (1 / |poststagi|) Σi dsim(docnew, doci)^β · usim(ua, ub)^(1−β)
where |poststagi| is the number of posts for tagi, dsim and
usim are the document and user similarity measures
(described in the sections on user similarity and on finding the
document neighbourhood) and β ≤ 1 is the weight given to
document similarity compared to user similarity.</p>
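      <p>Per tag, each post contributes the similarity of its document and of the user who assigned it. As a sketch, assume the posts for a tag have been collected as (dsim, usim) pairs; β = 0.5 is an assumed value.</p>
      <preformat>
```python
def suit(tag_posts, beta=0.5):
    # suitability of a tag for the target (user, document) pair: the
    # average over all posts the tag appears in of the document
    # similarity raised to beta times the user similarity raised
    # to (1 - beta); tag_posts is a list of (dsim, usim) pairs
    if not tag_posts:
        return 0.0
    total = sum(dsim ** beta * us ** (1.0 - beta) for dsim, us in tag_posts)
    return total / len(tag_posts)
```
</preformat>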
<p>If there are not enough tags in the resulting tag set with
a suitability score above a user-defined threshold, new tags
are generated as described in the section above.</p>
<p>The tagging similarity measure is the most valuable since
it takes into consideration both the document and tag sets
and the relationship between them, while the other two
measures focus on only one of these aspects. However, due to
the sparsity of the data, two users often do not have any shared
document-tag pairs, which means the tagging similarity is
zero. Hence, the tagging similarity is combined with the
vocabulary similarity, which results in zero less frequently. A
tagging similarity of zero and a non-zero vocabulary similarity
indicates that the users share some tags but have assigned
them to different documents, thus the tags used by user ua
might be suitable for ub. The combined user similarity score
is given by
usim(ua, ub) = α · simT(ua, ub) + (1 − α) · simV(ua, ub)
where α ≤ 1 specifies the weight given to the tagging
similarity compared to the vocabulary similarity.</p>
    </sec>
    <sec id="sec-8">
      <title>Finding the Document Neighbourhood</title>
      <p>As described previously the document clustering algorithm
results in a cluster hierarchy where each parent node has two
child nodes. To select a set of similar documents for a new
document from the cluster hierarchy, the tree is traversed
from the top down and the new document is assigned to one
of the two clusters at each branch split. Thus a path through
the tree is obtained for the new document in O(log nc) time.
From this path, the cluster with the highest cosine
similarity is assigned to the document as its cluster and the
documents in the cluster added to the set of similar documents
Dsim. For each of the documents in Dsim, the similarity to
the new document dsim(docnew , doci) is calculated by cosine
similarity. If the document similarity is above a threshold
δ, further documents are added to Dsim by travelling up
the tree from the document’s cluster and adding all
documents in the clusters on the path until a cluster containing
a document with dsim(docnew , doci) &lt; δ is encountered.</p>
    </sec>
    <sec id="sec-9">
      <title>EVALUATION</title>
      <p>
        At the time of writing, the implementation of the
suggested system is not fully completed and an extensive
evaluation to determine the best thresholds and parameters still
has to be done. An initial evaluation was carried out on
a dataset from the social bookmarking website Bibsonomy.
The dataset consists of a variety of websites and academic
papers that were bookmarked on Bibsonomy and the tags
that were assigned to these documents by the users. The
post-core at level 5 was used so each user, document and
tag appeared at least five times in the data, resulting in
7538 posts (user, document pairs) containing 244 users, 953
documents and 811 tags. Similarly to [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], leave-one-out cross
validation was used to evaluate the algorithm. Each
document was represented using a vector space model, where the
dimensions were the 1000 words with largest document
frequency in the corpus. Five tags were recommended for each
test post and the performance measure was standard F1.
Since the test dataset contained only users and tags already
known to the system no new tags were generated.
      </p>
      <p>The most significant parameters of our recommender
system are thresholds on document and user similarity. These
thresholds allow us to consider only documents and users
with a similarity score higher than the set value when
calculating the scores for the potential tags. If we set the
threshold for document similarity to 1.0, only tags assigned to the
new document in other posts (by other users) are considered.
This is referred to as the set of popular tags in other systems
such as the current del.icio.us tag recommender. While in
other systems the score for each tag is given simply by the
number of posts it appears in, our system also weights each
post’s importance by considering the user similarity to the
active user, hence personalising the tag recommendation. If
we set the threshold for user similarity to 1.0, only the tags
from the active user’s other posts are considered. This is
referred to as the set of personal tags. For this scenario
each tag’s score is influenced by the similarity of the new
document to the documents in the user’s posts.</p>
      <p>When tested on their own, the approach of popular tags
produced better recommendations (F1 of 0.25) than the
personal tags (F1 of 0.13). However, the best results were
achieved through generating two sets of tags, one from each
approach and combining them to give a final
recommendation set. The two sets were combined by a weighted sum of
the two scores for each tag in order to increase the overall
score of tags which appeared in both sets. The final tag
scores were then normalised. By recommending tags from
the combined set of popular and personal tags, the F1 was
increased to 0.35. The results of the combined approach
when recommending tag sets of different sizes are shown in
Figure 1 below. To additionally evaluate our method of
generating new tags, we set the threshold on the tag score
to 1.0 in another test run so that all recommended tags were
generated from the content of the documents. The resulting
F1 was 0.12 when generating five tags for each post.</p>
      <p>[Figure 1: Precision, recall and F1 of the combined approach for different recommended tag set sizes.]</p>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>In this paper we suggested a novel approach to
personalised tag recommendation. By creating a cluster tree of
documents we generate a topic hierarchy from general to
specialised, and determine the level of specialisation of a
new document. The recommended tags reflect the
generality of the new document. Through identifying a set of users
similar to the active user and measuring their similarity, we
are able to suggest personalised tags. If the required number
of tags cannot be found in the existing tag set, new tags are
generated from the vocabulary of the document corpus.</p>
      <p>
        We plan to perform an extensive evaluation of the
suggested system on different datasets in order to find the best
values for the many parameters such as the weights given
to the different similarity measures and thresholds for tag
suitability. We will also evaluate our method’s ability to
address the new-user/new-document/new-tag instantiation of
the new-item problem well known in the recommender
systems literature. To further improve the tag recommender, we plan
to implement and evaluate different clustering and cluster
representation techniques such as those used by CURE [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>In the future, we plan to investigate different approaches
to tag recommendation, taxonomy generation and
dimensionality reduction. Techniques for feature extraction such
as finding synonym sets and topic models for documents are
very interesting and have the potential to increase the
performance of any tag recommendation or classification system.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Bradley</surname>
          </string-name>
          and
          <string-name>
            <given-names>U. M.</given-names>
            <surname>Fayyad</surname>
          </string-name>
          .
          <article-title>Refining initial points for K-Means clustering</article-title>
          .
          <source>In Proc. 15th International Conf. on Machine Learning</source>
          , pages
          <fpage>91</fpage>
          -
          <lpage>99</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gemmell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schimoler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramezani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          .
          <article-title>Adapting k-nearest neighbor for tag recommendation in folksonomies</article-title>
          .
          <source>In Proc. 7th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rastogi</surname>
          </string-name>
          , and
<string-name>
            <given-names>K.</given-names>
            <surname>Shim</surname>
          </string-name>
          .
          <article-title>CURE: An efficient clustering algorithm for large databases</article-title>
          .
          <source>In Proc. 1998 ACM SIGMOD International Conference on Management of Data</source>
          , pages
          <fpage>73</fpage>
          -
          <lpage>84</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Jaeschke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Marinho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Stumme</surname>
          </string-name>
          .
          <article-title>Tag recommendations in folksonomies</article-title>
          .
          <source>In Proc. 11th European Conference on Principles and Practice of Knowledge Discovery in Databases</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-C.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Giles</surname>
          </string-name>
          .
          <article-title>Real-time automatic tag recommendation</article-title>
          .
          <source>In Proc. 31st international ACM SIGIR conference on Research and development in information retrieval</source>
          , pages
          <fpage>515</fpage>
          -
          <lpage>522</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Steinbach</surname>
          </string-name>
,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          .
          <article-title>A comparison of document clustering techniques</article-title>
          .
          <source>Technical Report 00-034</source>
          , University of Minnesota,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. O.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          .
          <article-title>A comparative study on feature selection in text categorization</article-title>
          .
          <source>In Proc. ICML-97, 14th International Conference on Machine Learning</source>
          , pages
          <fpage>412</fpage>
          -
          <lpage>420</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Livny</surname>
          </string-name>
.
          <article-title>BIRCH: An efficient data clustering method for very large databases</article-title>
          .
          <source>In Proc. ACM SIGMOD '96: Int. Conf. on Management of Data</source>
          , pages
          <fpage>103</fpage>
          -
          <lpage>114</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>