<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Two-Level Learning Hierarchy of Concept Based Keyword Extraction for Tag Recommendations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hendri Murfi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Klaus Obermayer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Neural Information Processing Group, TU Berlin</institution>
          ,
          <addr-line>Franklinstr. 28/29, 10587 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Textual contents associated with resources are considered as sources of candidate tags to improve the performance of tag recommenders in social tagging systems. In this paper, we propose a two-level learning hierarchy of a concept based keyword extraction method to filter the candidate tags and rank them based on their occurrences in concepts existing in the given resources. Incorporating user-created tags to extract the hidden concept-document relationships distinguishes the two-level from the one-level learning version, which extracts concepts directly using terms existing in textual contents. Our experiment shows that a multi-concept approach, which considers more than one concept for each resource, improves the performance of a single-concept approach, which takes into account just the most relevant concept. Moreover, the experiments also prove that the proposed two-level learning hierarchy performs better than the one-level version.</p>
      </abstract>
      <kwd-group>
        <kwd>Recommender system</kwd>
        <kwd>Social tagging</kwd>
        <kwd>Machine learning</kwd>
        <kwd>Keyword extraction</kwd>
        <kwd>Concept extraction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Social tagging is intended to make resources increasingly easy to discover and
recover over time. Discovery enables users to find new content of their interest
shared by other users. This social indexing gives a promising index quality
because it is done by human beings, who understand the content of the resource, as
opposed to software, which algorithmically attempts to determine its meaning.
Moreover, it is done collectively among users; that is, it uses collective human
intelligence as an index extractor. Recovery enables a user to recall content that
was discovered before. It should be easier because the tags are both originated
by, and familiar to, their primary users. However, Golder et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] identify three
major problems with current social tagging systems: polysemy, synonymy,
and level variation. The first two inherit the problems of natural language, while
the third one refers to the phenomenon of users tagging content at different levels
of abstraction. Other problems concern word forms: nouns in singular,
nouns in plural, abbreviations, and misspelled words.
      </p>
      <p>
        To direct users towards consistency of the tags, the system usually has a
service that assists users in the tagging process by automatically recommending
an appropriate set of tags. The service is a mediated suggestion system; that
is, the service does not apply the recommended tags automatically, but rather
suggests a set of appropriate tags and allows the user to select the tags they
find appropriate. Moreover, the tag recommendation can serve many
purposes, such as consolidating the vocabulary across users, giving a second
opinion on what a resource is about and, most importantly, increasing the success
of searching because of the consistency [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        In practice, the standard tag recommenders are services that recommend
the most popular tags used for either a particular resource or a whole system.
There are other methods proposed from a diversity of approaches to recommend
tags from user-created tags (folksonomy) such as information retrieval [
        <xref ref-type="bibr" rid="ref23 ref28">23, 28</xref>
        ],
graph-based approaches [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], collaborative filtering [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], machine learning [
        <xref ref-type="bibr" rid="ref10 ref15">10,
15</xref>
        ]. Recently, people consider textual contents associated to the resources as
sources of candidate tags to improve the performance of tag recommenders. For
example, Xu et al. [31] suggest content-based (and context-based) tags based
on analysis and classification of the tagged content and context. This not only
solves the cold start problem, but also increases the tag quality of those objects
that are less popular. Tatu et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] use natural language processing tools to
extract important terms (nouns, adjectives and named entities) from the textual
contents. They conclude that the understanding of the contents improves the
quality of the tag recommendations.
      </p>
      <p>In this paper, we also consider the textual contents associated with resources
as sources of candidate tags to improve the performance of the tag recommender
in a social tagging system. To achieve this goal, we propose a two-level
learning hierarchy of concept based keyword extraction as a tag recommendation
method. Firstly, the method extracts concepts, which can be considered as
sets of related words, from training document collections using nonnegative matrix
factorization (NMF) in a two-level learning hierarchy: at the lower level,
the method extracts concepts and concept-document relationships using
user-created tags. Having these relationships, the method populates the concepts
with terms existing in the textual contents of resources at the higher level. Next,
the tag recommender finds the concepts relevant to a given resource and then
scales the terms of the resource based on their occurrences in the concepts. The
terms having the highest scores are set as keywords and recommended as tags.
Incorporating the user-created tags to extract the hidden concept-document
relationships distinguishes the two-level from the one-level learning version, which
extracts concepts directly from terms existing in textual contents. The main
advantage of this approach is that the NMF algorithm decomposes more compact
document representations. Also, the concept extraction from textual contents
is handled by a nonnegative least squares algorithm, which is much more efficient
than the NMF algorithm. Therefore, the two-level learning hierarchy approach is
not only more efficient but also more reliable, because it uses tags created by
users who understand the content of the documents. Moreover, the approach may
have richer vocabularies because it can combine vocabularies from the tag space and
the content space. Our experiment shows that a multi-concept approach, which
considers more than one concept for each resource, improves the f-measure values
of a single-concept approach, which takes into account just the most relevant
concept, by about 10%. Moreover, the experiments also prove that the proposed
two-level learning hierarchy has f-measure values 13% better than those of the
one-level version.</p>
      <p>The rest of the paper is organized as follows: Section 2 discusses a concept
extraction method using nonnegative matrix factorization and our proposed
two-level learning hierarchy method. Section 3 describes existing keyword
extraction methods and the proposed concept based keyword extraction
methods. In Section 4, we describe a tag recommendation algorithm which combines
keywords, extracted by the keyword extraction methods, with
user-created tags in the training data. In Section 5, we show our experiments and results.
We conclude and give a summary in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Concept Extraction</title>
      <p>
        Many researchers are trying to address questions about concepts and, in this
section, we consider one view that defines a concept as a set of related
terms. These definitions are proposed and used by researchers such as
[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] or [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. They use clustering methods to extract the concepts from training
document collections. Formal concept analysis (FCA) [
        <xref ref-type="bibr" rid="ref26 ref5">5, 26</xref>
        ] and latent semantic
analysis (LSA) [
        <xref ref-type="bibr" rid="ref3 ref7">3, 7</xref>
        ] are other methods to perform this task.
      </p>
      <sec id="sec-2-1">
        <title>A One-Level Learning Hierarchy for Concept Extraction</title>
        <p>
          There are some disadvantages of singular value decomposition (SVD), as used by
LSA, for extracting concepts from a document collection. Its negative values make
a semantic interpretation difficult. What we would really like to say is that
a concept is mostly concerned with some subset of terms, but any semantic
interpretation is difficult because of these negative values. To circumvent this
problem, a new method which maintains the nonnegative structure of the original
documents has been proposed. The method uses nonnegative matrix factorization
(NMF) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] rather than SVD to extract the concepts from document collections.
        </p>
        <p>
          Let V be an m × n term-by-document matrix whose columns are document
vectors, and let k &lt; min(m, n) be a positive integer. In this paper, we use NMF to
extract concepts from the term-by-document matrix V. The NMF problem is how
to find a nonnegative m × k matrix W and a nonnegative k × n matrix H that
minimize the functional [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]:
min_{W,H} f(W, H) = (1/2) Σ_{i=1}^{m} Σ_{j=1}^{n} (V_{ij} − (WH)_{ij})²   (1)
The constrained optimization problem above is convex in either W or H, but not
in both; hence realistic solutions usually correspond to local minima.
The product WH is called a nonnegative matrix factorization of V, although
V is not necessarily equal to the product WH. Clearly the product WH is a
rank-k approximation of V. An appropriate decision on the value of k is critical
in practice, but the choice of k is very often problem dependent. In most cases,
however, k is chosen such that k &lt;&lt; min(m, n).
        </p>
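<p>As a rough illustration of this factorization (not necessarily the exact implementation used in the experiments), the multiplicative update rules of Lee and Seung can be sketched in a few lines of NumPy; the names V, W, H and the rank k follow the notation above:</p>
<p>
```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate V (m x n, nonnegative) as W (m x k) @ H (k x n)
    with the Lee-Seung multiplicative update rules."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```
</p>
<p>Because the objective is convex in W or H separately but not jointly, different random initializations can converge to different local minima.</p>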
        <p>
          The most popular approach for the NMF problem is the multiplicative
update algorithm proposed by Lee and Seung [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. To either overcome shortcomings
related to convergence properties or to speed up this algorithm, researchers have
proposed modifications of the algorithm or even created new ones [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. In general,
the algorithms can be divided into three general classes: multiplicative update
algorithms [
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ], gradient descent algorithms [
          <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
          ], and alternating least squares
algorithms [
          <xref ref-type="bibr" rid="ref20 ref24">20, 24</xref>
          ].
        </p>
        <p>Because all elements of the matrices W and H are nonnegative, we can interpret
them immediately as follows: each column of W corresponds to a set of related
terms called a concept, and each element w_{ia} of matrix W represents the degree
to which term i belongs to concept a. Each element h_{aj} of matrix H represents
the degree to which document j is associated with concept a. In the following, we call this
type of concept extraction a one-level learning hierarchy method.</p>
        <p>In the case where the training documents are accompanied by user-created keywords,
it is a good idea to incorporate this valuable information in the learning process. For
this reason, we propose a new learning scheme that uses the keywords for
extracting concepts from a document collection. The learning scheme consists of a
two-level learning hierarchy. At the lower level, concepts and concept-document
relationships are discovered using the user-created keywords. Having these
relationships, the concepts are populated by terms existing in the textual contents of
the documents at the higher level. We expect this method to be successful because the
hidden document structures are discovered using keywords collectively created
by users. Another advantage of this approach is that the NMF algorithm uses more
compact document representations. On the other hand, the concept extraction
from textual contents is handled by a nonnegative least squares algorithm, which
is much more efficient than the NMF algorithm. Therefore, this two-level learning
hierarchy approach is not only more efficient but also more reliable, because it
uses tags created by users who understand the content of the documents. Moreover,
the approach may have richer vocabularies because it can combine
vocabularies from the tag space and the content space. The detailed algorithm of this method is
described in Algorithm 1.</p>
        <p>Algorithm 1: A two-level learning hierarchy for concept extraction</p>
        <sec id="sec-2-1-1">
          <title>1: Let V be the tag-by-document matrix, and X be the term-by-document matrix</title>
        </sec>
        <sec id="sec-2-1-2">
          <title>2: Find the tag-by-concept matrix W and the concept-by-document matrix H from V = WH using nonnegative matrix factorization (see Section 2.1) to minimize the functional:</title>
          <p>f(W, H) = (1/2) ||V − WH||²</p>
          <p>
            3: Find the term-by-concept matrix T from X = TH using a nonnegative least squares
algorithm, e.g. [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ]:
– Solve for T in the matrix equation HH^T T^T = HX^T
– Set all negative elements in T to 0
          </p>
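<p>Under the assumption that an NMF routine like the one in Section 2.1 is available (passed in here as a function argument), the two steps of Algorithm 1 can be sketched as:</p>
<p>
```python
import numpy as np

def two_level_concepts(V, X, k, nmf):
    """Two-level learning hierarchy (a sketch of Algorithm 1).

    V   : tag-by-document matrix
    X   : term-by-document matrix
    nmf : any NMF routine returning W (tag-by-concept), H (concept-by-document)
    """
    # Lower level: concepts and concept-document relationships from user tags.
    W, H = nmf(V, k)
    # Higher level: populate the concepts with content terms by solving the
    # normal equations H H^T T^T = H X^T, then zeroing negative entries.
    T = np.linalg.lstsq(H @ H.T, H @ X.T, rcond=None)[0].T
    return W, H, np.maximum(T, 0.0)
```
</p>
<p>Only the small k × k system is solved per term, which is why this higher-level step is much cheaper than running NMF again on the full term-by-document matrix.</p>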
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Concept Based Keyword Extraction</title>
      <p>
        Keyword extraction is the task of automatically selecting a small set of
important, topical terms within the textual content of a document. The fact that the
keywords are extracted means that the selected terms are present in the
document [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. In general, the task of automatically extracting keywords can be
divided into two stages:
1. Selecting candidate terms in the document
2. Filtering out the most significant ones to serve as keywords and rejecting
those that are inappropriate
There are various methods proposed for selecting candidate terms. The first one
is n-gram extraction, that is, extracting uni-, bi-, or tri-grams and removing those
that begin or end with a stop word [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Another one is more linguistically
oriented, using natural language processing (NLP) methods such as an NP-chunker or
part-of-speech (PoS) tagging [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Filtering uses either simple statistics, where a
weighting schema is applied to rank words according to their score [
        <xref ref-type="bibr" rid="ref1 ref25">1, 25</xref>
        ], or machine
learning, where the ranking function is defined by a statistical model derived
from a training set with manually assigned keywords [
        <xref ref-type="bibr" rid="ref13 ref22">13, 22, 30</xref>
        ].
      </p>
      <p>In this section, we propose a machine learning based filtering method, that
is, a method that uses concepts extracted from the textual contents of documents.
The method finds the concepts relevant to a given document and then scales the
terms of the document based on their occurrences in the concepts. The terms
having the highest scores are set as keywords. The method can be considered
unsupervised learning when we use the one-level learning hierarchy, meaning
that the method does not need labeled data for the training process. The
method becomes a supervised method when the user-created tags are used
in the learning process, as in the two-level learning hierarchy approach. Two variants
of the concept based keyword extraction method are described in detail in the
following sections.</p>
      <sec id="sec-3-1">
        <title>Single-Concept Based Keyword Extraction</title>
        <p>The two-level learning hierarchy extracts concepts from a training document
collection. Having these concepts, the single-concept based keyword extraction
method finds the most relevant concept to a given document and then scales
the candidate terms existing in the document based on their occurrence in that
concept. The relevance of a concept c to a document d is calculated using the
following cosine similarity measure:
rel(d, c) = d^T W_c / (||d|| ||W_c||)   (2)
The detailed algorithm of this approach is described in Algorithm 2.</p>
        <p>Algorithm 2: The single-concept based keyword extraction
1: Let column c of W (W_c) be concept c and d be a document
2: Let rel(d, c) = d^T W_c / (||d|| ||W_c||)
3: Find the most relevant concept to the document d, i.e., concept c′ where c′ =
argmax_c(rel(d, c))
4: Scale terms existing in the document based on the most relevant concept, i.e.,
d1_i = w_{ic′} d_i
d2_j = t_{jc′} d_j</p>
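<p>A minimal sketch of Algorithm 2 in NumPy, assuming the document is given as a tag-space vector d_tags (for measuring concept relevance against W) and a term-space vector d_terms (for scaling against T); the weighted combination of the two scaled vectors is omitted here, and only the content-term scores are ranked:</p>
<p>
```python
import numpy as np

def single_concept_keywords(d_tags, d_terms, W, T, n=5):
    """Single-concept based keyword extraction: pick the most relevant
    concept and scale the document's terms by that concept's weights."""
    # Cosine relevance of the document to every concept (column of W).
    rel = (d_tags @ W) / (np.linalg.norm(d_tags) * np.linalg.norm(W, axis=0) + 1e-12)
    c0 = int(np.argmax(rel))          # the single most relevant concept c'
    d2 = T[:, c0] * d_terms           # d2_j = t_{jc'} * d_j
    ranked = np.argsort(-d2)
    return [int(i) for i in ranked if d2[i] > 0][:n]   # first n non-zero terms
```
</p>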
        <sec id="sec-3-1-1">
          <title>5: Combine the normalized terms:</title>
          <p>6: Select the first n non-zero terms of the ranked d̃ as keywords</p>
          <p>The multi-concept based keyword extraction assumes that a document may
contain more than one relevant concept. The detailed algorithm of the multi-concept
based keyword extraction method is described in Algorithm 3.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>A Hybrid Tag Recommender</title>
      <p>In our experiment, we use a hybrid recommender as described in detail in
Algorithm 4. The recommender checks if a given resource exists in the training data.</p>
      <p>Algorithm 3: The multi-concept based keyword extraction
1: Let column c of W (W_c) be concept c and d be a document
2: Let rel(d, c) = d^T W_c / (||d|| ||W_c||)
3: Scale terms existing in the document using the concepts, i.e.,
d1_i = Σ_c rel(d, c) w_{ic} d_i
d2_j = Σ_c rel(d, c) t_{jc} d_j</p>
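<p>The multi-concept variant differs only in that every concept contributes, weighted by its relevance; a sketch under the same assumptions as the single-concept case (tag-space and term-space vectors for the document, relevance measured against W, scaling against T):</p>
<p>
```python
import numpy as np

def multi_concept_keywords(d_tags, d_terms, W, T, n=5):
    """Multi-concept based keyword extraction (a sketch of Algorithm 3):
    d2_j = sum_c rel(d, c) * t_{jc} * d_j."""
    rel = (d_tags @ W) / (np.linalg.norm(d_tags) * np.linalg.norm(W, axis=0) + 1e-12)
    d2 = (T @ rel) * d_terms          # relevance-weighted sum over all concepts
    ranked = np.argsort(-d2)
    return [int(i) for i in ranked if d2[i] > 0][:n]
```
</p>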
      <sec id="sec-4-1">
        <title>4: Combine the normalized terms:</title>
        <p>5: Select the first n non-zero terms of the ranked d̃ as keywords</p>
        <p>If this is the case, then the tag space based recommenders are used. The
collaborative recommender is used if a given user has profiles in the system.
Otherwise, the most popular tags by resource method is used as the tag recommender.
If the resource appears for the first time, then the recommender examines the
content of the resource using the concept based keyword extraction algorithm.
The extracted tags are boosted if they have been used by the user before. If
neither any tags nor any keywords are suggested, then the most popular tags in the
training data are recommended. Using the user-created tags aims to direct the
standardization and consistency of the supplied tags, while using the tags extracted
from textual contents is intended especially to overcome the cold-start problem.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experiment</title>
      <p>We apply our proposed recommender methods (Algorithm 4) to the ECML PKDD
Discovery Challenge 2009¹. The task of the competition requires the development
of a content-based tag recommendation method for BibSonomy², a web-based
social bookmarking system that enables users to tag both web pages (bookmarks)
and scientific publications (bibtexs). The organizers of the competition made
available a training set of examples consisting of the resources accompanied
by their user-created tags. A testing data set will be provided in order to evaluate
the proposed recommenders. Each bookmark is described by its URL, a description
of the URL that usually is the title of the web page, and an extended description
of the bookmark supplied by the user. Each bibtex is associated with values
of bibtex fields such as title, author, booktitle, journal, series, volume, number,
etc. BibtexKey, bibtexAbstract, URL, and a description of the publication can also be
specified. Some statistics of the data are shown in Table 1.</p>
      <p>In our experiment, we use textual contents associated to each resource as
content of the resources. For the bookmark, the contents are the description of
the URL and the extended description. Title and abstract are textual contents
associated with the bibtex. A bookmark is identified by its URL address (url hash)
attribute and a bibtex by its title (simhash1 ) attribute. Therefore, a document,
bookmark or bibtex, is represented by the description given to the document by
all users that bookmarked the document.</p>
      <p>Let D be a testing data set consisting of |D| examples (r_i, T_i), i = 1...|D|.
Let T_i be the set of tags created by users for a resource r_i and P_i be the set
of tags predicted by a recommender for a resource r_i. The precision, recall, and
F-measure of a recommender on the testing data set D are calculated as follows:
Precision = (1/|D|) Σ_{i=1}^{|D|} |T_i ∩ P_i| / |P_i|
Recall = (1/|D|) Σ_{i=1}^{|D|} |T_i ∩ P_i| / |T_i|
F-Measure = (1/|D|) Σ_{i=1}^{|D|} 2 |T_i ∩ P_i| / (|P_i| + |T_i|)</p>
      <p>¹ http://www.kde.cs.uni-kassel.de/ws/dc09
² http://www.bibsonomy.org</p>
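<p>These averaged, per-example metrics can be written directly as:</p>
<p>
```python
def prf(true_sets, pred_sets):
    """Average precision, recall and F-measure over a test set, where
    true_sets[i] = T_i (user tags) and pred_sets[i] = P_i (predicted tags)."""
    n = len(true_sets)
    p = sum(len(t.intersection(q)) / len(q) for t, q in zip(true_sets, pred_sets)) / n
    r = sum(len(t.intersection(q)) / len(t) for t, q in zip(true_sets, pred_sets)) / n
    # Note: the F-measure averages 2|T∩P| / (|P|+|T|) per example, as in the
    # formula above, rather than combining the already-averaged p and r.
    f = sum(2 * len(t.intersection(q)) / (len(q) + len(t))
            for t, q in zip(true_sets, pred_sets)) / n
    return p, r, f
```
</p>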
      <p>We perform our experiment on the Java platform and use Lucene³ for creating the
tag-by-resource matrix and the term-by-resource matrix. The other processes are
conducted with Weka⁴, an open source machine learning framework.</p>
      <p>Concept-based keyword extraction. For creating the term-by-resource matrix,
resources are parsed and a dictionary of terms is created using a standard word
tokenization method. The terms are words; special characters are removed, and
Snowball Porter stemming and standard stop words of English and German are
applied. Finally, the term-by-resource matrix is created using a term frequency
weighting scheme.</p>
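<p>The preprocessing pipeline can be illustrated with a plain term-frequency matrix builder (stemming is left out of this sketch; the experiment additionally applies Snowball/Porter stemming and German stop words):</p>
<p>
```python
import re
import numpy as np

def term_by_resource(texts, stop_words=frozenset()):
    """Build a term-frequency term-by-resource matrix from raw texts."""
    token = re.compile(r"[a-z0-9]+")   # tokenization drops special characters
    docs = [[w for w in token.findall(t.lower()) if w not in stop_words]
            for t in texts]
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(texts)))
    for j, d in enumerate(docs):
        for w in d:
            M[index[w], j] += 1        # term-frequency weighting
    return vocab, M
```
</p>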
      <p>
        Extracting concepts from the term-by-resource matrix is an important step to
find keywords for new resources. The optimal number of concepts (k), which
captures most concepts in the training document collection, remains difficult
to find. The method that is usually used for practical purposes is a
heuristic approach. However, because of the memory usage, simulations are usually
conducted with the maximum number of concepts that can be extracted. In our
experiment, we extract 200 concepts from the training document collection. For
this task, we use the nonnegative double SVD initialization method [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which
conducts no randomization, and the projected gradient method [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], which converges
to a local minimum. We expect this combination of methods to converge to a
unique solution with a minimum error.
      </p>
      <p>There is another parameter that should be optimized in the two-level
learning hierarchy approach. The parameter reflects the portions of the tag space and
the content space as sources of tags for the recommender. In our experiment,
we set the parameter to 0.25 for the single-concept method and to 0.05 for
the multi-concept method, which are the optimal values we obtained using a heuristic
method.</p>
      <p>
        Collaborative recommendation [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. For a given tag-by-user matrix X, a given
user u, a given resource r, and integers k and n, the set T(u, r) of n recommended
tags is calculated by:
      </p>
      <p>T(u, r) = argmax^n_{t ∈ T} Σ_{v ∈ N_u^k} sim(X_u, X_v) δ(v, t, r)   (3)
where N_u^k is the set of k nearest neighbors of u in X, and δ(v, t, r) = 1 if (v, t, r) ∈ folksonomy
and 0 otherwise. Therefore, the only parameter to be tuned is the number of neighbors
k. For that, multiple runs were performed in which k was incremented until a point
where no more improvement in the results was observed.</p>
      <p>³ http://lucene.apache.org/
⁴ http://www.cs.waikato.ac.nz/ml/weka/</p>
      <p>Most popular tags by resource. For a given resource, we count in how many posts
a tag occurs together with that resource. We use the tags that occur most often
together with that resource as the recommendation.</p>
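<p>A sketch of this collaborative recommender, assuming cosine similarity for sim and representing the folksonomy as a set of (user, tag, resource) triples (the variable names here are illustrative, not from the paper):</p>
<p>
```python
import numpy as np

def collaborative_tags(X, posts, u, r, k=5, n=5):
    """kNN collaborative tag recommendation in the spirit of Eq. (3).

    X     : tag-by-user matrix
    posts : set of (user, tag, resource) triples (the folksonomy)
    u, r  : query user and resource
    """
    Xu = X[:, u]
    sims = (X.T @ Xu) / (np.linalg.norm(X, axis=0) * np.linalg.norm(Xu) + 1e-12)
    sims[u] = -np.inf                      # exclude the query user itself
    neighbours = np.argsort(-sims)[:k]     # N_u^k
    scores = np.zeros(X.shape[0])
    for v in neighbours:
        for t in range(X.shape[0]):
            if (v, t, r) in posts:         # delta(v, t, r) = 1
                scores[t] += sims[v]
    ranked = np.argsort(-scores)
    return [int(t) for t in ranked if scores[t] > 0][:n]
```
</p>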
      <p>Most popular tags. For each tag, we count in how many posts it occurs. We
then use the tags that occur most often as the recommendation.</p>
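<p>Both popularity baselines reduce to counting posts per tag, optionally restricted to one resource; a sketch representing posts as (user, tag, resource) triples:</p>
<p>
```python
from collections import Counter

def most_popular_tags(posts, n=5, resource=None):
    """Most popular tags, optionally restricted to a single resource.
    posts is an iterable of (user, tag, resource) triples."""
    counts = Counter(t for _, t, r in posts
                     if resource is None or r == resource)
    return [t for t, _ in counts.most_common(n)]
```
</p>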
      <sec id="sec-5-1">
        <title>Single- vs. Multi-Concept Method</title>
        <p>Fig. 1 shows the performances of the single- and multi-concept based keyword
extraction on the testing data. From Fig. 1, we can calculate that the recall, precision,
and f-measure of the multi-concept approach are, on average, 10%, 15% and
12%. The recall is likely to increase when the number of recommended tags gets
bigger, while the precision is reduced for bigger numbers of tags. Fig. 1 also
shows that the performance of the single-concept approach follows a similar pattern,
and its f-measure is, on average, 11%. From both curves, we conclude that the
multi-concept approach, which assumes that a resource may contain more than
one concept, improves the f-measure of the single-concept method by, on average, 10%.
The improvement occurs for all numbers of recommended tags. These results
verify that associating resources with more than one concept gives better
performance than just considering the main concept of resources. In other words,
some minor concepts of a resource should also be examined to get better
performance of the keyword extraction.</p>
        <p>Fig. 1. Performance comparison of single- and multi-concept approach</p>
      </sec>
      <sec id="sec-5-2">
        <title>One- vs. Two-Level Learning Hierarchy</title>
        <p>In this section, we examine the performance of our proposed two-level
learning hierarchy approach compared to the one-level version. Fig. 2 shows the
performance of the one-level learning hierarchy multi-concept based keyword
extraction and the two-level learning hierarchy multi-concept based keyword
extraction. From the figure, we see that the two-level learning method has better
recall, precision and f-measure. Its f-measure values are, on average, 13% better
than those of the one-level learning approach. The detailed recall, precision, and
f-measure values of the optimal performance of the two-level learning hierarchy
are given in Table 2.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Summary</title>
      <p>In this paper, we propose a two-level learning hierarchy concept based keyword
extraction method for Task 1 of the ECML PKDD Discovery Challenge 2009, that
is, content-based tag recommendation. The tag recommendation method
explores tags from the textual contents of resources using concepts existing in those
contents. A multi-concept approach, which considers more than one concept for
each resource, improves the performance of a single-concept approach, which only
considers the most relevant concept. Moreover, our experiment demonstrates that
the proposed two-level learning hierarchy method outperforms the common
one-level learning approach on all performance measures, i.e., recall, precision,
and f-measure.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The authors would like to thank Nicolas Neubauer for useful discussions,
comments and suggestions in writing this paper. The first author was funded partly
by a scholarship from the DAAD.</p>
      <p>30. P. D. Turney. Learning algorithms for keyphrase extraction. Information Retrieval,
2(4):303–336, 2000.
31. Z. Xu, Y. Fu, J. Mao, and D. Su. Towards the semantic web: Collaborative tag
suggestions. In Proceedings of the Collaborative Web Tagging Workshop, 2006.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>K.</given-names>
            <surname>Barker</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Cornacchia</surname>
          </string-name>
          .
          <article-title>Using noun phrase heads to extract document keyphrases</article-title>
          .
          <source>Proceedings of the 13th Canadian Conference on Artificial Intelligence</source>
          , pages
          <fpage>417</fpage>
          –
          <lpage>426</lpage>
          , Montreal, Canada,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Berry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Langville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pauca</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Plemmons</surname>
          </string-name>
          .
          <article-title>Algorithms and applications for approximate nonnegative matrix factorization</article-title>
          .
          <source>Computational Statistics and Data Analysis</source>
          ,
          <volume>15</volume>
          (
          <issue>1</issue>
          ):
          <fpage>155</fpage>
          –
          <lpage>173</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>W. M.</given-names>
            <surname>Berry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Dumais</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>W. O'Brien</surname>
          </string-name>
          .
          <article-title>Using linear algebra for intelligent information retrieval</article-title>
          .
          <source>SIAM Review</source>
          ,
          <volume>37</volume>
          :
          <fpage>573</fpage>
          –
          <lpage>595</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>C.</given-names>
            <surname>Boutsidis</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Gallopoulos</surname>
          </string-name>
          .
          <article-title>Svd-based initialization: A head start on nonnegative matrix factorization</article-title>
          .
          <source>Pattern Recognition</source>
          ,
          <volume>41</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1350</fpage>
          –
          <lpage>1362</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>C.</given-names>
            <surname>Carpineto</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Romano</surname>
          </string-name>
          .
          <article-title>Using concept lattices for text retrieval and mining</article-title>
          . In B. G. et al., editor,
          <source>Formal Concept Analysis</source>
          , pages
          <fpage>161</fpage>
          –
          <lpage>179</lpage>
          . Springer-Verlag,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Diele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Plemmons</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Ragni</surname>
          </string-name>
          .
          <article-title>Optimality, computation, and interpretations of nonnegative matrix factorization</article-title>
          .
          <source>Technical report</source>
          , Department of Mathematics, North Carolina State University,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Deerwester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Dumais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Landauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Furnas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Harshman</surname>
          </string-name>
          .
          <article-title>Indexing by latent semantic analysis</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          ,
          <volume>41</volume>
          (
          <issue>6</issue>
          ):
          <fpage>391</fpage>
          –
          <lpage>407</lpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Paynter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutwin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Nevill-Manning</surname>
          </string-name>
          .
          <article-title>Domain-specific keyphrase extraction</article-title>
          .
          <source>In Proceedings of the International Joint Conference on Artificial Intelligence</source>
          , pages
          <fpage>668</fpage>
          –
          <lpage>673</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>S.</given-names>
            <surname>Golder</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Huberman</surname>
          </string-name>
          .
          <article-title>Usage patterns of collaborative tagging systems</article-title>
          .
          <source>Journal of Information Science</source>
          ,
          <volume>32</volume>
          (
          <issue>2</issue>
          ):
          <fpage>198</fpage>
          –
          <lpage>208</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>P.</given-names>
            <surname>Heymann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          .
          <article-title>Social tag prediction</article-title>
          .
          <source>In Proceedings of SIGIR</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jäschke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schmitz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Stumme</surname>
          </string-name>
          .
          <article-title>Information retrieval in folksonomies: Search and ranking</article-title>
          .
          <source>In Proceedings of the 3rd European Semantic Web Conference</source>
          , pages
          <fpage>411</fpage>
          –
          <lpage>426</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>P.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          .
          <article-title>Non-negative matrix factorization with sparseness constraints</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>5</volume>
          :
          <fpage>1457</fpage>
          –
          <lpage>1469</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>A.</given-names>
            <surname>Hulth</surname>
          </string-name>
          .
          <article-title>Combining Machine Learning and Natural Language Processing for Automatic Keyword Extraction</article-title>
          .
          <source>PhD thesis</source>
          , Stockholm University,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>R.</given-names>
            <surname>Jäschke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Marinho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Stumme</surname>
          </string-name>
          .
          <article-title>Tag recommendations in folksonomies</article-title>
          . In J. N. Kok, J. Koronacki, R. L. de Mántaras, S. Matwin, D. Mladenic, and A. Skowron, editors,
          <source>The 11th European Conference on Principles and Practice of Knowledge Discovery in Databases</source>
          , pages
          <fpage>506</fpage>
          –
          <lpage>514</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>I.</given-names>
            <surname>Katakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tsoumakas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Vlahavas</surname>
          </string-name>
          .
          <article-title>Multilabel text classification for automated tag suggestion</article-title>
          .
          <source>In Proceedings of the ECML/PKDD Discovery Challenge</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>F. W.</given-names>
            <surname>Lancaster</surname>
          </string-name>
          .
          <article-title>Indexing and Abstracting in Theory and Practice</article-title>
          . Library Association Publishing,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Seung</surname>
          </string-name>
          .
          <article-title>Learning the parts of objects by nonnegative matrix factorization</article-title>
          .
          <source>Nature</source>
          ,
          <volume>401</volume>
          :
          <fpage>788</fpage>
          –
          <lpage>791</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Seung</surname>
          </string-name>
          .
          <article-title>Algorithms for non-negative matrix factorization</article-title>
          .
          <source>In Advances in Neural Information Processing System</source>
          , pages
          <fpage>556</fpage>
          –
          <lpage>562</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <article-title>On the convergence of multiplicative update algorithms for non-negative matrix factorization</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          ,
          <volume>18</volume>
          :
          <fpage>1589</fpage>
          –
          <lpage>1596</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <article-title>Projected gradient methods for non-negative matrix factorization</article-title>
          .
          <source>Neural Computation</source>
          ,
          <volume>19</volume>
          :
          <fpage>2756</fpage>
          –
          <lpage>2779</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>D.</given-names>
            <surname>Lin</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Pantel</surname>
          </string-name>
          .
          <article-title>Concept discovery from text</article-title>
          .
          <source>In Proceedings of the International Conference on Computational Linguistics</source>
          , pages
          <fpage>577</fpage>
          –
          <lpage>583</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>O.</given-names>
            <surname>Medelyan</surname>
          </string-name>
          and
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          .
          <article-title>Domain-independent automatic keyphrase indexing with small training sets</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology</source>
          ,
          <volume>59</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1026</fpage>
          –
          <lpage>1040</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>G.</given-names>
            <surname>Mishne</surname>
          </string-name>
          .
          <article-title>AutoTag: A collaborative approach to automated tag assignment for weblog posts</article-title>
          .
          <source>In Proceedings of the 15th International Conference on World Wide Web</source>
          , pages
          <fpage>953</fpage>
          –
          <lpage>954</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>P.</given-names>
            <surname>Paatero</surname>
          </string-name>
          and
          <string-name>
            <given-names>U.</given-names>
            <surname>Tapper</surname>
          </string-name>
          .
          <article-title>Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values</article-title>
          .
          <source>Environmetrics</source>
          ,
          <volume>5</volume>
          :
          <fpage>111</fpage>
          –
          <lpage>126</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>C.</given-names>
            <surname>Paice</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Black</surname>
          </string-name>
          .
          <article-title>A three-pronged approach to the extraction of key terms and semantic roles</article-title>
          .
          <source>In Proceedings of the International Conference on Recent Advances in NLP</source>
          , pages
          <fpage>357</fpage>
          –
          <lpage>363</lpage>
          , Borovets, Bulgaria,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>U.</given-names>
            <surname>Priss</surname>
          </string-name>
          .
          <article-title>Formal concept analysis in information science</article-title>
          .
          <source>Annual Review of Information Science and Technology</source>
          ,
          <volume>40</volume>
          :
          <fpage>521</fpage>
          –
          <lpage>543</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Reinberger</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Spyns</surname>
          </string-name>
          .
          <article-title>Unsupervised text learning for the learning of DOGMA-inspired ontologies</article-title>
          . In P. Buitelaar,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          , and B. Magnini, editors,
          <source>Ontology Learning from Text: Method, Application and Evaluation</source>
          , pages
          <fpage>29</fpage>
          –
          <lpage>43</lpage>
          . IOS Press,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Sood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Owsley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Hammond</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Birnbaum</surname>
          </string-name>
          .
          <article-title>TagAssist: Automatic tag suggestion for blog posts</article-title>
          .
          <source>In Proceedings of the International Conference on Weblogs and Social Media (ICWSM)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>M.</given-names>
            <surname>Tatu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Srikanth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>D'Silva</surname>
          </string-name>
          .
          <article-title>RSDC'08: Tag recommendation using bookmark content</article-title>
          .
          <source>In Proceedings of the ECML/PKDD Discovery Challenge</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>