<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards User Profile-based Interfaces for Exploration of Large Collections of Items</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Claudia Becerra</string-name>
          <email>cjbecerrac@unal.edu.co</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Jimenez</string-name>
          <email>sgjimenezv@unal.edu.co</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Gelbukh</string-name>
          <email>gelbukh@cic.ipn.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Instituto Politécnico Nacional, Centro de Investigación en Computación</institution>
          ,
          <addr-line>Mexico, D.F., http://nlp.cic.ipn.mx/</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidad Nacional de Colombia</institution>
          ,
          <addr-line>Bogotá - Colombia, www.unal.edu.co</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universidad Nacional de Colombia</institution>
          ,
          <addr-line>Bogotá - Colombia, www.unal.edu.co</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <fpage>9</fpage>
      <lpage>16</lpage>
      <abstract>
        <p>Collaborative tagging systems allow users to describe and organize items using labels from a free, shared vocabulary (tags), improving their browsing experience in large collections of items. At present, the most accurate collaborative filtering techniques build user profiles in latent factor spaces that are not interpretable by users. In this paper, we propose a general method to build linear, interpretable user profiles that can be used for user interaction in a recommender system, using the well-known simple additive weighting (SAW) model for multi-attribute decision making. In experiments, two kinds of user profiles were tested: one built from freely contributed tags and another from keywords automatically extracted from textual item descriptions. We compare them for their ability to predict ratings and their potential for user interaction. As a test bed, we used a subset of the database of MovieLens, the University of Minnesota's movie review system; the social tags proposed by Vig et al. (2012) in their work “The Tag Genome”; and movie synopses extracted from the Netflix API. We found that, in “warm” scenarios, the proposed tag-based and keyword-based user profiles produce recommendations equal to or better than those based on latent factors obtained using matrix factorization. In particular, the keyword-based approach obtained an improvement of 5.63%. In cold-start conditions (movies without rating information), both approaches perform close to the average-based baseline. Moreover, a user profile visualization is proposed, revealing an accuracy vs. interpretability tradeoff between tag-based and keyword-based profiles. While keyword-based profiles produce more accurate recommendations, tag-based profiles seem to be more readable, meaningful and convenient for creating profile-based user interfaces.</p>
      </abstract>
      <kwd-group>
        <kwd>Recommender systems</kwd>
        <kwd>collaborative filtering</kwd>
        <kwd>collaborative tagging systems</kwd>
        <kwd>social tagging</kwd>
        <kwd>user interfaces</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>One approach to improving the exploration of large collections of
items such as books (librarything.com), films (netflix.com),
pictures (flickr.com), research papers (citeulike.com) and web
bookmarks (del.icio.us) is to leverage collaborative
information from the users. This approach allows the knowledge
that certain individuals have about certain items in the collection to propagate
to other users. In this way, a self-generated collaborative
intelligence guides users in their exploration through recommendations
tailored to their preferences and away from their dislikes.</p>
      <p>Currently, collaborative filtering approaches derive user profiles
and produce recommendations based primarily on user feedback,
whether explicit (e.g. ratings, “likes”, tagging, reviews) or implicit
(e.g. web logs). As time goes by, user profiles grow while
user preferences evolve. Generally, users are allowed to update
the information they gave explicitly, with the aim of adjusting their
profiles to get better recommendations. However, when a profile
depends, for instance, on a large number of ratings, updating it
becomes a difficult and even overwhelming task: users must make a
significant number of targeted edits to obtain the desired effect. The
situation worsens in systems based on implicit feedback, where
user profiles are neither interpretable nor accessible by users.</p>
      <p>Most state-of-the-art methods for collaborative filtering
build user profiles projected in latent factor spaces. These latent
factors considerably reduce the dimensionality of the user profiles,
providing more accurate recommendations at the expense of
interpretability. Unfortunately, users cannot modify
these low-dimensional and highly informative profiles. A first
step to tackle this issue could be the design of interfaces based on
interpretable user profiles. For instance, Lops et al. [16] proposed a
system where the user profiles are defined in a space indexed by
keywords automatically extracted from textual item descriptions
(keyword-based user profiles). However, in many cases the
number of extracted keywords is similar to or even larger than the
number of items in the collection, which makes it difficult for
users to interact with their profiles.</p>
      <p>
        Alternatively, user profiles can also be built using tags
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] (tag-based user profiles). These tags come from collaborative tagging
systems [29], which allow users of large collections to label
items using a shared free vocabulary. As a result of this social
indexing process [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the system gradually collects a social
index, which enables users to classify, visualize and query items
in a way that is both personalized and social. Unfortunately, social
indexes suffer from misspellings, typographical errors and extremely
particular tags, which make them a noisy resource for the
construction of meaningful user profiles. Sen et al. (2009) [23]
proposed an entropy-based measure and a cleaning procedure for
detecting a community-valuable tag set in a noisy social index.
They obtained a clean set of 1,128 tags from nearly 30,000
different tags collected by the MovieLens<sup>1</sup> system during the year
2009. Clearly, this tag set has a more convenient size for
designing user interfaces for customizing user profiles based on
social tags.
      </p>
      <p>
        In this paper, we propose a matrix-based method for building
linear user profiles based either on social tags or on automatically
extracted keywords. From the users’ point of view, these profiles
behave as a linear simple additive weighting (SAW) model [28],
which is one of the most comprehensive methods for multi-attribute
decision making [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The proposed method discovers the
prior weights, that is, the users’ affinity coefficients with tags or
keywords, that minimize the rating prediction error. The
resulting profiles (SAW user profiles) can be used either to
invite users to interact with their own profiles or to explain the
recommendations given by the system.
      </p>
      <p>
        To evaluate the performance of the SAW user profiles, they were
compared against user profiles based on latent factors obtained
using matrix factorization techniques [15], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This comparison
was made on the rating prediction task for the movie domain. In
comparison with strong baselines, the proposed methods outperformed
them in the cross-validation setting and reached similar results in
the cold-start setting. That
is the main contribution of this work: a collaborative method to
obtain simple additive weighting user profiles without
compromising rating prediction accuracy.
      </p>
      <p>In addition, a visualization of user profiles is provided with the
aim of analyzing the potential of SAW user profiles for the
construction of user interfaces for recommender systems. In that
visualization, the profile of a single user is shown as a list of tags,
or keywords, ranked by preference. We argue that letting the
user interact with the top and the bottom of that
list would provide a mechanism for updating the user's profile with
little effort. Simultaneously, the profiles of the nearest users are
also shown as a collaborative resource for suggesting updates.</p>
    </sec>
    <sec id="sec-3">
      <title>2. RELATED WORK</title>
      <p>
        Several works have let users interact directly with
keyword-based or tag-based user profiles. For
example, the work of Pazzani and Billsus (1997) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is the earliest
system that lets users interact directly with their keyword-based
user profiles. In that work, users directly assess the conditional
probability of liking or disliking a resource given that a particular
word is found in the resource’s textual description. These
user-provided conditional probabilities are used as priors to train a
Naïve Bayes classifier that, using users’ ratings, estimates the
probability of liking or disliking the resource using keywords as
resource features. They found that these prior profiles increase the
accuracy of the recommendations obtained by the Naïve Bayes
classifier, mainly in cold-start scenarios [21], when users have not
yet given enough ratings.
      </p>
      <p>
        Another example is the work of Diederich &amp; Iofciu (2006) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In
their work, users interact directly with manually built tag-based
user profiles as a way to query the system for
recommendations. They used the digital library DBLP<sup>2</sup>, where
items (research papers) are labeled with tags manually specified
by the authors. In a first stage, the system prepares a tag-based
author profile by aggregating the tags associated with the works of the
author (see Table 1). Then, users can get recommendations of
similar authors by using a query profile in which they change the
coefficients assigned to the tags. With this query profile, the
system recommends authors similar to the one queried using
collaborative filtering approaches [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The main limitation of the above-mentioned approaches is that
only first-order relations between user and resource are considered
when building these profiles. Consequently, these approaches are
incapable of finding new tags or keywords relevant to the profile.
Other approaches integrate collaborative tagging information, and
keywords found in textual descriptions of resources, in
algorithms that outperform classic collaborative filtering
approaches, but they sacrifice interpretability for accuracy [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9,
16</xref>
        ]. Therefore, in this work we propose a collaborative method to
generate linear user profiles in interpretable spaces that can be
inspected and eventually modified by users, without sacrificing
accuracy.
      </p>
      <p>Table 1. Tag-based author profile (tag, coefficient): NER, 1; Semantic web, 2.</p>
    </sec>
    <sec id="sec-4">
      <title>3. METHODS</title>
    </sec>
    <sec id="sec-5">
      <title>3.1 Matrix Factorization Overview</title>
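      <p>
        Probably the most popular and accurate method used for product
recommendation is matrix factorization [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [15]. In this model, the
rating $\hat{r}_{um}$ that a user $u$ would give to an item $m$ is
estimated as an affinity measure between the user and the item,
both characterized in a latent factor space with a pre-established
dimensionality $f$. Formally:
        <disp-formula>$$\hat{r}_{um} = \vec{p}_{u \to \mathcal{R}^f} \cdot \vec{q}_{m \to \mathcal{R}^f}$$</disp-formula>
where $\vec{p}_{u \to \mathcal{R}^f}$ and $\vec{q}_{m \to \mathcal{R}^f}$ denote the characterization of user $u$
and item $m$ in the latent factor space $\mathcal{R}^f$, respectively. Here, the
affinity measure used is the dot product. If the components that
characterize the user in the latent space $\mathcal{R}^f$ are denoted by
$\vec{p}_{u \to \mathcal{R}^f} = (p_{u1}, p_{u2}, \dots, p_{uf})$, and the item vector components are
denoted as $\vec{q}_{m \to \mathcal{R}^f} = (q_{m1}, q_{m2}, \dots, q_{mf})$, then the dot product
can be rewritten as:
        <disp-formula>$$\hat{r}_{um} = \sum_{k=1}^{f} (p_{uk} \cdot q_{mk})$$</disp-formula>
where the user and item vectors are found by
minimizing the prediction error $e_{um}$, which is calculated using the
following expression:
        <disp-formula>$$e_{um} = \left( r_{um} - \sum_{k=1}^{f} (p_{uk} \cdot q_{mk}) \right)^2$$</disp-formula>
      </p>
      <p>To avoid overfitting, it is common to introduce a regularization
coefficient $\lambda$ that penalizes the norm of the user and item vectors.
Thus, the regularized prediction error $e'_{um}$ is defined as:
        <disp-formula>$$e'_{um} = e_{um} + \lambda \left( \lVert \vec{p}_{u \to \mathcal{R}^f} \rVert^2 + \lVert \vec{q}_{m \to \mathcal{R}^f} \rVert^2 \right)$$</disp-formula>
Finally, user and item vectors are found by minimizing the
regularized prediction error over the set of known ratings:
        <disp-formula>$$\min_{\vec{p}_u, \vec{q}_m} \sum_{r_{um} \in R \,\wedge\, r_{um} \neq 0} \left[ \left( r_{um} - \sum_{k=1}^{f} (p_{uk} \cdot q_{mk}) \right)^2 + \lambda \left( \lVert \vec{p}_{u \to \mathcal{R}^f} \rVert^2 + \lVert \vec{q}_{m \to \mathcal{R}^f} \rVert^2 \right) \right]$$</disp-formula>
      </p>
      <p>In this expression, we organize the known ratings in the matrix
$R_{U \times M}$, of size $|U| \times |M|$, where $|U|$ is the number of users and $|M|$ is
the number of items. In this matrix, unknown ratings are
set to 0, and known ratings are in the interval $[1, 5]$.</p>
      <p><sup>1</sup> http://www.movielens.org</p>
      <p><sup>2</sup> http://www.informatik.uni-trier.de/~ley/db</p>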
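      <p>The regularized objective above can be illustrated with a minimal sketch in plain Python. The toy matrices and the function names are assumptions made for illustration only; the sketch evaluates the objective for given vectors and does not perform the minimization itself.</p>
      <preformat>
```python
def predict(p_u, q_m):
    """Dot product of a user vector and an item vector (the estimated rating)."""
    return sum(p * q for p, q in zip(p_u, q_m))


def regularized_error(r_um, p_u, q_m, lam):
    """Squared prediction error plus lam times the squared norms of both vectors."""
    squared_error = (r_um - predict(p_u, q_m)) ** 2
    penalty = sum(p * p for p in p_u) + sum(q * q for q in q_m)
    return squared_error + lam * penalty


def objective(R, P, Q, lam):
    """Sum the regularized error over known ratings only (0 marks an unknown rating)."""
    total = 0.0
    for u, row in enumerate(R):
        for m, r_um in enumerate(row):
            if r_um != 0:
                total += regularized_error(r_um, P[u], Q[m], lam)
    return total


# Toy data (hypothetical): 2 users, 2 items, f = 2 latent factors.
R = [[5, 0],
     [0, 3]]
P = [[1.0, 2.0], [0.5, 1.0]]
Q = [[1.0, 1.5], [2.0, 1.0]]
print(objective(R, P, Q, lam=0.07))
```
      </preformat>
      <p>An optimizer would search for the entries of P and Q that make this value as small as possible; entries of R equal to 0 never contribute to the sum.</p>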
    </sec>
    <sec id="sec-6">
      <title>3.2 Proposed Models</title>
      <sec id="sec-6-1">
        <title>3.2.1 A Generic User Profiling Model</title>
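        <p>
          In spite of the fact that it could be considered incorrect<sup>3</sup>, we will
use the canonical form of matrix factorization to express the
matrix of estimated ratings $\hat{R}_{U \times M}$ as an affinity measure between
the user profile matrix $P_{U \times f}$ and the item profile matrix $Q_{M \times f}$,
both characterized in the same latent factor space. Thus:
          <disp-formula>$$\hat{R}_{U \times M} = P_{U \times f} \cdot (Q_{M \times f})^T$$</disp-formula>
Now, we can generalize this affinity measure to any space of
dimension $|X|$, denoted by $\mathcal{R}^X$, using the expression:
          <disp-formula>$$\hat{R}_{U \times M} = P_{U \times X} \cdot (Q_{M \times X})^T$$</disp-formula>
where $P_{U \times X}$ is the $\mathcal{R}^X$-based user profile matrix and $Q_{M \times X}$ is the
$\mathcal{R}^X$-based item profile matrix. The matrix of user profiles in the
space $\mathcal{R}^X$, $P_{U \times X}$, of size $|U| \times |X|$, can also be denoted as:
          <disp-formula>$$P_{U \times X} = \begin{bmatrix} p_{11} &amp; \cdots &amp; p_{1|X|} \\ \vdots &amp; \ddots &amp; \vdots \\ p_{|U|1} &amp; \cdots &amp; p_{|U||X|} \end{bmatrix} = \begin{bmatrix} \vec{p}_{1 \to \mathcal{R}^X} \\ \vdots \\ \vec{p}_{|U| \to \mathcal{R}^X} \end{bmatrix}$$</disp-formula>
where $p_{uk}$ represents the affinity coefficient between the user $u$
and the $k^{th}$ dimension in the space $\mathcal{R}^X$, for values of $u$ in
$\{1, \dots, |U|\}$ and values of $k$ in $\{1, \dots, |X|\}$. In that notation, the vector
$\vec{p}_{u \to \mathcal{R}^X}$ is the $X$-based user profile of user $u$ in the space $\mathcal{R}^X$.
        </p>
        <p>
          <sup>3</sup> It is important to keep in mind that, in order to calculate the
approximation of the $P_{U \times f}$ and $Q_{M \times f}$ matrices, ratings $r_{um} = 0$ must
be ignored in the expression to minimize. This is why, in the
recommendation study area, instead of using already implemented
matrix decomposition methods, it is preferable to use optimization
methods such as LBFGSB [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In these methods, the unknown
ratings are expressly filtered from the training matrix $R_{U \times M}$.
Henceforth, the matrix notation will be used given the conceptual
simplicity that it provides for the further discussion. However, all
matrix factorizations will ignore unknown ratings.
        </p>
        <p>Similarly, the $\mathcal{R}^X$-based item profile matrix $Q_{M \times X}$ can be denoted
as:
          <disp-formula>$$Q_{M \times X} = \begin{bmatrix} q_{11} &amp; \cdots &amp; q_{1|X|} \\ \vdots &amp; \ddots &amp; \vdots \\ q_{|M|1} &amp; \cdots &amp; q_{|M||X|} \end{bmatrix} = \begin{bmatrix} \vec{q}_{1 \to \mathcal{R}^X} \\ \vdots \\ \vec{q}_{|M| \to \mathcal{R}^X} \end{bmatrix}$$</disp-formula>
where $q_{mk}$ denotes the relevance coefficient of the item $m$ to the
$k^{th}$ dimension in the space $\mathcal{R}^X$, for values of $m$ in $\{1, \dots, |M|\}$ and $k$
in $\{1, \dots, |X|\}$. The vector $\vec{q}_{m \to \mathcal{R}^X}$ represents the profile of the item $m$ in the
space $\mathcal{R}^X$.</p>
        <p>Now, if we choose an interpretable space $\mathcal{R}^X$ in which the item
profile matrix $Q_{M \times X}$ can be directly calculated, then all the user
profiles in $P_{U \times X}$ can be obtained by the following expression:
          <disp-formula>$$P_{U \times X} = R_{U \times M} \cdot ((Q_{M \times X})^T)^+$$</disp-formula>
where $((Q_{M \times X})^T)^+$ denotes the pseudo-inverse [18] of the
transposed item profile matrix characterized in $\mathcal{R}^X$, and $R_{U \times M}$ is
the matrix of known ratings.</p>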
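        <p>As a rough illustration of this expression, the sketch below computes user profiles in plain Python for a toy case. It assumes that the columns of the item profile matrix are linearly independent, in which case the pseudo-inverse of the transposed matrix reduces to Q (Q^T Q)^{-1} and each user profile is the least-squares solution for that user's ratings. The helper names and the data are hypothetical; a numerical library with a general pseudo-inverse routine would be the usual choice in practice.</p>
        <preformat>
```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, B):
    """Solve A X = B for a square, non-singular A by Gauss-Jordan elimination."""
    n = len(A)
    aug = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def user_profiles(R, Q):
    """P = R Q (Q^T Q)^{-1}, valid when the columns of Q are independent."""
    gram = matmul(transpose(Q), Q)                 # |X| x |X|, symmetric
    # Solve gram * Z = (R Q)^T; then Z^T = R Q gram^{-1} = P.
    Z = solve(gram, transpose(matmul(R, Q)))
    return transpose(Z)


# Toy data (hypothetical): 3 items described in 2 interpretable dimensions.
Q = [[1, 0], [0, 1], [1, 1]]
R = [[2, -1, 1], [0.5, 3, 3.5]]                    # ratings of 2 users
print(user_profiles(R, Q))
```
        </preformat>
        <p>In this toy case the ratings were generated from the profiles (2, -1) and (0.5, 3), so the computation recovers them up to rounding.</p>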
      </sec>
      <sec id="sec-6-2">
        <title>3.2.2 SAW User Profiles</title>
        <p>
          Once the user profiles are obtained, the estimated ratings $\hat{r}_{um}$ can
be calculated with the expression:
          <disp-formula>$$\hat{r}_{um} = \sum_{k \in \{1, \dots, |X|\}} p_{uk} \cdot q_{mk}$$</disp-formula>
Therefore, from the point of view of decision making, the model has the
well-known canonical form of the simple additive weighting
method (SAW) for multi-attribute decision making [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. In this
model, a linear discriminative function is used to appraise each
resource, assigning a value (weight) to each alternative.
Alternatives with higher values are preferred over alternatives
with lower values. Studies in the area [30], [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], [27] have shown
that the intuitiveness of the SAW method makes it
preferable, for direct user interaction, to other less interpretable
non-linear methods.
        </p>
        <p>Thus, our proposed model behaves as a SAW model for decision
making where: i) the appraisal of the resource is the rating of the
resource, $\hat{r}_{um}$; ii) ratings are expressed as a weighted linear
combination of the resource features in the interpretable space
$\mathcal{R}^X$; and iii) the weights, or affinity coefficients $p_{uk}$, are discovered
by the proposed model.</p>
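        <p>A small sketch may help to see why this form is convenient for explanations: since the estimate is a plain weighted sum, each tag or keyword contributes a separate, inspectable term. The tag names, coefficients and function names below are hypothetical.</p>
        <preformat>
```python
def saw_rating(user_profile, item_profile):
    """Weighted linear combination over the interpretable dimensions (tags or keywords)."""
    return sum(user_profile.get(dim, 0.0) * relevance
               for dim, relevance in item_profile.items())


def explain(user_profile, item_profile, top=3):
    """Return the dimensions with the largest absolute contribution to the estimate."""
    contributions = {dim: user_profile.get(dim, 0.0) * relevance
                     for dim, relevance in item_profile.items()}
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top]


# Hypothetical affinity coefficients of one user and tag relevances of one movie.
user = {"sci-fi": 0.8, "romance": -0.5, "space": 0.3}
movie = {"sci-fi": 1.0, "space": 1.0, "romance": 0.2}
print(saw_rating(user, movie))
print(explain(user, movie, top=2))
```
        </preformat>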
        <p>In the following subsections 3.2.3 and 3.2.4, we explain how
this generic model can be applied in two different interpretable
spaces, namely keywords and tags. We also show
how the proposed user profiles $P$ can be used in combination with
the matrix factorization model to obtain rating predictions (see
subsection 3.2.5). To clarify the notation used in the following
sections, we replace $X$ with the specific dimensionality
of the space on which the discussion focuses. Thus, $\mathcal{R}^W$ will
be used instead of $\mathcal{R}^X$ to denote the space of keywords.
Similarly, in subsection 3.2.4, the space defined by the tags will
be denoted by $\mathcal{R}^T$.</p>
      </sec>
      <sec id="sec-6-3">
        <title>3.2.3 Keyword-based User Profiles</title>
        <p>As mentioned before, the proposed model, which automates the
construction of user profiles, relies in turn on the
construction of the item profiles. Therefore, the matrix $P_{U \times W}$
(keyword-based user profiles) is calculated from the matrices
$Q_{M \times W}$ (keyword-based item profiles) and $R_{U \times M}$ (known ratings)
using the following expression:
          <disp-formula>$$P_{U \times W} = R_{U \times M} \cdot ((Q_{M \times W})^T)^+$$</disp-formula>
        </p>
        <p>Most of the content-based approaches that build keyword-based
item profiles [16] use the vector space model [20] for representing
the textual descriptions of the items as vectors $\vec{q}_{m \to \mathcal{R}^W}$.
Components of this vector, denoted by $q_{mw}$, are values that
quantify the relevance of the word $w$ to the item $m$. Thus, a value
close to 0 indicates that the word is not relevant to the item.
Negative values can also be used if polarized relevance scores are
available.</p>
        <p>These relevance scores can be inferred from the occurrences of
the words in the collection of textual descriptions of the items.
The common practice to obtain relevance scores is to use the
popular tf-idf term weighting scheme [14] or weights derived
from the Okapi BM25 retrieval formula [19]. These techniques
prevent common words from getting high relevance scores and
promote less frequent words that occur systematically in particular
textual descriptions.</p>
      </sec>
      <sec id="sec-6-4">
        <title>3.2.4 Tag-based User Profiles</title>
        <p>Analogously to the keyword-based profiles, the matrix $P_{U \times T}$ with
the tag-based user profiles is calculated in the same way:
          <disp-formula>$$P_{U \times T} = R_{U \times M} \cdot ((Q_{M \times T})^T)^+$$</disp-formula>
where $Q_{M \times T}$ is the matrix with the tag-based item profile vectors
$\vec{q}_{m \to \mathcal{R}^T}$, in which the individual entries $q_{mt}$ indicate the
relevance of the tag $t$ to the item $m$.</p>
        <p>The tag-based item profiles $\vec{q}_{m \to \mathcal{R}^T}$ can be obtained using several
techniques [16], [29]. The simplest approach consists of an item
profile based on Boolean occurrences, that is, setting $q_{mt} = 1$ when
the tag $t$ has been applied to the item $m$ and $q_{mt} = 0$ otherwise.</p>
        <p>
          It is important to note that the proposed method to obtain the
tag-based user profiles, using the pseudo-inverse, is equivalent to a
linear regression. Therefore, the tags should be independent
of each other. That independence can be promoted by grouping tags
that are morphologically related using stemmers and lemmatizers.
Lops et al. [17] went further, grouping semantically related tags
using WordNet synsets [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>Item profiles with graded, instead of Boolean, relevance scores can
be obtained with more sophisticated methods. For instance, Vig et
al. (2012) [26] obtained the tag genome (a tag-based item profile
for movies) by training a support vector regressor [24]. The
training data came from a survey applied to users of the
MovieLens system. The users were asked to estimate the
relevance of the tags applied to selected movies. With these
answers and a set of features extracted from movie reviews,
textual descriptions, metadata and tag applications, among others,
they trained a regressor whose predictions were used as relevance
scores.</p>
      </sec>
      <sec id="sec-6-5">
        <title>3.2.5 Hybrid and Updatable Rating Estimation</title>
        <p>The proposed method for generating the rating predictions is a
combination of matrix factorization (subsection 3.1) and the user
profiles proposed in subsections 3.2.3 and 3.2.4. The aim of the
method is threefold. First, we look for rating predictions as good
as the ones produced by matrix factorization. Second, the method
should be hybrid, that is, a combination of the collaborative
filtering approach of matrix factorization and the content
information from keywords or tags. Third, the users should be
able to edit their keyword-based (or tag-based) user profiles, and
the rating predictions must be updated with little computational
cost. The method comprises four steps:</p>
        <list list-type="order">
          <list-item><p>An initial matrix of rating estimations is obtained using
matrix factorization: $\hat{R}^0_{U \times M} = P_{U \times f} \cdot (Q_{M \times f})^T$.</p></list-item>
          <list-item><p>An initial matrix of keyword-based user profiles is obtained:
$P^0_{U \times W} = \hat{R}^0_{U \times M} \cdot ((Q_{M \times W})^T)^+$.</p></list-item>
          <list-item><p>The matrix $\Delta_{U \times W}$, containing the users' edit operations on
their profiles (positive or negative differences), is added to
obtain updated user profiles: $P_{U \times W} = P^0_{U \times W} + \Delta_{U \times W}$.</p></list-item>
          <list-item><p>Estimations are obtained by: $\hat{R}_{U \times M} = P_{U \times W} \cdot (Q_{M \times W})^T$.</p></list-item>
        </list>
        <p>
          These four steps can be expressed in a single expression:
          <disp-formula>$$\hat{R}_{U \times M} = \left( P_{U \times f} \cdot (Q_{M \times f})^T \cdot ((Q_{M \times W})^T)^+ + \Delta_{U \times W} \right) \cdot (Q_{M \times W})^T$$</disp-formula>
Note that $((Q_{M \times W})^T)^+ \cdot (Q_{M \times W})^T \cong I_{M \times M}$ (the identity
matrix) only when the item profiles are linearly independent.
The contrary is the common case. Thus, this matrix
multiplication infers the affinities among the items induced by the
keyword content information. In a final post-processing step, the
values in each row of the output matrix $\hat{R}_{U \times M}$ are standardized to
the interval $[-1, 1]$. The final rating predictions are obtained by
adding to each estimated rating the average rating of the movie
and the user’s bias. The user bias is the average deviation of the
user’s ratings from the average of the entire set of ratings. The
rating estimation using tag-based user profiles is the same, but
replacing $Q_{M \times W}$ by $Q_{M \times T}$.
        </p>
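        <p>Steps 3 and 4 are the ones that run whenever a user edits a profile, so they can be sketched separately. The fragment below assumes that the initial profiles from step 2 are already available; the data and function names are hypothetical, and the final standardization and bias terms are left out.</p>
        <preformat>
```python
def matmul_transposed(P, Q):
    """Return P * Q^T, i.e. one estimated rating per (user, item) pair."""
    return [[sum(p * q for p, q in zip(p_row, q_row)) for q_row in Q] for p_row in P]


def apply_edits(P0, delta):
    """Step 3: add the users' edit operations to the initial profiles."""
    return [[p + d for p, d in zip(p_row, d_row)] for p_row, d_row in zip(P0, delta)]


# Hypothetical data: 1 user, 3 keywords, 2 movies.
P0 = [[0.4, -0.2, 0.0]]            # initial keyword-based profile (from step 2)
Q = [[1.0, 0.0, 0.5],              # keyword relevances of movie 0
     [0.0, 1.0, 0.5]]              # keyword relevances of movie 1
delta = [[0.0, 0.0, 1.0]]          # the user raises the weight of keyword 2

before = matmul_transposed(P0, Q)
after = matmul_transposed(apply_edits(P0, delta), Q)   # step 4
print(before, after)
```
        </preformat>
        <p>Only one addition and one matrix product are needed after an edit, which is what keeps the update cheap; the factorization and the pseudo-inverse are not recomputed.</p>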
      </sec>
    </sec>
    <sec id="sec-7">
      <title>4. EXPERIMENTATION</title>
      <p>The experiments aim to evaluate the accuracy of the
recommendations produced by the proposed methods. This
section describes the data and the
evaluation measure used to compare the proposed models against
the baselines.</p>
      <p>4.1 Data</p>
      <p>This subsection explains how the dataset was obtained and
preprocessed, and reports its content, size and distribution.</p>
      <sec id="sec-7-1">
        <title>4.1.1 Movies Collaborative Data</title>
        <p>
          The dataset of users, movies and ratings was obtained from a
production database dump of the MovieLens system in April
2012. From this dataset, we extracted a subset by filtering the
users and movies with more than 1,000 ratings. This filtering
produced a subset of 200 users, 1,462 movies and 150,915 ratings.
The rating scale in MovieLens is the usual interval $[1, 5]$,
with 5 as the maximum grade of preference. The distribution of
ratings in our dataset is shown in Figure 1. The average number of
ratings per movie is 101.6 (σ = 37.5), and per user is 742.5
(σ = 188.5).
        </p>
      </sec>
      <sec id="sec-7-2">
        <title>4.1.2 Textual Descriptions of the Movies</title>
        <p>Textual descriptions were obtained from the synopsis field in the
movie records from the Netflix public API<sup>4</sup> during the year 2012.
These texts were assigned to movies in the MovieLens dataset by
a mapping obtained through a research collaboration with the
GroupLens<sup>5</sup> research group.</p>
        <p>These textual descriptions were represented in a vectorial
bag-of-words model. The dimensionality of that representation was
reduced with the aim of obtaining a vocabulary based on
popularity and informativeness. Thus, a vocabulary of 5,848
words was obtained using the following series of ad hoc
preprocessing actions: (1) all characters were converted to their lowercase
equivalents; (2) people's first and last names were concatenated
with the underscore character; (3) numeric tokens were removed;
(4) 334 stop words taken from the source code of the gensim<sup>6</sup>
framework were removed; (5) words occurring in fewer than 10
synopses or in more than 95% of the synopses were
removed; and finally (6) all punctuation marks were removed.
The term weights used to register the relevance of a word in a
synopsis vector were obtained with the Okapi BM25 retrieval
formula [19] using the method proposed by Vanegas et al. [25].
Thus, the weight $w(w, d)$ of a word $w$ in a document (synopsis) $d$
is given by:
          <disp-formula>$$w(w, d) = \log\left( \frac{|M| - df(w)}{df(w)} \right) \cdot \frac{(k + 1) \, tf(w, d)}{K + tf(w, d)}, \qquad K = k \left( (1 - b) + b \, \frac{dl(d)}{avdl} \right)$$</disp-formula>
        </p>
        <p>where $df(w)$ is the number of documents in which $w$ occurs,
$|M| = 1{,}462$ is the number of movies, $tf(w, d)$ is the number of
occurrences of word $w$ in the document $d$, $dl(d)$ is the length of
document $d$, and $avdl = 33$ is the
average document length. The additional parameters were
$k = 1.2$ and $b = 0.75$ (see [e]). A pair of examples of the
resulting keyword vectors using the proposed method is shown in
Table 2. The aggregation of the vectors obtained from the synopses
produces the item profile matrix $Q_{M \times W}$, whose dimensions are
$|M| = 1{,}462$ movies (rows) by $|W| = 5{,}848$ words (columns). This
matrix is sparse, having only 0.518% of non-zero entries.</p>
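        <p>The weighting scheme can be sketched as a small Python function. The parameter defaults follow the values reported above (k = 1.2, b = 0.75); the function name, the argument names and, in particular, the exact form of the idf factor used here, log((N - df) / df), are assumptions made for illustration.</p>
        <preformat>
```python
import math


def bm25_weight(tf, df, dl, n_docs, avdl, k=1.2, b=0.75):
    """Weight of a word in one synopsis: idf factor times saturated term frequency."""
    if tf == 0:
        return 0.0
    idf = math.log((n_docs - df) / df)          # assumed form of the idf factor
    K = k * ((1 - b) + b * dl / avdl)           # document-length normalization
    return idf * ((k + 1) * tf) / (K + tf)


# A word that occurs once in a synopsis of average length (33 tokens)
# and appears in 10 of the 1,462 synopses.
print(bm25_weight(tf=1, df=10, dl=33, n_docs=1462, avdl=33))
```
        </preformat>
        <p>For a fixed document length, repeating a word increases its weight with diminishing returns, while words that appear in many synopses receive a smaller idf factor.</p>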
      </sec>
      <sec id="sec-7-3">
        <title>4.1.3 Social Tags</title>
        <p>The tag set used to characterize the movies is the selection of tags
proposed by Vig et al. in “The Tag Genome” [26]. This tag set is a
subset of 1,128 tags out of nearly 30,000 unique tags freely
applied by 416 users in the MovieLens system. This subset was
obtained by removing tags with fewer than 10 applications,
misspellings, people's names and near duplicates. Thereafter, they
selected the top 5% of tags ranked with an entropy-based quality
measure proposed by Sen et al. [22]. Only 1,081 tags from the tag
genome’s set occurred in the 1,462 movies in the item profile
matrix $Q_{M \times T}$.</p>
        <sec id="sec-7-3-1">
          <p><sup>4</sup> http://developer.netflix.com; <sup>5</sup> http://www.grouplens.org; <sup>6</sup> http://radimrehurek.com/gensim</p>
          <p>There are 13,332 tag associations to the movies considered in this
study. 1,370 movies have at least one associated tag, with an
average of 9.7 tags per movie (σ = 8.5). Besides, every tag was
assigned to at least one movie. The distribution of the tag
applications is considerably more uniform than a Zipf
distribution: the 108 most frequent tags (10%) represent
only 42% of the tag associations. This can be roughly seen in
Table 3, which shows tag samples selected from uniformly
separated rank ranges. The association of movies and tags produces
the item profile matrix $Q_{M \times T}$ (1,462 movies by 1,082 tags), with
binary entries and a density of 0.844% (also very sparse).</p>
          <p>Figure 1. Distribution of ratings in the dataset (recovered values: 12,989; 43,068; 55,025; 27,193; 10,229).</p>
          <p>To evaluate the performance of the proposed methods, we
used two validation scenarios with 10 folds each: cross validation
and product cold start [24]. In the cross-validation scenario, the
ratings were divided into ten randomized folds. In each fold, 90% of
the ratings were used for training and the remaining 10% for
testing. In the product-cold-start scenario, the procedure for
extracting the training and test datasets is the same, but all the
ratings of the movies in the test set are removed from the training set.</p>
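          <p>The two validation scenarios can be sketched as follows. This is a literal reading of the description above, with ratings represented as hypothetical (user, movie, rating) triples; the function names, the seed and the way the folds are dealt are assumptions.</p>
          <preformat>
```python
import random


def ten_folds(ratings, k=10, seed=0):
    """Shuffle (user, movie, rating) triples and deal them into k folds."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def split(folds, test_index, cold_start=False):
    """Use one fold for testing and the remaining folds for training."""
    test = folds[test_index]
    train = [t for i, fold in enumerate(folds) if i != test_index for t in fold]
    if cold_start:
        test_movies = {movie for _, movie, _ in test}
        # Product cold start: test movies keep no ratings in the training set.
        train = [t for t in train if t[1] not in test_movies]
    return train, test


# Hypothetical data: 10 users who each rated the same 10 movies.
ratings = [(u, m, 3) for u in range(10) for m in range(10)]
folds = ten_folds(ratings)
train, test = split(folds, 0)
print(len(train), len(test))
```
          </preformat>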
          <p>The evaluation measure used to assess the accuracy of the
recommendations is the root-mean-square error (RMSE), defined as:
            <disp-formula>$$RMSE = \sqrt{\frac{\sum_{r_{um} \in test} (\hat{r}_{um} - r_{um})^2}{|test|}}$$</disp-formula>
where $test$ is the test set of ratings and $|test|$ is its cardinality.</p>
          <p>
            Given that the methods proposed in section 3 provide rating
estimations standardized in the $[-1, 1]$ interval, $\hat{r}_{um}$ is obtained
by adding to these estimations the average of all the training ratings
and the user’s bias. Similarly, the baseline for the cold-start test
scenario is a simple recommender system that predicts ratings
based only on the average of all the training ratings plus the user’s
bias. The baseline method for the “warm”-start scenario is the
recommender system based on matrix factorization presented in
subsection 3.1. In all experiments, the number of latent factors
was set to 30, $\lambda = 0.07$, and the objective function was minimized
using the LBFGSB optimization method [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ].
          </p>
          <p>Note that the matrix factorization method cannot be applied in the
cold-start scenario because movies without ratings cannot be
represented in the latent factor space. Consequently, for this
scenario, the method proposed in subsection 3.2.5 uses $R_{U \times M}$
instead of $\hat{R}^0_{U \times M}$ in the second step, and the first step must be
skipped.</p>
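          <p>The evaluation measure and the average-based baseline are simple enough to sketch directly. The data below are hypothetical (user, movie, rating) triples and the function names are assumptions; the baseline predicts the global training average plus the user's bias, as described above.</p>
          <preformat>
```python
import math


def rmse(pairs):
    """Root-mean-square error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((p - r) ** 2 for p, r in pairs) / len(pairs))


def average_plus_bias(train):
    """Baseline: global training average plus each user's average deviation from it."""
    global_avg = sum(r for _, _, r in train) / len(train)
    deviations = {}
    for user, _, r in train:
        deviations.setdefault(user, []).append(r - global_avg)
    bias = {user: sum(d) / len(d) for user, d in deviations.items()}
    return lambda user: global_avg + bias.get(user, 0.0)


train = [("a", 1, 5), ("a", 2, 4), ("b", 1, 2), ("b", 3, 1)]   # hypothetical triples
test = [("a", 3, 4), ("b", 2, 2)]
predict = average_plus_bias(train)
print(rmse([(predict(u), r) for u, _, r in test]))
```
          </preformat>
          <p>A user who does not appear in the training data receives a bias of zero in this sketch, so the prediction falls back to the global average.</p>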
        </sec>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>5. RESULTS AND DISCUSSION</title>
    </sec>
    <sec id="sec-9">
      <title>5.1 Recommendations Accuracy</title>
      <p>The results of our experiments are presented in Table 4. The first
two rows show the results for the baseline methods in
each of our test settings. The remaining two rows show the
results obtained by the proposed methods presented in subsection
3.2.4. For each system, the “RMSE” columns present the average
over the 10 folds, and the columns labeled
"σ" report the standard deviation.</p>
      <p>Table 4. Methods compared (rows of the METHOD column): system average + user’s bias; matrix factorization; keyword-based user profiles; tag-based user profiles.</p>
      <p>
Regarding the “warm” scenario (i.e., cross-validation), the
results show that the two proposed methods based on
user profiles outperformed the baseline matrix factorization
method. In particular, the margin obtained by the keyword-based
user profile system was significant, being more than 3
standard deviations apart. The proposed methods thus reached
a performance level comparable to the state of the art for the rating prediction
task. Unlike matrix factorization, our recommendations were
produced by a fully interpretable model, which is suitable for better user
interaction and better explanations.</p>
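      <p>The comparison in terms of standard deviations can be sketched as follows. The per-fold values below are hypothetical placeholders, not the results of Table 4, and two choices are assumptions of this sketch: the sample standard deviation is used, and the margin is expressed in units of the larger of the two deviations.</p>
      <preformat>
```python
import statistics

def summarize(fold_rmse):
    """Mean and sample standard deviation of per-fold RMSE values."""
    return statistics.mean(fold_rmse), statistics.stdev(fold_rmse)

def compare(baseline_folds, method_folds):
    base_mean, base_sd = summarize(baseline_folds)
    meth_mean, meth_sd = summarize(method_folds)
    # Assumption: the margin is measured against the larger deviation.
    sigma = max(base_sd, meth_sd)
    margin = (base_mean - meth_mean) / sigma
    # Relative RMSE reduction with respect to the baseline, in percent.
    improvement = 100.0 * (base_mean - meth_mean) / base_mean
    return margin, improvement

# Hypothetical per-fold RMSE values (4 folds for brevity; the paper uses 10).
baseline_folds = [1.00, 1.02, 0.98, 1.00]
method_folds = [0.90, 0.92, 0.88, 0.90]

margin, improvement = compare(baseline_folds, method_folds)
print(margin, improvement)
```
      </preformat>
      <p>A margin above 3 in this sketch corresponds to the “more than 3 standard deviations” criterion used in the text.</p>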
      <p>The cold-start evaluation setting was clearly more challenging:
our systems barely outperformed the proposed average-based
baseline. However, the proposed tag- and keyword-based systems
can give the user mechanisms to make the
system “warmer” with little effort. Accurate methods such as
matrix factorization require a considerable number of initial
ratings before they start to produce good predictions. In contrast,
our methods provide a completely customizable user profile with
just a small number of initial ratings.</p>
      <p>Comparing the tag-based and keyword-based models, the results
show that keyword-based user profiling performs better in
“warm” conditions and slightly better in “cold” conditions.</p>
    </sec>
    <sec id="sec-10">
      <title>5.2 Visualizing User Profiles</title>
      <p>To visualize the profiles, we selected User 156 from
fold 1 of our dataset. Note that users in our data are
completely anonymous. This user was manually chosen based on
the user-to-user pairwise Pearson correlation matrix obtained from
the keyword-based user profiles. Comparing these
correlations, we observed that User 156 had both strongly negative and
strongly positive correlations with other users. We therefore considered
that the preferences and dislikes of this user were shared by
several users and rejected by others, which makes
this user an interesting candidate for visualization. In Figure 2, the
keyword-based profile of User 156 is shown together with
those of the 10 nearest users according to the user-to-user correlation
matrix. The ranked list of keywords that this user prefers the most
is shown on the left side; the right side shows the list of the most
disliked keywords. The profile of User 156 is represented by the thick
black line. Figure 3, in turn, shows the same plots using
tag-based user profiles instead of keyword-based ones.</p>
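      <p>The selection of nearest users and the ranked lists of preferred and disliked terms can be sketched as below. This is only an illustration under stated assumptions: profiles are plain lists of weights over a shared vocabulary, and the user identifiers, vocabulary and weights are hypothetical.</p>
      <preformat>
```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long weight vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx * sy else 0.0

def nearest_users(profiles, target, k):
    """The k users whose profiles correlate most with the target's profile."""
    scores = {u: pearson(profiles[target], p)
              for u, p in profiles.items() if u != target}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def likes_and_dislikes(profile, vocabulary, n):
    """Top-n most preferred and top-n most disliked terms of a profile."""
    ranked = sorted(zip(vocabulary, profile), key=lambda pair: pair[1], reverse=True)
    return ranked[:n], ranked[::-1][:n]

# Hypothetical vocabulary and profiles.
vocabulary = ["action", "romance", "space", "musical"]
profiles = {
    "u156": [0.9, -0.8, 0.5, -0.6],
    "a": [0.8, -0.7, 0.4, -0.5],
    "b": [-0.9, 0.8, -0.5, 0.6],
    "c": [0.1, 0.2, 0.9, -0.3],
}

neighbours = nearest_users(profiles, "u156", 2)
likes, dislikes = likes_and_dislikes(profiles["u156"], vocabulary, 2)
print(neighbours, likes, dislikes)
```
      </preformat>
      <p>In this toy example, user “b” has the opposite profile of “u156” and is therefore excluded from the neighbours, which mirrors the strongly negative correlations mentioned above.</p>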
      <p>It is now possible to qualitatively compare a user’s keyword-based
profile with the corresponding tag-based profile. From this comparison, we observe that
User 156’s tag-based profile is more cohesive than
the keyword-based profile. This cohesiveness can be observed in the
semantic relatedness of the tag set. In this profile, 20 out of 40
tags preferred by User 156 are related to action and teen movies.
These tags are: Dark hero, Effects, Explosions, Indiana jones,
German, Drug addiction, Arms dealer, Weapons, Life &amp; death,
Videogame, First contact, Comic book adapt, Bond, 007 series,
Stop motion, Fantasy world, Dreamworks, Video games, Harry
potter, Emma Watson. In contrast, the keyword set of the
keyword-based profile does not exhibit a clear pattern. Although
these particular observations cannot be generalized, we think
that they open an interesting research direction concerning
the need to measure the semantic cohesiveness of the
produced profiles.</p>
      <p>Concerning the potential for interaction, we have not yet conducted
any experiments with users, but it seems reasonable to expect that users will
understand the general interaction idea. We expect that
users will be inclined to experiment with their own profiles by
varying the level of preference or dislike for the most relevant
tags or keywords. Also, the ability
to see the profiles of similar users could motivate users to
interact with the interface, because showing other people’s
behavior allows a kind of warm start with the system.</p>
      <p>New concerns arise from the observation of these profiles. For
instance, what should be done with “negative” tags that appear in
the list of preferred tags of a user? This situation is illustrated by
the tag “boring!” in User 156’s “likes” list.</p>
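      <p>The kind of interaction envisaged here can be sketched with the simple additive weighting (SAW) model on which the profiles are based: an item score is the sum of the user’s weights multiplied by the item’s attribute values, so editing a weight immediately changes the ranking. The item names, tags and weights below are hypothetical.</p>
      <preformat>
```python
def saw_score(profile, item):
    """Simple additive weighting: sum of weight * attribute value."""
    return sum(weight * item.get(tag, 0.0) for tag, weight in profile.items())

def rank_items(profile, items):
    """Item names sorted from highest to lowest SAW score."""
    return sorted(items, key=lambda name: saw_score(profile, items[name]), reverse=True)

# Hypothetical item descriptions (tag relevance values) and user profile.
items = {
    "movie_a": {"explosions": 0.9, "boring!": 0.1},
    "movie_b": {"explosions": 0.2, "fantasy world": 0.8},
}
profile = {"explosions": 0.7, "fantasy world": 0.1, "boring!": 0.3}

before = rank_items(profile, items)

# The user edits two weights in the interface (hypothetical interaction).
edited = {**profile, "fantasy world": 0.9, "explosions": 0.1}
after = rank_items(edited, items)
print(before, after)
```
      </preformat>
      <p>Because the model is linear, the effect of each edit is directly traceable to a single tag or keyword, which is what makes the profile usable as an interface element.</p>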
      <p>
        This tag is probably reasonable and predictive for some users,
so it should perhaps not be removed from the tag set. However, harsh
criticism of users’ tastes should be avoided. A possible
solution to this problem would be the use of a linear regression
algorithm, similar to the one used in a previous work [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], that
could estimate a weight for each tag in order to determine whether the tag is
intrinsically positive or negative. Thus, if a tag has a negative
connotation, it could be filtered from the list of “liked” tags.
      </p>
      <p>
In the warm-start scenario, when the keyword-based and
the tag-based profiling methods are compared, we observed
that the keyword-based method was considerably more accurate than
the matrix factorization method. Its RMSE decreased by
5.63% (more than 3 times σ), while the difference in error
for the tag-based method was only 1.00%. Consequently, the
proposed keyword-based method is able to
improve on the matrix factorization approach.
      </p>
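      <p>The tag-polarity idea mentioned above can be sketched as a least-squares regression of centered item ratings on tag indicators. This is not the algorithm of the cited previous work; it is a hypothetical illustration that fits the weights by plain batch gradient descent, and the tags, ratings and the filtering threshold are assumptions.</p>
      <preformat>
```python
def fit_tag_weights(rows, ratings, steps=2000, rate=0.1):
    """Least-squares tag weights for centered ratings, via batch gradient descent."""
    n, d = len(rows), len(rows[0])
    mean = sum(ratings) / n
    targets = [r - mean for r in ratings]
    w = [0.0] * d
    for _ in range(steps):
        # Residuals of the current linear model on every item.
        errors = [sum(wj * xj for wj, xj in zip(w, x)) - t
                  for x, t in zip(rows, targets)]
        for j in range(d):
            grad = 2.0 * sum(e * x[j] for e, x in zip(errors, rows)) / n
            w[j] -= rate * grad
    return w

# Hypothetical data: tag indicators per item and the item's average rating.
tags = ["masterpiece", "boring!", "explosions"]
rows = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1], [0, 0, 1], [0, 0, 0]]
ratings = [4.5, 4.4, 2.0, 2.1, 3.2, 3.3]

weights = dict(zip(tags, fit_tag_weights(rows, ratings)))

# Assumption: tags whose weight falls below this threshold are treated as
# intrinsically negative and hidden from the "liked" list.
threshold = -0.5
liked = ["explosions", "boring!", "masterpiece"]
filtered = [t for t in liked if weights[t] > threshold]
print(weights, filtered)
```
      </preformat>
      <p>In this toy example the weight of “boring!” is clearly negative, so the tag is removed from the displayed list while remaining available to the predictive model.</p>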
    </sec>
    <sec id="sec-11">
      <title>6. CONCLUSIONS</title>
      <p>We proposed a generic method to extract user profiles in
interpretable spaces in which it is possible to directly characterize
items from the collection. The proposed user-profiling methods
were built in two different spaces: keywords and tags. In addition,
the proposed models are suitable for user interaction through the user
profile component.</p>
      <p>The proposed user-profiling methods were evaluated on a subset
of the MovieLens dataset and compared against strong baselines.
We conclude that, in “warm” scenarios, both methods produce
recommendations with accuracy equal to or better than that of
matrix factorization methods. In a cold-start scenario, both
methods performed slightly better than a recommender system
based on average ratings.</p>
      <p>Regarding the proposed visualization of the keyword-based and
the tag-based user profiles, we observed that the cohesion of a
profile is an important measure to take into account when two
profiling methods are compared. Non-cohesive profiles might be
misunderstood by users, leading them to avoid interacting with
those profiles. An interesting research question is how to
discriminate cohesive profiles from non-cohesive ones.</p>
      <p>The proposed approach also contributes to a better classification
of content-based recommendation techniques by separating the
user-profiling task from the item-profiling task, which suggests a
uniform framework to share and compare the contributions made
on each of these tasks.</p>
    </sec>
    <sec id="sec-12">
      <title>7. ACKNOWLEDGMENTS</title>
      <p>Our special thanks go to Prof. John Riedl and Daniel Kluver from
GroupLens, University of Minnesota; Prof. Shilad Sen of
Macalester College; and Prof. Fabio Gonzalez of the Universidad
Nacional de Colombia. The work was partially funded by the
Colombian Department for Science, Technology and Innovation
(Colciencias) via the grant 1101-521-28465 from “El Patrimonio
Autónomo Fondo Nacional de Financiamiento para la Ciencia, la
Tecnología y la Innovación, Francisco José de Caldas” and by the
Universidad Nacional de Colombia via the grant DIB QUIPU:
201010016956. The third author recognizes the support from
Mexican Government (SNI, COFAA-IPN, SIP 20131702,
CONACYT 50206-H) and CONACYT–DST India (grant 122030
“Answer Validation through Textual Entailment”).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Adomavicius</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manouselis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kwon</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Multi-Criteria Recommender Systems</article-title>
          .
          <source>Recommender Systems Handbook</source>
          .
          <fpage>769</fpage>
          -
          <lpage>803</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Man</given-names>
            <surname>Au Yeung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Gibbins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            and
            <surname>Shadbolt</surname>
          </string-name>
          ,
          <string-name>
            <surname>N.</surname>
          </string-name>
          <year>2008</year>
          .
          <article-title>A Study of User Profile Generation from Folksonomies</article-title>
          .
          <source>SWKM</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Becerra</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gelbukh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Visualizable and Explicable Recommendations Obtained from Price Estimation Functions</article-title>
          .
          <source>Proceedings of the Human Decision Making in Recommender Systems</source>
          (
          <year>2011</year>
          ),
          <fpage>27</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Bell</surname>
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koren</surname>
            <given-names>Y</given-names>
          </string-name>
          . and
          <string-name>
            <surname>Volinsky</surname>
            <given-names>C.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>The BellKor solution to the Netflix Prize</article-title>
          .
          <source>Technical report</source>
          , AT&amp;T Labs Research. (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Byrd</surname>
            ,
            <given-names>R.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nocedal</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>A limited memory algorithm for bound constrained optimization</article-title>
          .
          <source>SIAM J. Sci. Comput</source>
          .
          <volume>16</volume>
          ,
          <issue>5</issue>
          (Sep.
          <year>1995</year>
          ),
          <fpage>1190</fpage>
          -
          <lpage>1208</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Diederich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Iofciu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Finding Communities of Practice from User Profiles Based On Folksonomies</article-title>
          .
          <source>Proceedings of the 1st International Workshop on Building Technology Enhanced Learning solutions for Communities of Practice</source>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Fellbaum</surname>
          </string-name>
          , C. ed.
          <year>1998</year>
          .
          <article-title>WordNet An Electronic Lexical Database</article-title>
          . The MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>De</given-names>
            <surname>Gemmis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Lops</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Semeraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            and
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          <year>2008</year>
          .
          <article-title>Integrating tags in a semantic content-based recommender</article-title>
          .
          <source>Proceedings of the 2008 ACM conference on Recommender systems</source>
          (New York, NY, USA,
          <year>2008</year>
          ),
          <fpage>163</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Guan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>He</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <year>2010</year>
          .
          <article-title>Document recommendation in social tagging services</article-title>
          .
          <source>Proceedings of the 19th international conference on World wide web (</source>
          New York, NY, USA,
          <year>2010</year>
          ),
          <fpage>391</fpage>
          -
          <lpage>400</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Hassan-Montero</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Herrero-Solana</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Improving tag-clouds as visual information retrieval interfaces</article-title>
          .
          <source>International Conference on Multidisciplinary Information Sciences and Technologies</source>
          (
          <year>2006</year>
          ),
          <fpage>25</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Herlocker</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstan</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Riedl</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2000</year>
          .
          <article-title>Explaining collaborative filtering recommendations</article-title>
          . (
          <year>2000</year>
          ),
          <fpage>241</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Yoon</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          <year>1981</year>
          .
          <article-title>Multiple Attribute Decision Making. Methods and Applications</article-title>
          . Springer-Verlag, NY. (
          <year>1981</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Yoon</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          <year>1981</year>
          .
          <article-title>Multiple Attribute Decision Making. Methods and Applications</article-title>
          . Springer-Verlag, NY. (
          <year>1981</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          <year>1972</year>
          .
          <article-title>A statistical interpretation of term specificity and its application in retrieval</article-title>
          .
          <source>Journal of Documentation</source>
          .
          <volume>28</volume>
          , (
          <year>1972</year>
          ),
          <fpage>11</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Koren</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bell</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Volinsky</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Matrix Factorization Techniques for Recommender Systems</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          Computer.
          <volume>42</volume>
          ,
          <issue>8</issue>
          (Aug.
          <year>2009</year>
          ),
          <fpage>30</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Lops</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gemmis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Semeraro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Content-based Recommender Systems: State of the Art and Trends</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Shapira</surname>
          </string-name>
          , and P.B. Kantor, eds. Springer US.
          <volume>73</volume>
          -
          <fpage>105</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Lops</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gemmis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Narducci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bux</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>A Semantic Content-Based Recommender System Integrating Folksonomies for Personalized Access</article-title>
          .
          <article-title>Web Personalization in Intelligent Environments</article-title>
          . G. Castellano,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <surname>A</surname>
          </string-name>
          . Fanelli, eds.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          Springer Berlin Heidelberg. 27-
          <fpage>47</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Penrose</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Todd</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          <article-title>On best approximate solutions of linear matrix equations</article-title>
          .
          <source>Mathematical Proceedings of the Cambridge Philosophical Society</source>
          ,
          <volume>01</volume>
          ,
          <fpage>17</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>How Okapi Came to TREC. TREC: Experiment in Information Retrieval</article-title>
          . MIT Press.
          <volume>287</volume>
          -
          <fpage>300</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Salton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>A.K.C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Yang</surname>
          </string-name>
          , C.-S.
          <year>1975</year>
          .
          <article-title>A vector space model for automatic indexing</article-title>
          .
          <source>Commun</source>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Schein</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennock</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ungar</surname>
          </string-name>
          <year>2002</year>
          .
          <article-title>Methods and metrics for cold-start recommendations</article-title>
          .
          <source>SIGIR</source>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Sen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harper</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>LaPitz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Riedl</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>The quest for quality tags</article-title>
          .
          <source>Proceedings of the 2007 International ACM Conference on Supporting Group Work</source>
          (
          <year>2007</year>
          ),
          <fpage>361</fpage>
          -
          <lpage>370</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Sen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vig</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Riedl</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Learning to recognize valuable tags</article-title>
          .
          <source>Proceedings of the 13th International Conference on Intelligent User Interfaces (Sanibel Island</source>
          , Florida, USA,
          <year>2009</year>
          ),
          <fpage>87</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Smola</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Schölkopf</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>1998</year>
          .
          <article-title>A Tutorial on Support Vector Regression,</article-title>
          . Royal Holloway College, London, U.K.,
          <source>NeuroCOLT Tech. Rep</source>
          ..
          <source>TR 1998-030</source>
          ,
          <year>1998</year>
          . (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <surname>Vanegas</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caicedo</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Camargo</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ramos-Pollán</surname>
          </string-name>
          ,
          <string-name>
            <surname>R.</surname>
          </string-name>
          <year>2012</year>
          . Bioingenium at ImageCLEF 2012:
          <article-title>Textual and Visual Indexing for Medical Images</article-title>
          . CLEF (Online Working Notes/Labs/Workshop) (Rome, Italy,
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Vig</surname>
            , Jesse, Sen,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Riedl</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>The Tag Genome: Encoding Community Knowledge to Support Novel Interaction</article-title>
          .
          <source>ACM Transactions on Interactive Intelligent Systems. 2</source>
          ,
          <issue>3</issue>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <surname>Yeh</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>A problem based selection of multiattribute decision-making methods</article-title>
          .
          <source>International Transactions in Operational Research</source>
          .
          <volume>9</volume>
          ,
          <issue>2</issue>
          (Mar.
          <year>2002</year>
          ),
          <fpage>169</fpage>
          -
          <lpage>181</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <surname>Yoon</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>Multiple Attribute Decision Making. An introduction</article-title>
          . Sage university papers series, no.
          <fpage>07</fpage>
          -
          <lpage>104</lpage>
          . Thousand Oaks, CA: Sage Publications.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.-K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and Zhang, Y.-C.
          <year>2011</year>
          .
          <article-title>Tag-Aware Recommender Systems: A State-of-the-Art Survey</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <source>Journal of Computer Science and Technology</source>
          .
          <volume>26</volume>
          ,
          <issue>5</issue>
          (Sep.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <surname>Zopounidis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Doumpos</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Multicriteria classification and sorting methods: A literature review</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <source>European Journal of Operational Research</source>
          .
          <volume>138</volume>
          ,
          <issue>2</issue>
          (Apr.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>