=Paper=
{{Paper
|id=Vol-1245/paper6
|storemode=property
|title=Exploiting Social Tags in Matrix Factorization Models for Cross-domain Collaborative Filtering
|pdfUrl=https://ceur-ws.org/Vol-1245/cbrecsys2014-paper06.pdf
|volume=Vol-1245
|dblpUrl=https://dblp.org/rec/conf/recsys/Fernandez-TobiasC14
}}
==Exploiting Social Tags in Matrix Factorization Models for Cross-domain Collaborative Filtering==
Ignacio Fernández-Tobías, Iván Cantador
Escuela Politécnica Superior
Universidad Autónoma de Madrid
28049 Madrid, Spain
{ignacio.fernandezt, ivan.cantador}@uam.es
ABSTRACT

Cross-domain recommender systems aim to generate or enhance personalized recommendations in a target domain by exploiting knowledge (mainly user preferences) from other source domains. Due to the heterogeneity of item characteristics across domains, content-based recommendation methods are difficult to apply, and collaborative filtering has become the most popular approach to cross-domain recommendation. Nonetheless, recent work has shown that the accuracy of cross-domain collaborative filtering based on matrix factorization can be improved by means of content information; in particular, social tags shared between domains. In this paper, we review state of the art approaches in this direction, and present an alternative recommendation model based on a novel extension of the SVD++ algorithm. Our approach introduces a new set of latent variables, and enriches both user and item profiles with independent sets of tag factors, better capturing the effects of tags on ratings. Evaluating the proposed model in the movies and books domains, we show that it can generate more accurate recommendations than existing approaches, even in cold-start situations.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – information filtering. G.1.3 [Numerical Analysis]: Numerical Linear Algebra – singular value decomposition.

General Terms
Algorithms, Performance, Experimentation.

Keywords
Recommender systems, collaborative filtering, cross-domain recommendation, social tagging.

1. INTRODUCTION

Recommender systems [2] have been successfully used in numerous domains and applications to identify potentially relevant items for users according to their preferences (tastes, interests and goals). Examples include suggested movies and TV programs in Netflix¹, music albums in Last.fm², and books in Barnes&Noble³.

Even though the majority of recommender systems focus on a single domain or type of item, there are cases in which providing the user with cross-domain recommendations could be beneficial. For instance, large e-commerce sites like Amazon⁴ and eBay⁵ collect user feedback for items from multiple domains, and in social networks users often share their tastes and interests on a variety of topics. In these cases, rather than exploiting user preference data from each domain independently, recommender systems could exploit more exhaustive, multi-domain user models that allow generating item recommendations spanning several domains. Furthermore, exploiting additional knowledge from related, auxiliary domains could help improve the quality of item recommendations in a target domain, e.g. addressing the cold-start and sparsity problems [7].

These benefits rely on the assumption that there are similarities or relations between user preferences and/or item attributes from different domains. When such correspondences exist, one way to exploit them is by aggregating knowledge from the involved domain data sources, for example by merging user preferences into a unified model [1], and by combining single-domain recommendations [3]. An alternative way consists of transferring knowledge from a source domain to a target domain, for example by sharing implicit latent features that relate source and target domains [15][17], and by exploiting implicit rating patterns from source domains in the target domain [9][14].

In either of the above cases, most of the existing approaches to cross-domain recommendation are based on collaborative filtering, since it merely needs rating data, and does not require information about the users' and items' characteristics, which are usually highly heterogeneous among domains.

However, inter-domain links established through content-based features and relations may have several advantages, such as a better interpretability of the cross-domain user models and recommendations, and the establishment of more reliable methods to support the knowledge transfer between domains. In particular, social tags assigned to different types of items – such as movies, music albums, and books – may act as a common vocabulary between domains [6][17]. Hence, as domain independent content-based features, tags can be used to overcome the information heterogeneity across domains, and are suitable for building the above mentioned inter-domain links.

In this paper, we review state of the art cross-domain recommendation approaches that utilize social tags to exploit knowledge from an auxiliary source domain for enhancing collaborative filtering rating predictions in a target domain.

¹ Netflix online movies & TV shows provider, http://www.netflix.com
² Last.fm music discovery service, http://www.lastfm.com
³ Barnes&Noble retail bookseller, http://www.barnesandnoble.com
⁴ Amazon e-commerce website, http://www.amazon.com
⁵ eBay consumer-to-consumer website, http://www.ebay.com

Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. Copyright remains with the authors and/or original copyright holders. CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.
Specifically, we focus on several extensions of the matrix factorization technique proposed in [6], which incorporates latent factors related to the users' social tags. By jointly learning tag factors in both the source and target domains, hidden correlations between ratings and tags in the source domain can be used in the target domain. Hence, for instance, a movie recommender system may estimate a higher rating for a particular movie tagged as interesting or amazing if these tags are usually assigned to positively rated books. Also, books tagged as romantic or suspenseful may be recommended to a user if it is found that such tags correlate with high movie ratings.

Enrich et al. [6] presented several recommendation models that exploit different sets of social tags when computing rating predictions, namely the tags assigned by the active user to the item for which the rating is estimated, and all the tags assigned by the community to the target item. Despite their good performance, these models do have difficulties in cold-start situations where no tagging information is available for the target user/item.

In this paper, we propose a method that expands the users' and items' profiles to overcome these limitations. More specifically, we propose to incorporate additional parameters into the above models, separating user and item latent tag factors in order to capture the contributions of each to the ratings more accurately. Furthermore, by modeling user and item tags independently we are able to compute rating predictions even when a user has not assigned any tag to an item, or for items that have not been tagged yet. For such purpose, we adapt the gSVD++ algorithm [10] – designed to integrate content metadata into the matrix factorization process – for modeling social tags in the cross-domain recommendation scenario.

Through a series of experiments in the movies and books domains, we show that the proposed approach outperforms the state of the art methods, and validate the main contribution of this work: a model that separately captures user and item tagging information, and effectively transfers auxiliary knowledge to the target domain in order to provide cross-domain recommendations.

The remainder of the paper is structured as follows. In section 2 we review state of the art approaches to the cross-domain recommendation problem, focusing on algorithms based on matrix factorization, and on algorithms that make use of social tags to relate the domains of interest. In section 3 we provide a brief overview of matrix factorization methods for single-domain recommendation, and in section 4 we describe their extensions for the cross-domain recommendation case. In section 5 we present and discuss the conducted experimental work and obtained results. Finally, in section 6 we summarize some conclusions and future research lines.

2. RELATED WORK

Cross-domain recommender systems aim to generate or enhance personalized recommendations in a target domain by exploiting knowledge (mainly user preferences) from other source domains [7][19]. This problem has been addressed from various perspectives in several research areas. It has been faced by means of user preference aggregation and mediation strategies for the cross-system personalization problem in user modeling [1][3][16], as a potential solution to mitigate the cold-start and sparsity problems in recommender systems [5][17][18], and as a practical application of knowledge transfer techniques in machine learning [9][14][15].

We can distinguish between two main types of cross-domain approaches: those that aggregate knowledge from various source domains to perform recommendations in a target domain, and those that link or transfer knowledge between domains to support recommendations in the target domain.

The knowledge aggregation methods merge user preferences (e.g. ratings, social tags, and semantic concepts) [1], mediate user modeling data exploited by various recommender systems (e.g. user similarities and user neighborhoods) [3][16], and combine single-domain recommendations (e.g. rating estimations and rating probability distributions) [3]. The knowledge linkage and transfer methods relate domains by common information (e.g. item attributes, association rules, semantic networks, and inter-domain correlations) [5][18], share implicit latent features that relate source and target domains [15][17], and exploit explicit or implicit rating patterns from source domains in the target domain [9][14].

Cross-domain recommendation models based on latent factors are a popular choice among knowledge linkage and transfer methods, since they allow automatically discovering and exploiting implicit domain relations within the data from different domains. For instance, Zhang et al. [20] proposed an adaptation of the matrix factorization model to include a probability distribution that captures inter-domain correlations, and Cao et al. [4] presented a method that learns similarities between item latent factors in different domains as parameters in a Bayesian framework. Aiming to exploit heterogeneous forms of user feedback, Pan et al. [15] proposed an adaptive model in which the latent features learned in the source domain are transferred to the target domain in order to regularize the matrix factorization there. Instead of the more common two-way decomposition of the rating matrix, Li et al. [14] used a nonnegative matrix tri-factorization to extract rating patterns – the so-called codebook – in the source domain. Then, rather than transferring user and item latent factors, the rating patterns are shared in the target domain and used to predict the missing ratings.

Despite the ability of matrix factorization models to discover latent implicit relations, there are some methods that use tags as explicit information to bridge the domains. Shi et al. [17] argued that explicit relations established through common social tags are more effective for such purpose, and used them to compute user-user and item-item cross-domain similarities. In this case, rating matrices from the source and target domains are jointly factorized, but user and item latent factors are restricted so that they are consistent with the tag-based similarities.

Instead of focusing on sharing user or item latent factors, Enrich et al. [6] studied the influence of social tags on rating prediction. More specifically, the authors presented a number of models based on the well-known SVD++ algorithm [11], to incorporate the effect of tag assignments into rating estimations. The underlying hypothesis is that information about how users annotate items in the source domain can be exploited to improve rating prediction in a different target domain, as long as a set of common tags between the domains exists. In all the proposed models, tag factors are added into the latent item vectors, and are then combined with user latent features to compute rating estimations. The difference between these models is the set of tags considered for rating prediction. Two of the proposed models use the tags assigned by the user to a target item, and the other model takes the tags of the whole community into account. We note that the first two models require the active user to tag, but not rate, the item in the target domain. In all the models, the transfer of knowledge is performed through the shared tag factors in a collective way, since these factors are learned jointly for the source and the target domains. The results reported in the movies and books domains confirmed that shared knowledge can be effectively exploited to outperform single-domain rating predictions.
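To make the tag-bridging idea concrete, the following minimal sketch (our own illustration of the general principle behind tag-based cross-domain similarities, not the exact formulation of Shi et al. [17]) compares tag-frequency profiles built in different domains through the tags they share:

```python
import math
from collections import Counter

def tag_cosine(profile_a, profile_b):
    """Cosine similarity between two tag-frequency profiles, where only
    the tags shared by both profiles contribute to the dot product."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[t] * profile_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# a movie-domain user and a book-domain user described by their tag usage
movies_user = Counter({"funny": 3, "romantic": 1, "classic": 2})
books_user = Counter({"romantic": 4, "suspenseful": 2})
print(round(tag_cosine(movies_user, books_user), 3))  # ≈ 0.239
```

Similarities of this kind can then be used, for example, to constrain jointly factorized user and item latent factors, as in the joint factorization described above.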
The model we propose in this paper follows the same line as Enrich et al. [6], in the sense that tags are directly integrated as latent factors into the rating prediction process, as opposed to Shi's and colleagues' approach [17], which estimates the ratings using only user and item factors. The main difference of our model with the approaches presented in [6] is the way in which the rating matrix is factorized. Rather than using a single set of tag factors to extend the item's factorization component, we introduce additional latent variables in the user component to separately capture the effect of the tags utilized by the user and the tags assigned to the item. For this purpose, we adapt the gSVD++ algorithm [10], which extends SVD++ by introducing a set of latent factors to take item metadata into account for rating prediction. In this model, both user and item factors are respectively enhanced with implicit feedback and content information, which allows improving the accuracy of rating predictions.

3. OVERVIEW OF MATRIX FACTORIZATION METHODS

Since the proposed cross-domain recommendation model is built upon a matrix factorization collaborative filtering method, in this section we provide a brief overview of the well-known standard rating matrix factorization technique, and the SVD++ and gSVD++ algorithms, which extend the former by incorporating implicit user feedback and item metadata, respectively.

3.1 MF: Standard rating matrix factorization

Matrix factorization (MF) methods [8][12] are a popular approach to latent factor models in collaborative filtering. In these methods, the rating matrix is decomposed as the product of low-rank matrices of user and item latent features. In its most basic form, a factor vector p_u ∈ ℝ^k is assigned to each user u, and a factor vector q_i ∈ ℝ^k to each item i, so that ratings are estimated as:

    \hat{r}_{ui} = b_{ui} + p_u^T q_i    (1)

where the term b_{ui} is a baseline estimate that captures the deviation of user and item ratings from the average, and is defined as:

    b_{ui} = \mu + b_u + b_i    (2)

The parameter \mu corresponds to the global average rating in the training set, and b_u and b_i are respectively the deviations in the ratings of user u and item i from the average. The baseline estimates can be explicitly defined or learned from the data. In the latter case, the parameters of the model are found by solving the following regularized least squares problem:

    \min_{p_*, q_*, b_*} \sum_{(u,i) \in \mathcal{R}} \left( r_{ui} - \mu - b_u - b_i - p_u^T q_i \right)^2 + \lambda \left( b_u^2 + b_i^2 + \|p_u\|^2 + \|q_i\|^2 \right)    (3)

In this formula, the parameter \lambda controls the amount of regularization to prevent high model variance and overfitting. The minimization can be performed by using gradient descent over the set \mathcal{R} of observed ratings [8]. This method is popularly called SVD, but it is worth noticing that it is not completely equivalent to the singular value decomposition technique, since the rating matrix is usually very sparse and most of its entries are actually not observed.

For simplicity purposes, in the following we omit the baseline estimates. They, nonetheless, can be easily considered by adding the term b_{ui} into the rating estimation formulas.

3.2 SVD++: Adding implicit user feedback to the rating matrix factorization method

The main motivation behind the SVD++ algorithm, proposed by Koren [11][13], is to exploit additional implicit user feedback for rating prediction, since it is arguably a more available and abundant source of user preferences.

In this model, user preferences are represented as a combination of explicit and implicit feedback, searching for a better understanding of the user by looking at what items she rated, purchased or watched. For this purpose, additional latent factors are combined with the user's factors as follows:

    \hat{r}_{ui} = q_i^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)    (4)

In the previous formula, p_u ∈ ℝ^k, q_i ∈ ℝ^k, y_j ∈ ℝ^k represent user, item, and implicit feedback factors, respectively. N(u) is the set of items for which the user provided implicit preference, and k is the number of latent features.

Similarly to the SVD algorithm, the parameters of the model can be estimated by minimizing the regularized squared error loss over the observed training data:

    \min_{p_*, q_*, y_*} \sum_{(u,i) \in \mathcal{R}} \left( r_{ui} - q_i^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right) \right)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 + \sum_{j \in N(u)} \|y_j\|^2 \right)    (5)

Again, the minimization problem can be efficiently solved using stochastic gradient descent.

3.3 gSVD++: Adding item metadata to the rating matrix factorization method

The gSVD++ algorithm [10] further extends SVD++ considering information about the items' attributes in addition to the users' implicit feedback.

The model introduces a new set of latent variables x_g ∈ ℝ^k for metadata that complement the item factors. This idea combined with the SVD++ algorithm leads to the following formula for computing rating predictions:

    \hat{r}_{ui} = \left( q_i + |G(i)|^{-\alpha} \sum_{g \in G(i)} x_g \right)^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)    (6)

The set G(i) contains the attributes related to item i, e.g. comedy and romance in the case of movie genres. The parameter \alpha is set to 1 when the set G(i) ≠ ∅, and 0 otherwise. We note that in the previous formula, both user and item factors are enriched with new uncoupled latent variables that separately capture information about the users and items, leading to a symmetric model with four types of parameters. Again, parameter learning can be performed by minimizing the associated squared error function with gradient descent:

    \min_{p_*, q_*, y_*, x_*} \sum_{(u,i) \in \mathcal{R}} \left( r_{ui} - \left( q_i + |G(i)|^{-\alpha} \sum_{g \in G(i)} x_g \right)^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right) \right)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 + \sum_{j \in N(u)} \|y_j\|^2 + \sum_{g \in G(i)} \|x_g\|^2 \right)    (7)
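As a minimal, self-contained sketch of how the factorization in (1)-(3) can be trained by stochastic gradient descent (baselines omitted, as in the text; the hyperparameter values below are illustrative defaults, not the ones tuned in the experiments):

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=10, lr=0.02, reg=0.01, epochs=200, seed=0):
    """Standard MF trained by SGD: for each observed rating, step against
    the squared error gradient, with L2 regularization on the factors."""
    rng = np.random.default_rng(seed)
    P = rng.normal(0, 0.1, (n_users, k))   # user factors p_u
    Q = rng.normal(0, 0.1, (n_items, k))   # item factors q_i
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - P[u] @ Q[i]            # prediction error
            P[u] += lr * (e * Q[i] - reg * P[u])
            Q[i] += lr * (e * P[u] - reg * Q[i])
    return P, Q

# toy usage: 3 users, 3 items, 4 observed ratings
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 2, 1.0)]
P, Q = train_mf(ratings, 3, 3)
print(P[0] @ Q[0])  # should be close to the observed rating 5.0
```

The same loop structure carries over to SVD++ and gSVD++, with extra update steps for the implicit feedback and metadata factors.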
The use of additional latent factors for item metadata is reported to improve prediction accuracy over SVD++ in [10]. In this paper, we adapt this model to separately learn user and item tag factors, aiming to support the transfer of knowledge between domains.

4. TAG-BASED MODELS FOR CROSS-DOMAIN COLLABORATIVE FILTERING

In this section, we first describe the tag-based cross-domain collaborative filtering models presented in [6], which are an adaptation of the SVD++ algorithm, and next introduce our proposed model, which is built upon the gSVD++ algorithm.

4.1 Adaptation of SVD++ for Tag-based Cross-domain Collaborative Filtering

The main hypothesis behind the models proposed in [6] is that the effect of social tags on ratings can be shared between domains to improve the rating predictions in the target domain. In that work, three different adaptations of the SVD++ algorithm were explored that utilize tags as implicit user feedback to enhance the item factors, as opposed to user factors like in the original model.

The first of the algorithms proposed by Enrich et al. is the UserItemTags model, which only exploits the tags T_{ui} that the active user u assigned to the target item i:

    \hat{r}_{ui} = p_u^T \left( q_i + \frac{1}{|T_{ui}|} \sum_{t \in T_{ui}} z_t \right)    (8)

We note here that if the user has not tagged the item, i.e., T_{ui} = ∅, then the model corresponds to the standard matrix factorization technique. Also, even though the tag factors z_t are only combined with the item factors q_i, the user and item factorization components are not completely uncoupled, since the set T_{ui} still depends on the user u.

An improvement over the model was also presented in [6], based on the observation that not all the tags are equally relevant (i.e. discriminative) to predict the ratings. The proposed alternative is to filter out the tags in the set T_{ui} that are not relevant according to a certain criterion. In that work, the Wilcoxon rank-sum test is performed for each tag to decide if the mean rating significantly changes in the presence/absence of the tag in the dataset. In this model, rating predictions are computed in an analogous manner:

    \hat{r}_{ui} = p_u^T \left( q_i + \frac{1}{|T_{ui}^{rel}|} \sum_{t \in T_{ui}^{rel}} z_t \right)    (9)

Here, the set T_{ui}^{rel} ⊆ T_{ui} only contains those tags for which the p-value of the abovementioned test is < 0.05. This method was called UserItemRelTags.

As noted by the authors, the previous methods are useful when the user has tagged but not rated an item. However, these methods do not greatly improve over the standard matrix factorization technique in the cold-start situations where new users or items are considered. Aiming to address this limitation, a last approach was proposed, the ItemRelTags model:

    \hat{r}_{ui} = p_u^T \left( q_i + \frac{1}{|T_i|} \sum_{t \in T_i} n_{it} z_t \right)    (10)

Now, the set T_i contains all the relevant tags assigned by the whole community to the item i, with possible repetitions. Tags that appear more often contribute with more factors to the prediction, and n_{it} is the number of times tag t was applied to item i. In this case, the normalization factor is |T_i| = \sum_{t \in T_i} n_{it}.

We note that the set T_i does not depend on the user, and that the user and item components of the factorization are fully uncoupled. This has the advantage that tag factors can also be exploited in the rating predictions for new users for whom tagging information is not available yet, improving over the standard matrix factorization method. The ItemRelTags model, however, does not take into account the possibility that the user has tagged different items other than the one for which the rating is being estimated. In such cases, it may be beneficial to enrich the user's profile by considering other tags the user has chosen in the past as evidence of her preferences. In the next subsection, we propose a model that aims to exploit this information to generate more accurate recommendations.

Similarly to the SVD++ algorithm, all of the above models can be trained by minimizing the associated loss function with stochastic gradient descent.

4.2 Adaptation of gSVD++ for Tag-based Cross-domain Collaborative Filtering

Although the previous recommendation models can successfully transfer tagging information between domains, they suffer from some limitations. The UserItemTags and UserItemRelTags models cannot do better than the standard matrix factorization if the user has not tagged the item for which the rating is being estimated, while the ItemRelTags model does not fully exploit the user's preferences expressed in the tags assigned to other items.

In this paper, we propose to adapt the gSVD++ algorithm by introducing an additional set of latent variables w_t ∈ ℝ^k that enrich the user's factors and better capture the effect of her tags in the rating estimation. Specifically, we distinguish between two different sets of tags for users and items, and factorize the rating matrix into fully uncoupled user and item components as follows:

    \hat{r}_{ui} = \left( p_u + \frac{1}{|T_u|} \sum_{t \in T_u} n_{ut} w_t \right)^T \left( q_i + \frac{1}{|T_i|} \sum_{t \in T_i} n_{it} z_t \right)    (11)

The set T_u contains all the tags assigned by user u to any item. Respectively, T_i is the set of tags assigned by any user to item i, and plays the role of the item metadata in the gSVD++ algorithm. As in the ItemRelTags model, there may be repeated tags in each of the above tag sets, which we account for by considering the number of times a tag appears in T_u or T_i, respectively. In (11), n_{ut} is the number of items on which the user u applied tag t, and n_{it} is the number of users that applied tag t to item i. As previously, tag factors are normalized by |T_u| = \sum_{t \in T_u} n_{ut} and |T_i| = \sum_{t \in T_i} n_{it}, so that the factors w_t and z_t do not dominate over the rating factors p_u and q_i for users and items with a large number of tags.

In the proposed model, which we call TagGSVD++, a user's profile is enhanced with the tags she used, since we hypothesize that her interests are better captured, and that transferring this information between domains can be beneficial for estimating ratings in the target domain. Likewise, item profiles are extended with the tags that were applied to them, as in the ItemRelTags model.

The parameters of TagGSVD++ can be learned from the observed training data by solving the following unconstrained minimization problem:
    \min_{p_*, q_*, w_*, z_*} E = \min_{p_*, q_*, w_*, z_*} \frac{1}{2} \sum_{(u,i) \in \mathcal{R}} \left( r_{ui} - \left( p_u + \frac{1}{|T_u|} \sum_{t \in T_u} n_{ut} w_t \right)^T \left( q_i + \frac{1}{|T_i|} \sum_{t \in T_i} n_{it} z_t \right) \right)^2 + \frac{\lambda}{2} \left( \|p_u\|^2 + \|q_i\|^2 + \sum_{t \in T_u} \|w_t\|^2 + \sum_{t \in T_i} \|z_t\|^2 \right)    (12)

The factor 1/2 simplifies the following derivations with no effect on the solution. As in the previous models, a minimum can be found by stochastic gradient descent. For completeness, in the following we list the update rules of TagGSVD++, taking the derivatives of the error function E in (12) with respect to the parameters:

    \frac{\partial E}{\partial p_u} = -e_{ui} \left( q_i + \frac{1}{|T_i|} \sum_{t \in T_i} n_{it} z_t \right) + \lambda p_u

    \frac{\partial E}{\partial q_i} = -e_{ui} \left( p_u + \frac{1}{|T_u|} \sum_{t \in T_u} n_{ut} w_t \right) + \lambda q_i

    \frac{\partial E}{\partial w_t} = -e_{ui} \frac{n_{ut}}{|T_u|} \left( q_i + \frac{1}{|T_i|} \sum_{s \in T_i} n_{is} z_s \right) + \lambda w_t, \quad \forall t \in T_u

    \frac{\partial E}{\partial z_t} = -e_{ui} \frac{n_{it}}{|T_i|} \left( p_u + \frac{1}{|T_u|} \sum_{s \in T_u} n_{us} w_s \right) + \lambda z_t, \quad \forall t \in T_i

where the error term e_{ui} is r_{ui} - \hat{r}_{ui}. In the training phase, we loop over the observed ratings simultaneously updating the parameters according to the following rules:

    p_u \leftarrow p_u + \gamma \left( e_{ui} \left( q_i + \frac{1}{|T_i|} \sum_{t \in T_i} n_{it} z_t \right) - \lambda p_u \right)

    q_i \leftarrow q_i + \gamma \left( e_{ui} \left( p_u + \frac{1}{|T_u|} \sum_{t \in T_u} n_{ut} w_t \right) - \lambda q_i \right)

    w_t \leftarrow w_t + \gamma \left( e_{ui} \frac{n_{ut}}{|T_u|} \left( q_i + \frac{1}{|T_i|} \sum_{s \in T_i} n_{is} z_s \right) - \lambda w_t \right), \quad \forall t \in T_u

    z_t \leftarrow z_t + \gamma \left( e_{ui} \frac{n_{it}}{|T_i|} \left( p_u + \frac{1}{|T_u|} \sum_{s \in T_u} n_{us} w_s \right) - \lambda z_t \right), \quad \forall t \in T_i

The learning rate \gamma determines to what extent the parameters are updated in each iteration. A small learning rate can make the learning slow, whereas with a large learning rate the algorithm may fail to converge. The choice of both the learning rate and the regularization parameter \lambda is discussed later in section 5.3.

5. EXPERIMENTS

We have evaluated the proposed TagGSVD++ algorithm (section 4.2) in a cross-domain collaborative filtering setting, by empirically comparing it with the single-domain matrix factorization methods (section 3) and the state-of-the-art cross-domain recommendation approaches described in section 4.1.

5.1 Dataset

We have attempted to reproduce the cross-domain dataset used in [6], aiming to compare our approach with those presented in that paper. For the sake of completeness, we also describe the data collection process here.

In order to simulate the cross-domain collaborative filtering setting, we have downloaded two publicly available datasets for the movies and books domains. The MovieLens 10M dataset⁶ (ML) contains over 10 million ratings and 100,000 tag assignments by 71,567 users to 10,681 movies. The LibraryThing dataset⁷ (LT) contains over 700,000 ratings and 2 million tag assignments by 7,279 users on 37,232 books. Ratings in both of the datasets are expressed on a 1-5 scale, with interval steps of 0.5.

Since we were interested in analyzing the effect of tags on rating prediction, we only kept ratings in MovieLens on movies for which at least one tag was applied, leaving a total of 24,564 ratings. Also following the setup done by Enrich et al., we considered the same amount of ratings in LibraryThing, and took the first 24,564 ratings. We note, however, that the original dataset contained duplicate rows and inconsistencies, i.e., some user-item pairs had more than one rating. Hence, we preprocessed the dataset removing such repetitions and keeping only the repeated ratings that appeared first in the dataset's file. We also converted the tags to lower case in both datasets. Table 1 shows the characteristics of the final datasets.

Table 1. Details of the datasets used in the experiments after preprocessing.

                                      MovieLens   LibraryThing
  Users                                   2,026            244
  Items                                   5,088         12,801
  Ratings                                24,564         24,564
  Avg. ratings per user                   12.12         100.67
  Rating sparsity                        99.76%         99.21%
  Tags                                    9,529          4,598
  Tag assignments                        44,805         72,943
  Avg. tag assignments per user           22.16         298.95
  Ratio of overlapping (shared) tags     13.81%         28.62%

5.2 Evaluation methodology

As mentioned above, we have compared the performance of the proposed model against the single-domain matrix factorization baselines from section 3, and the state-of-the-art tag-based algorithms described in section 4.1. All these methods are summarized next:

MF. The standard matrix factorization method, trained by stochastic gradient descent over the observed ratings of both the movies and books domains.

SVD++. An adaptation of MF to take implicit data into account. In our experiments, the set N(u) contains all the items rated by user u.

gSVD++. An extension of SVD++ to include item metadata into the factorization process. In our experiments, we have considered as the set G(i) of item attributes the tags assigned to item i by any user. Note that, as tags are content features for both movies and books, this method is suitable for cross-domain recommendation, since knowledge can be transferred through the metadata (tag) factors. This differs from the proposed TagGSVD++ in that users are modeled as in SVD++ by considering rated items as implicit feedback instead of their tags. Also, normalization of the implicit data factors on the user component involves a square root; see equations (6) and (11).

UserItemTags. A method that expands an item i's profile with latent factors of the tags that the target user assigned to i. Its parameters are learned by simultaneously factorizing the rating matrices of both source and target domains.

UserItemRelTags. A variation of the previous method that only takes relevant tags into account, as determined by a Wilcoxon rank-sum test.

⁶ MovieLens datasets, http://grouplens.org/datasets/movielens
⁷ LibraryThing dataset, http://www.macle.nl/tud/LT
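A single TagGSVD++ training step, combining the prediction formula (11) with the update rules of section 4.2, could be sketched as follows (our own simplified illustration with dictionary-based factor storage; the function and variable names are ours, not the authors'):

```python
import numpy as np

def tag_gsvd_predict(p_u, q_i, u_tags, i_tags, W, Z):
    """Prediction of eq. (11). u_tags/i_tags map each tag t to its count
    n_ut / n_it; W/Z map each tag to its latent vector w_t / z_t."""
    user_part = p_u + (sum(n * W[t] for t, n in u_tags.items())
                       / sum(u_tags.values()) if u_tags else 0.0)
    item_part = q_i + (sum(n * Z[t] for t, n in i_tags.items())
                       / sum(i_tags.values()) if i_tags else 0.0)
    return user_part, item_part, float(user_part @ item_part)

def tag_gsvd_sgd_step(r_ui, p_u, q_i, u_tags, i_tags, W, Z, lr=0.01, reg=0.04):
    """One in-place SGD update of all parameters touched by a rating r_ui,
    using pre-update values on the right-hand sides (simultaneous update)."""
    user_part, item_part, r_hat = tag_gsvd_predict(p_u, q_i, u_tags, i_tags, W, Z)
    e = r_ui - r_hat                    # error term e_ui
    norm_u = sum(u_tags.values()) or 1  # |T_u|
    norm_i = sum(i_tags.values()) or 1  # |T_i|
    p_u += lr * (e * item_part - reg * p_u)
    q_i += lr * (e * user_part - reg * q_i)
    for t, n in u_tags.items():
        W[t] += lr * (e * (n / norm_u) * item_part - reg * W[t])
    for t, n in i_tags.items():
        Z[t] += lr * (e * (n / norm_i) * user_part - reg * Z[t])
    return e
```

Looping this step over the observed ratings of both domains, with the tag factor tables W and Z shared across domains, yields the collective training scheme described in the text.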
Figure 1. Data splitting done for cross-validation. Training data consists of source domain ratings and portions of the target domain, marked in dark.

ItemRelTags. Instead of the tags assigned by the user, this method exploits all the relevant tags applied by the whole user community, and is thus able to compute rating predictions even if the user has not tagged the target item.

We evaluated all these recommendation methods in two settings, using MovieLens as source domain and LibraryThing as target domain, and vice-versa. In both cases, we evaluated the methods through 10-fold cross-validation, i.e., we shuffled the target ratings and split them into 10 non-overlapping folds. In each fold, we left out one part, 10% of the ratings, as test set to estimate the performance of the methods. The remaining 90% of the ratings were used as a training set to learn the models, and a validation set to find the optimal values of the models' parameters. Specifically, we randomly chose 80% of these remaining ratings, and combined them with the source domain ratings to build the models. The final 20% was used as the validation set to select the best number of factors k, learning rate \gamma, and regularization \lambda. Figure 1 depicts the split of the data into training, validation, and test sets.

As in [6], we also wanted to investigate how the number of available ratings in the target domain affects the quality of the recommendations. For such purpose, we further split the training data from the target domain into 10 portions to simulate different rating sparsity levels. First, in order to evaluate the performance of the methods in cold-start situations, we used only 10% of the target training ratings, i.e., 0.1 · 0.8 · 0.9 · 24,564 ≈ 1,768 ratings (see Table 1). Then, we incrementally added an additional 10% of the ratings to analyze the behavior of the methods with an increasingly larger amount of observed rating data. In each sparsity level, the full set of source domain ratings was also used to build the models.

Since all the methods are designed for the rating prediction task, we measured their performance as the accuracy of the estimated ratings. Specifically, we computed the Mean Absolute Error (MAE) of each model in the different settings described above:

    MAE = \frac{1}{|\mathcal{R}_{test}|} \sum_{(u,i) \in \mathcal{R}_{test}} \left| r_{ui} - \hat{r}_{ui} \right|

where \mathcal{R}_{test} contains the ratings in the test set we left out for evaluation.

5.3 Results

As previously mentioned, we reserved 20% of the target domain training data in each fold for validating the models and finding the best model parameters, in order not to overestimate the performance of the methods.

For hyperparameter optimization, with each method and sparsity level in the target domain, we performed a grid search on the validation set for the values of the learning rate \gamma, the amount of regularization \lambda, and the number of latent features k. To get an idea of the typical values found for the parameters, Table 2 shows the average best values for each method.

Table 2. Average values of the best parameters found.

                         ML → LT                  LT → ML
                     k      γ       λ         k      γ       λ
  MF                 41     0.020   0.009     43     0.020   0.009
  SVD++              41     0.020   0.007     43     0.020   0.006
  gSVD++             43     0.019   0.001     43     0.020   0.004
  UserItemTags       46     0.019   0.003     46     0.020   0.010
  UserItemRelTags    39     0.017   0.008     41     0.020   0.017
  ItemRelTags        40     0.017   0.001     46     0.020   0.006
  TagGSVD++          40     0.013   0.036     46     0.019   0.045

From the table, we observe that there is not a large difference in the optimal number of factors and learning rates between configurations. In contrast, we note that the amount of regularization needed for the proposed TagGSVD++ method is relatively large, e.g. compare \lambda = 0.036 of TagGSVD++ with \lambda = 0.009 of MF. This may be due to the additional set of latent variables for tags that our model uses; more complex models are able to account for greater variance in the data and tend to overfit more easily, thus requiring more regularization. In order to analyze how the available information in the target domain affects the stability of the model, Figure 2 shows the optimal value for the regularization parameter for different sparsity levels.

Figure 2. Optimal values for the regularization parameter
performance of the methods. using MovieLens as source domain and LibraryThing as
target domain.
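The splitting protocol described above (10 folds; within each fold, the remaining 90% of target ratings split 80/20 into training and validation; the training part further cut into 10 sparsity portions) can be sketched as follows. The function and variable names are illustrative, not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def split_target_ratings(ratings, n_folds=10):
    """Shuffle target-domain ratings and yield (train, validation, test)
    index sets: 10% as test per fold, and the remaining 90% split
    80/20 into training and validation, as in the evaluation protocol."""
    idx = rng.permutation(len(ratings))
    folds = np.array_split(idx, n_folds)
    for k in range(n_folds):
        test = folds[k]
        rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        n_train = int(0.8 * len(rest))
        train, valid = rest[:n_train], rest[n_train:]
        yield train, valid, test

# With the LibraryThing figure quoted in the text (24,564 target ratings),
# one sparsity portion is 0.1 * 0.8 * 0.9 * 24564 ≈ 1,768 ratings.
n_target = 24564
train, valid, test = next(split_target_ratings(np.arange(n_target)))
portions = np.array_split(train, 10)  # 10 incremental sparsity portions
```

In the paper's setup, the source-domain ratings would then be concatenated with the selected target-domain portions before training each model.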
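The validation-based model selection described above is a plain grid search minimizing MAE, the paper's error measure. A minimal sketch follows; the parameter grids and the `fit_predict` callback are illustrative assumptions, not the paper's actual implementation:

```python
import itertools
import numpy as np

def mae(r_true, r_pred):
    """Mean Absolute Error over held-out ratings:
    MAE = (1/|R|) * sum over (u,i) in R of |r_ui - r_hat_ui|."""
    return float(np.mean(np.abs(np.asarray(r_true) - np.asarray(r_pred))))

def grid_search(train, valid, fit_predict,
                factors=(10, 20, 40, 80),
                learning_rates=(0.005, 0.01, 0.02),
                regularizations=(0.001, 0.01, 0.05)):
    """Return the lowest validation MAE and the (factors, lr, reg)
    combination that achieves it. `valid` holds (user, item, rating)
    triples; `fit_predict` trains a model and predicts the valid set."""
    best = None
    for k, lr, reg in itertools.product(factors, learning_rates,
                                        regularizations):
        preds = fit_predict(train, valid, k, lr, reg)
        err = mae([r for _, _, r in valid], preds)
        if best is None or err < best[0]:
            best = (err, (k, lr, reg))
    return best
```

For example, `mae([4, 3, 5], [3.5, 3.0, 4.0])` evaluates to 0.5. The relatively large regularization values selected for TagGSVD++ in Table 2 would show up here as the `reg` component of the winning combination.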
Figure 3. Average prediction error over the 10 folds for different amounts of observed ratings in the target domain. The striped area represents the range of values within two standard deviations from the mean. (a) Results using LibraryThing as source domain and MovieLens as target domain. (b) Results using MovieLens as source domain and LibraryThing as target domain.
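The significance claims in this section compare per-fold errors of two methods with a paired Wilcoxon signed-rank test at the 95% confidence level. With SciPy this could look as follows; the per-fold MAE values are made up for illustration and do not come from the paper:

```python
from scipy.stats import wilcoxon

# Hypothetical per-fold MAE values for two methods over the 10 folds
mae_taggsvdpp = [0.62, 0.60, 0.61, 0.63, 0.59, 0.62, 0.60, 0.61, 0.62, 0.60]
mae_mf        = [0.66, 0.65, 0.64, 0.67, 0.63, 0.66, 0.65, 0.64, 0.66, 0.65]

# Non-parametric paired test on the per-fold differences
stat, p_value = wilcoxon(mae_taggsvdpp, mae_mf)
significant = p_value < 0.05  # 95% confidence level
```

The Wilcoxon test is preferred here over a paired t-test because the per-fold MAE differences need not be normally distributed.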
We note that gSVD++, upon which our model is defined, also introduces additional latent variables and yet requires a lower regularization. We argue that the differences between the gSVD++ and TagGSVD++ regularizations are caused by the sets of variables involved; see equations (6) and (11). In Table 1 we see that, on average, the number of tags applied by a user is much larger than the number of rated items. This results in more variables actually taking part in the rating predictions, and hence in a more complex model that requires more regularization to prevent overfitting.

Once we found the best parameters for each method and sparsity level, we ran the models separately on the test set of each fold. The final performance was estimated as the average MAE over the 10 folds. Figure 3a shows the results obtained using LibraryThing as source domain and MovieLens as target domain. All the differences with respect to the TagGSVD++ algorithm are statistically significant, as determined with a Wilcoxon signed rank test at the 95% confidence level. It can be seen that the proposed TagGSVD++ method consistently outperforms the rest of the methods for all sparsity levels in the target domain, also in the cold-start setting when only 10%-20% of the ratings are available. We also note that cross-domain methods always achieve better accuracy than single-domain MF, although SVD++ effectively exploits implicit feedback and remains competitive until the 50% sparsity level. Then, as the sparsity decreases, cross-domain models provide greater improvements. This indicates that even if plenty of target domain rating data is available, it is still beneficial to transfer knowledge from the source domain.

The results using MovieLens as source domain and LibraryThing as target domain are shown in Figure 3b. As before, the difference in MAE between TagGSVD++ and the rest of the methods is statistically significant, according to the Wilcoxon signed rank test at the 95% confidence level. Again, TagGSVD++ is the best performing method for all rating sparsity levels, followed by the cross-domain methods. We now observe that the values of MAE are in general larger than in the previous case, which seems to indicate that the transfer of knowledge is not as effective in this setting. This observation is in accordance with the results reported in [6], where the authors argue that this may be caused by differences in the ratio of overlapping tags between the domains. Only 13.81% of the tags in MovieLens are shared in LibraryThing (see Table 1), and thus fewer latent tag factors learned in the source domain can be used in the target to compute rating predictions.

Figure 4 shows the average rating prediction error for users with different amounts of observed ratings and tag assignments, using LibraryThing as source domain and MovieLens as target domain. We see that our model achieves the best improvements in cold-start situations, where few ratings and tag assignments are available in the target domain. We also note that the performance degrades for users with more than 20 ratings (respectively, 100 tag assignments), when enough target domain data is available. Nonetheless, in these cases, TagGSVD++ is still able to exploit the learned tag factors to compute more accurate predictions.

6. CONCLUSIONS AND FUTURE WORK
Cross-domain recommendation has potential benefits over traditional recommender systems that focus on a single domain, such as alleviating rating sparsity in a target domain by exploiting data from a related source domain, improving the quality of recommendations in cold-start situations by inferring new user preferences from other domains, and personalizing cross-selling strategies to provide customers with suggestions of items of different types.

Despite these advantages, cross-domain recommendation is a fairly new topic with plenty of research opportunities to explore. One of the major difficulties that arises is how to link or relate the different domains to support the transfer of knowledge. Due to the common heterogeneity of item attributes across domains, collaborative filtering techniques have become more popular than content-based methods. However, recent work [6][17] has concluded that more reliable and meaningful relations can be established between the domains by exploiting certain content information, such as social tags.

In this paper, we have adapted a novel extension of the well-known SVD++ algorithm to separately model the effect of user and item tags on the observed ratings. By introducing a new set of latent variables that represent tags in the user profile, our TagGSVD++ method is able to transfer knowledge from a source domain more effectively, providing accurate rating predictions in
Figure 4. Average rating prediction error for users with different amounts of observed ratings (left) and tag assignments (right), using LibraryThing as source domain and MovieLens as target domain.
the target domain, even in cold-start situations. From our experiments in the movies and books domains, we conclude that exploiting additional tag factors, and decoupling user and item components in the factorization process, improves the transfer of knowledge and the accuracy of the recommendations.

In the future, we plan to further investigate the effect of tags on the quality of recommendations. In particular, we want to study how the recommendation performance depends on the number of shared tags between domains. Increasing the overlap by grouping tags with similar semantics but expressed differently in the domains could favor the transfer of knowledge.

In our experiments we altered the amount of observed rating data in the target domain, but it would also be interesting to evaluate the methods varying the number of available ratings in the source domain. Moreover, we will perform a more exhaustive evaluation with other datasets, including more cross-domain recommendation methods from the state of the art, such as [17].

7. ACKNOWLEDGEMENTS
This work was supported by the Spanish Government (TIN2013-47090-C3-2).

8. REFERENCES
[1] Abel, F., Herder, E., Houben, G.-J., Henze, N., Krause, D. 2013. Cross-system User Modeling and Personalization on the Social Web. User Modeling and User-Adapted Interaction 23(2-3), pp. 169-209.
[2] Adomavicius, G., Tuzhilin, A. 2005. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17, pp. 734-749.
[3] Berkovsky, S., Kuflik, T., Ricci, F. 2008. Mediation of User Models for Enhanced Personalization in Recommender Systems. User Modeling and User-Adapted Interaction 18(3), pp. 245-286.
[4] Cao, B., Liu, N. N., Yang, Q. 2010. Transfer Learning for Collective Link Prediction in Multiple Heterogeneous Domains. In Proceedings of the 27th International Conference on Machine Learning, pp. 159-166.
[5] Cremonesi, P., Tripodi, A., Turrin, R. 2011. Cross-domain Recommender Systems. In Proceedings of the 11th IEEE International Conference on Data Mining Workshops, pp. 496-503.
[6] Enrich, M., Braunhofer, M., Ricci, F. 2013. Cold-Start Management with Cross-Domain Collaborative Filtering and Tags. In Proceedings of the 14th International Conference on E-Commerce and Web Technologies, pp. 101-112.
[7] Fernández-Tobías, I., Cantador, I., Kaminskas, M., Ricci, F. 2012. Cross-domain Recommender Systems: A Survey of the State of the Art. In Proceedings of the 2nd Spanish Conference on Information Retrieval, pp. 187-198.
[8] Funk, S. 2006. Netflix Update: Try This At Home. http://sifter.org/~simon/journal/20061211.html
[9] Gao, S., Luo, H., Chen, D., Li, S., Gallinari, P., Guo, J. 2013. Cross-Domain Recommendation via Cluster-Level Latent Factor Model. In Proceedings of the 17th and 24th European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 161-176.
[10] Garcia Manzato, M. 2013. gSVD++: Supporting Implicit Feedback on Recommender Systems with Metadata Awareness. In Proceedings of the 28th Annual ACM Symposium on Applied Computing, pp. 908-913.
[11] Koren, Y. 2008. Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 426-434.
[12] Koren, Y., Bell, R., Volinsky, C. 2009. Matrix Factorization Techniques for Recommender Systems. IEEE Computer 42(8), pp. 30-37.
[13] Koren, Y., Bell, R. 2011. Advances in Collaborative Filtering. Recommender Systems Handbook, pp. 145-186.
[14] Li, B., Yang, Q., Xue, X. 2009. Can Movies and Books Collaborate? Cross-domain Collaborative Filtering for Sparsity Reduction. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 2052-2057.
[15] Pan, W., Xiang, E. W., Liu, N. N., Yang, Q. 2010. Transfer Learning in Collaborative Filtering for Sparsity Reduction. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, pp. 210-235.
[16] Shapira, B., Rokach, L., Freilikhman, S. 2013. Facebook Single and Cross Domain Data for Recommendation Systems. User Modeling and User-Adapted Interaction 23(2-3), pp. 211-247.
[17] Shi, Y., Larson, M., Hanjalic, A. 2011. Tags as Bridges between Domains: Improving Recommendation with Tag-induced Cross-domain Collaborative Filtering. In Proceedings of the 19th International Conference on User Modeling, Adaptation, and Personalization, pp. 305-316.
[18] Tiroshi, A., Berkovsky, S., Kaafar, M. A., Chen, T., Kuflik, T. 2013. Cross Social Networks Interests Predictions Based on Graph Features. In Proceedings of the 7th ACM Conference on Recommender Systems, pp. 319-322.
[19] Winoto, P., Tang, T. 2008. If You Like the Devil Wears Prada the Book, Will You Also Enjoy the Devil Wears Prada the Movie? A Study of Cross-Domain Recommendations. New Generation Computing 26, pp. 209-225.
[20] Zhang, Y., Cao, B., Yeung, D.-Y. 2010. Multi-Domain Collaborative Filtering. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pp. 725-732.