=Paper= {{Paper |id=Vol-1892/paper5 |storemode=property |title=Matrix Factorization for Package Recommendations |pdfUrl=https://ceur-ws.org/Vol-1892/paper5.pdf |volume=Vol-1892 |authors=Agung Toto Wibowo,Advaith Siddharthan,Chenghua Lin,Judith Masthoff |dblpUrl=https://dblp.org/rec/conf/recsys/WibowoSLM17 }} ==Matrix Factorization for Package Recommendations== https://ceur-ws.org/Vol-1892/paper5.pdf
                 Matrix Factorization for Package Recommendations
                           Agung Toto Wibowo                                                             Advaith Siddharthan
            Computing Science / Informatics Engineering                                                Knowledge Media Institute
             University of Aberdeen / Telkom University                                                    The Open University
                    wibowo.agung@abdn.ac.uk /                                                        advaith.siddharthan@open.ac.uk
                 agungtoto@telkomuniversity.ac.id

                                 Chenghua Lin                                                                Judith Masthoff
                              Computing Science                                                             Computing Science
                            University of Aberdeen                                                        University of Aberdeen
                           chenghua.lin@abdn.ac.uk                                                        j.masthoff@abdn.ac.uk

ComplexRec 2017, Como, Italy.
2017. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. Published on CEUR-WS, Volume 1892.

ABSTRACT
Research in recommendation systems has to date focused on recommending individual items to users. However, there are contexts in which combinations of items need to be recommended, and there has been less research to date on how collaborative methods such as matrix factorization can be applied to such tasks. The research contributions of this paper are threefold. First, we formalize the collaborative package recommendation task as an extension of the standard collaborative recommendation task. Second, we describe and make available a novel package recommendation dataset in the clothes domain, where a combination of a “top” (e.g. a shirt, t-shirt or top) and “bottom” (e.g. trousers, shorts or skirts) needs to be recommended. Finally, we describe several extensions of matrix factorization to predict user ratings on packages, and report RMSE improvements over the standard matrix factorization approach for recommending combinations of tops and bottoms.

KEYWORDS
Package Recommendation, Matrix Factorization, Clothes Domain, Collaborative Filtering

1    INTRODUCTION
Recent research into recommendation systems has focused on methods for Collaborative Filtering (CF) [5, 20] for tasks such as recommending individual or top-N items to users [8] and for making cross-domain recommendations [3, 12, 18].
   There has been less research into package recommendations, where a combination of items needs to be recommended together. Travel is one domain that is mentioned in the literature [6, 13], where a travel package could consist of a set of destinations and is often recommended to a group of users. For example, in a travel planning task, a user (or group) is recommended a package of places of interest (POI) which satisfy some constraints such as budget or time [22, 23]. Such travel recommender systems need to be able to handle constraints, e.g. “no more than 3 museums” or “travel distance is less than 10 km”. Another task is to provide alternatives for restaurants, transportation and hotels as POI [1].
   Outside of travel/tourism there are several other domains, such as food (e.g. recommending a starter and main course), furniture [25] and clothing (e.g. recommending a shirt and trousers), which offer good opportunities for recommending packages.
   In the clothes domain, there are some package recommendation approaches based on image features [7, 17]. These approaches collected images (each image containing both top and bottom) from fashion websites [17] or fashion magazines [7] to create a package reference database. Using image processing techniques, they automatically separated top and bottom. Miura et al. [17] extracted image features (such as RGB histogram and scale-invariant feature transform (SIFT) [14] values) for both top and bottom. To provide package recommendations, they required the user to provide a query (top or bottom) image. This image was then compared with packages in the reference database, and the closest package reference was returned as a recommendation. Similar to Miura’s work, Iwata et al. [7] extracted visual features (such as colour, texture and SIFT as a bag-of-features), and derived a topic model over these using Latent Dirichlet Allocation (LDA). When a user provided a query image (top/bottom), Iwata et al. recommended the other part by searching the topic model in their package reference database.
   Shen et al. [19] developed a clothes package recommendation system based on user context. First, they stored clothing items and combinations of items in a user wardrobe database. They also annotated its contents using English words. To generate recommendations, their system asked the user about their goals (“destinations” and “want to look like”) and mapped them to possible characteristics of clothes in the user wardrobe.
   With respect to recommendations, all these methods have the following drawbacks: (a) they work from a fixed reference database, with no flexibility for recommending combinations not in the database; (b) the recommendations provided from the database are not tailored to user preferences (though Shen et al. allow the user to specify some aspects of the style); and (c) the methods are highly tailored to the clothes domain and cannot be readily applied to package recommendations in other domains.
   To overcome these drawbacks, we formalize package recommendations as a collaborative filtering task and argue that collaborative package recommendation is an interesting task for three reasons. First, collaborative package recommendation is more challenging than item recommendation since people might dislike a package, even if they like the individual items. Such preferences can reflect
ComplexRec 2017, August 31, 2017, Como, Italy.                                                                                                                    Wibowo et al.

individual taste and style, and recommendations therefore need to be personalized to users. Second, package recommendations face greater data sparsity issues compared to the collaborative item recommendation task. The number of possible combinations is large and for the same number of user ratings, the package recommendation matrix is much sparser than for item recommendation. Third, unlike the previous work described above, we can easily extend our package recommendation approach to other domains (such as food) by formulating package recommendation as collaborative filtering.
   The remainder of this paper is organized as follows. Section 2 defines the package recommendation task and the notation used, describes how the dataset was generated, and formulates several Matrix Factorization approaches for package recommendation. Section 3 details our experimental settings, Section 4 reports our experimental results, and Section 5 provides a discussion and suggests directions for future work.

2 PACKAGE RECOMMENDATIONS

2.1 Definition
The traditional collaborative filtering (CF) [5, 20] task is defined as predicting the ratings given by users to items, based on a set of previous ratings by any user to any item. In this paper, we introduce a collaborative filtering task for package recommendations, where we need to predict ratings given by a user to combinations of items, based on a set of previous ratings by any user to any item or combination of items.
   In this paper, we discuss package recommendation for the clothes domain. Consider a set of clothes $I^a = \{i_1^t, i_2^t, \ldots, i_o^t, i_1^b, i_2^b, \ldots, i_p^b\}$, consisting of two disjoint complementary sets: a set of $o$ top items $I^t = \{i_1^t, i_2^t, \ldots, i_o^t\}$ and a set of $p$ bottom items $I^b = \{i_1^b, i_2^b, \ldots, i_p^b\}$, where $I^t \cup I^b = I^a$; $o + p = n$. Some of these items and their combinations (a package) have received ratings from one or more of $m$ possible users $U = \{u_1, u_2, \ldots, u_m\}$. In our notation, individual ratings are denoted as a triplet $(u, i, r_{u,i})$, where $u \in U$, $i \in I^a$ and $r_{u,i}$ is the rating given by user $u$ to item $i$. Package ratings are denoted as a quadruple $(u, i^t, i^b, r_{u,(i^t,i^b)})$, where $u \in U$, $i^t \in I^t$, $i^b \in I^b$, and $r_{u,(i^t,i^b)}$ is the rating provided by user $u$ to the package $(i^t, i^b)$. Our task is to predict the unobserved package ratings for a user from an observed set of ratings for individual items and packages by this and other users. This definition is easily extended to other domains and to tasks which might involve more than two items within a package.

                       Table 1: Mathematical Notations

  Notation                                  Description
  $m$                                       number of users
  $o$                                       number of “top” items
  $p$                                       number of “bottom” items
  $u_x$                                     a user $u$ with id $x$
  $i_y^c$                                   an item $i$ with id $y$ from category $c$
  $(i_x^t, i_y^b)$                          a package of items consisting of a top $i_x^t$ and a bottom $i_y^b$
  $U = \{u_1, u_2, \cdots, u_m\}$           a set of users
  $I^t = \{i_1^t, i_2^t, \cdots, i_o^t\}$   a set of “top” items
  $I^b = \{i_1^b, i_2^b, \cdots, i_p^b\}$   a set of “bottom” items
  $I^a = I^t \cup I^b$                      a set of all individual items
  $(u, i, r_{u,i})$                         individual rating in triplet format from user $u$ to item $i$
  $\hat{r}_{u,i}$                           predicted rating from user $u$ to item $i$
  $(u, i^t, i^b, r_{u,(i^t,i^b)})$          a package rating in quadruple format from user $u$ to a combination of $(i^t, i^b)$
  $V^t$                                     matrix of individual “top” ratings, where |rows| = $m$ (no. of users); |columns| = $o$ (no. of tops)
  $V^b$                                     matrix of individual “bottom” ratings, where |rows| = $m$; |columns| = $p$ (no. of bottoms)
  $V^a$                                     matrix of individual ratings for top and bottom, where |rows| = $m$; |columns| = $o + p$
  $V^p$                                     matrix of package ratings, where |rows| = $m$; |columns| = $o \times p$

2.2       Dataset Generation
A bottleneck to research on package recommendations is the lack of open datasets suited for this task. To overcome this, we generated a dataset by randomly selecting 1,400 “top” and 600 “bottom” images from Amazon product data [15, 16] and obtaining 30 ratings each from 200 participants recruited from Amazon Mechanical Turk for individual tops and bottoms and packages combining them.
   For each participant, we first asked them whether they wear clothes for men or women, and then provided 30 screens where each screen showed images of one top and one bottom filtered for their chosen gender preference. We also asked participants to rate on a scale of 1 to 5 how much:
      (1) they would like to wear the top,
      (2) they would like to wear the trousers,
      (3) they would like to wear the top and trousers together.
   An example can be seen in Figure 1. From our participants, we obtained 12,000 individual ratings and 6,000 package ratings. We have made this dataset freely downloadable from the PackageRecDataset GitHub repository.¹
   The distribution of ratings for our data set is shown in Table 2. Note that the percentage of highly rated packages is much lower than that of either tops or bottoms, which makes package recommendation a more challenging task.

¹ https://github.com/atwRecsys/PackageRecDataset

2.3       Matrix Factorization Methods
   2.3.1 Matrix Factorization for Item Recommendations. In collaborative filtering, there are many approaches to provide rating predictions. Some approaches calculate similarity between users or items [5, 20], while other approaches use matrix factorization techniques to decompose the rating matrix into two (or more) matrices. The first winner [9] of the Netflix prize reported that matrix


factorization has many benefits for overcoming common problems in recommender systems such as data sparsity and cold start [24].

         Figure 1: Example of Clothes Questionnaire

                  Table 2: Rating Distributions

                            Rating Frequency
                     1       2       3       4       5
       Tops      1,710   1,080   1,169   1,167     874
       Bottoms   1,574     958   1,185   1,282   1,001
       Packages  2,564   1,216   1,060     760     400

   Matrix factorization (MF) can be defined as producing two factor matrices, say $W = [w_{ij}] \in \mathbb{R}^{m \times k}$ and $H = [h_{ij}] \in \mathbb{R}^{k \times n}$, from one known matrix $V = [v_{ij}] \in \mathbb{R}^{m \times n}$, so the product of $W$ and $H$ is (approximately) equal to $V$:

                                $V \approx WH$,                               (1)

where each cell is computed as:

                $\hat{v}_{xy} \approx \sum_{i=1}^{k} w_{xi} h_{iy}$                   (2)

   There are many algorithms for MF, such as Multiplicative [10, 11], Gradient descent [2, 4], Alternating Least Square [2, 4], and more. These algorithms aim to minimize the difference between the known values in matrix $V$ and the corresponding values in its multiplicative form $WH$ (the cost function) through an iterative process. When the factors $W$ and $H$ are computed in this manner, it has been found that the product $WH$ provides values for missing cells in $V$, and that these turn out to be good estimates of these missing ratings.

   2.3.2 Extending MF for Package Recommendations. From the definitions in Table 1, there are four different matrices that can be input to matrix factorization methods ($V^t$, $V^b$, $V^a$, $V^p$). In this paper, we utilized these inputs in seven different scenarios:
     (1) In our first scenario (Average Predictor), we used the average value of each package in $V^p$ (the matrix of user [$m$] and packages [$o \times p$]) as prediction. We used this scenario as a baseline.
     (2) In our second scenario (MF-Package), we used $V^p$ as input and ran MF over this package rating matrix. Using this scenario, we obtained two latent matrices $W^p$ and $H^p$, which when multiplied together provided ratings for missing cells. This is our second baseline.
     (3) In our third scenario (MF-Pseudo), to address the matrix sparsity issue in the baseline above, we first populated $V^p$ by adding some pseudo-ratings ($r'_{u,(i^t,i^b)}$) into $V'^p$, before then applying MF to the matrix. Starting from a rating by a user for a package, we identified similar packages involving a new item (either top or bottom) where the cosine similarity between the new and known item reached a specified threshold $\theta$.
         Consider a known package rating $(u, i_x^t, i_y^b, r_{u,(i_x^t,i_y^b)})$. For each top item $i_z^t$ in the matrix, we added a package pseudo-rating $r'_{u,(i_z^t,i_y^b)}$, where $r'_{u,(i_z^t,i_y^b)} = r_{u,(i_x^t,i_y^b)}$ if $\text{cossim}(i_x^t, i_z^t) \geq \theta$. Likewise, for each bottom item $i_s^b$ we added a package pseudo-rating $(u, i_x^t, i_s^b, r'_{u,(i_x^t,i_s^b)})$, where $r'_{u,(i_x^t,i_s^b)} = r_{u,(i_x^t,i_y^b)}$ if $\text{cossim}(i_y^b, i_s^b) \geq \theta$. After we added these pseudo-ratings, we ran MF and obtained $W'^p$ and $H'^p$. The package rating predictions are generated by multiplying these matrices.
   (4,5) In our fourth and fifth scenarios (MF-Min-Cat and MF-Mul-Cat), we ran MF individually over the user–top ($V^t$) and user–bottom ($V^b$) matrices. From $V^t$ we obtained $W^t$ and $H^t$, and from $V^b$ we obtained $W^b$ and $H^b$. The package rating predictions $\hat{r}_{u,(i^t,i^b)}$ were obtained in two ways: (a) MF-Min-Cat predicted package ratings using the minimum value of $\hat{r}_{u,i^t}$ and $\hat{r}_{u,i^b}$; (b) MF-Mul-Cat predicted package ratings using the harmonic mean of $\hat{r}_{u,i^t}$ and $\hat{r}_{u,i^b}$ (Equation 3).

                $\text{harmonic mean}(a, b) = \frac{a \times b}{\frac{1}{2}(a + b)}$                   (3)

   (6,7) In our sixth and seventh scenarios (MF-Min-All and MF-Mul-All), we ran MF over the matrix of all individual ratings ($V^a$). From this process, we obtained $W^a$ and $H^a$. The package rating predictions $\hat{r}_{u,(i^t,i^b)}$ were obtained in two ways: (a) MF-Min-All predicted package ratings using the minimum value of $\hat{r}_{u,i^t}$ and $\hat{r}_{u,i^b}$; (b) MF-Mul-All predicted ratings using the harmonic mean of $\hat{r}_{u,i^t}$ and $\hat{r}_{u,i^b}$ (Equation 3).
   To summarise, scenario 1 applies the Average Predictor baseline over $V^p$, which takes the average rating for each package; scenario 2 applies MF over the user–package matrix; scenario 3 applies MF over the user–package matrix after adding pseudo-ratings; and scenarios 4–7 apply MF to the user–item matrices and combine predicted ratings of items in the package using either a minimum or a harmonic mean function.

3 EXPERIMENTAL SETTINGS

3.1 Crossvalidation Method
Given $r_{u,i^t}$ as a “top” rating, $r_{u,i^b}$ as a “bottom” rating and $r_{u,(i^t,i^b)}$ as a “package” rating, there are six possibilities:


      (1) We do not know $r_{u,(i^t,i^b)}$, but we know $r_{u,i^t}$ or $r_{u,i^b}$;
      (2) We do not know $r_{u,(i^t,i^b)}$, but we know $r_{u,i^t}$ and $r_{u,i^b}$;
      (3) We know $r_{u,(i^t,i^b)}$, but we only know one of $r_{u,i^t}$ and $r_{u,i^b}$;
      (4) We know $r_{u,(i^t,i^b)}$, but do not know either $r_{u,i^t}$ or $r_{u,i^b}$;
      (5) We know $r_{u,i^t}$, $r_{u,i^b}$, and $r_{u,(i^t,i^b)}$.
      (6) We do not know $r_{u,i^t}$, $r_{u,i^b}$, or $r_{u,(i^t,i^b)}$.

              Table 3: Average Testing set RMSE Performance

                                     Average Package Rating RMSE
   Scenario                   All       1       2       3       4       5
   Average Predictor        1.234   1.146   0.200   0.770   1.715   2.670
   MF-Package (Base)        1.435   1.396   0.949   0.974   1.607   2.372
   MF-Pseudo (θ = 0.5)      1.296   1.137   0.825   1.060   1.755   2.429
   MF-Pseudo (θ = 0.7)      1.330   1.166   0.841   1.076   1.762   2.481
   MF-Pseudo (θ = 0.9)      1.395   1.319   0.922   1.009   1.637   2.403
   MF-Min-Cat               1.154   1.242   0.715   0.734   1.338   1.893
   MF-Mul-Cat               1.218   1.485   0.891   0.601   1.067   1.591
   MF-Min-All               1.166   1.327   0.776   0.686   1.233   1.766
   MF-Mul-All               1.237   1.537   0.940   0.599   1.013   1.515

Our dataset collected ratings for “top”, “bottom” and “package”
together; thus for any user only (5–6) are possible. However (1–4)
are realistic scenarios for a package recommendation system. To
cover all these possibilities, we adopted the following methodology.
First, we used 4-fold crossvalidation by randomly splitting the
individual ratings into four parts. We rotated and used 3 parts as the
training set and one for testing. Then in each fold we used only 25%
of package ratings $r_{u,(i^t,i^b)}$ as the training set, and the remaining          The pseudo-ratings approach, which reduces sparsity in the package
75% of package ratings $r_{u,(i^t,i^b)}$ as the test set. These mechanisms           matrix, and the minimum function for combining item ratings per-
for holding back data to make package predictions ensure that all              formed better at predicting low ratings. MF-Pseudo and MF-Min-Cat
possibilities are covered in a realistic manner.                               outperform other scenarios for low ratings (marked as green cells
                                                                               in the “1” and “2” columns). MF-Pseudo increases matrix density
3.2     Experimental Settings                                                  by populating the package rating matrix with some pseudo-ratings
We used matrix factorization with gradient descent [21], with 100              based on similarity to known packages. Since low package ratings
iterations. In this experiment we varied k, the number of latent               are frequent, MF-Pseudo might get a stronger signal to predict low
dimensions in MF (k = 5, 10, 15, 20). We also varied the threshold             ratings rather than higher ones.
value for similarity when adding pseudo-ratings in MF-Pseudo                      For highly rated packages, on the other hand, the multiplica-
using values (θ = 0.5, 0.7, 0.9).                                              tive methods for combining individual ratings performed better.
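As a rough illustration of the factorization step used in Section 3.2, the following Python sketch fits W and H to the known cells of a rating matrix by gradient descent on the squared error and then reads predictions off Equation 2. It is only a sketch under stated assumptions: the names `factorize`, `predict` and `rmse`, the per-rating (stochastic) update order, the learning rate `lr`, the regularisation weight `reg`, the random initialisation and the toy ratings are all chosen for this example and are not taken from the paper, which specifies only gradient descent [21], 100 iterations and k = 5, 10, 15, 20.

```python
import math
import random


def predict(W, H, x, y):
    """Equation 2: the estimate for cell (x, y) is sum_i w_xi * h_iy."""
    return sum(W[x][i] * H[i][y] for i in range(len(H)))


def factorize(ratings, m, n, k=5, iterations=100, lr=0.01, reg=0.02, seed=0):
    """Fit W (m x k) and H (k x n) to the observed cells of an m x n matrix.

    ratings maps (row, column) -> known rating; missing cells are simply absent.
    lr, reg and the initialisation range are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(m)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        for (x, y), v in ratings.items():
            err = v - predict(W, H, x, y)  # error on one known cell
            for i in range(k):
                w, h = W[x][i], H[i][y]
                W[x][i] += lr * (err * h - reg * w)
                H[i][y] += lr * (err * w - reg * h)
    return W, H


def rmse(ratings, W, H):
    """Root mean squared error over the known cells."""
    squared = [(v - predict(W, H, x, y)) ** 2 for (x, y), v in ratings.items()]
    return math.sqrt(sum(squared) / len(squared))


# Toy 4-user x 5-item matrix on a 1-5 scale (made-up values, not the dataset).
ratings = {
    (0, 0): 5, (0, 1): 4, (0, 3): 1, (0, 4): 2,
    (1, 0): 4, (1, 2): 5, (1, 3): 1, (1, 4): 1,
    (2, 0): 1, (2, 1): 2, (2, 2): 1, (2, 4): 5,
    (3, 1): 1, (3, 2): 2, (3, 3): 5, (3, 4): 4,
}

W0, H0 = factorize(ratings, m=4, n=5, k=2, iterations=0)  # untrained factors
W, H = factorize(ratings, m=4, n=5, k=2, iterations=100)
before, after = rmse(ratings, W0, H0), rmse(ratings, W, H)
missing_cell_estimate = predict(W, H, 0, 2)  # user 0 never rated item 2
print(before, after, missing_cell_estimate)
```

Varying `k` here corresponds to varying the number of latent dimensions; the same routine could be applied to any of the input matrices of Table 1, with the caveat that the hyperparameters above would need tuning on real data.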
   We report the average RMSE performance over the test sets in                MF-Mul-All outperformed other approaches for the high ratings
each fold, as defined in Section 3.1. In addition, we also report              (marked as green cells in the “3”, “4” and “5” columns). This is
the average RMSE performance for each known rating. As we can                  not unexpected, as when we combine two ratings for “top” and
see in Table 2, users tended to give low ratings for package rec-              “bottom”, the harmonic mean (MF-Mul-All) will by definition give an
ommendations, and such low-rated packages dominate the dataset.                estimate at least as high as the minimum (MF-Min-All).
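The two combination rules and the per-rating RMSE breakdown can be sketched together in a few lines of Python. The function names (`combine_min`, `combine_harmonic`, `rmse_by_rating`) and the four test cases are hypothetical and invented purely for illustration; they are not the paper's code or data. The harmonic mean follows Equation 3 and assumes positive predicted ratings.

```python
import math
from collections import defaultdict


def combine_min(r_top, r_bottom):
    """Package estimate as the minimum of the two item predictions."""
    return min(r_top, r_bottom)


def combine_harmonic(r_top, r_bottom):
    """Equation 3: (a * b) / (0.5 * (a + b)); assumes positive inputs."""
    return (r_top * r_bottom) / (0.5 * (r_top + r_bottom))


def rmse_by_rating(pairs):
    """pairs: iterable of (known package rating, predicted package rating).

    Returns a dict with the overall RMSE under "All" and one RMSE per
    known rating value, mirroring the column layout of Table 3.
    """
    buckets = defaultdict(list)
    for known, predicted in pairs:
        squared = (known - predicted) ** 2
        buckets["All"].append(squared)
        buckets[known].append(squared)
    return {key: math.sqrt(sum(v) / len(v)) for key, v in buckets.items()}


# Made-up cases: (known package rating, predicted top, predicted bottom).
cases = [(1, 2.0, 1.0), (1, 3.0, 2.0), (5, 4.0, 5.0), (5, 5.0, 5.0)]
by_min = rmse_by_rating((r, combine_min(t, b)) for r, t, b in cases)
by_harmonic = rmse_by_rating((r, combine_harmonic(t, b)) for r, t, b in cases)

# On the 1-5 grid the harmonic mean never falls below the minimum.
grid = [1.0, 2.0, 3.0, 4.0, 5.0]
never_below_min = all(
    combine_harmonic(a, b) >= combine_min(a, b) for a in grid for b in grid
)
print(by_min, by_harmonic, never_below_min)
```

On these invented cases the minimum has the lower error for packages rated 1 and the harmonic mean has the lower error for packages rated 5, simply because the harmonic mean always sits at or above the minimum; the example says nothing about the actual dataset.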
However, for a recommendation task, we are primarily concerned
with accurately predicting the highly rated packages. Table 3                  5     CONCLUSIONS AND FUTURE WORK
thus allows for comparison of algorithms on the more realistic task.           We have defined a new task of collaborative package recommen-
   In real world situations, we are usually interested in providing            dation and made available the first public dataset for this task. We
top-N packages to users. This sort of evaluation is unfortunately not          have also suggested several adaptations of the standard matrix fac-
possible for our dataset, as Mechanical Turk workers were given random         torization approach to item recommendation. All the adaptations
combinations to rate, and were not allowed to choose items or                  outperform the standard MF baseline, and different adaptations
packages they liked. Though out of scope for this paper, we would              demonstrated strengths in different situations.
in the future like to evaluate package recommendations using a                    Our work can be extended in a couple of ways. One is take into
rank performance metric in a real domain.                                      account item attributes (such as colour, dresscode, etc.), and user
                                                                               attributes (such as gender and age, etc) within a tensor factoriza-
4     RESULTS                                                                  tion framework. We would also like to extend our clothes package
Table 3 shows the average RMSE for the testing set for different               recommendations by adding other categories (such as accessories)
scenarios. The “All” column represent the overall RMSE, while the 1,           and also investigate the package recommendation methods in other
2, 3, 4, and 5 columns represent the average RMSE over the known               domains, such as food, where we might additionally consider con-
package ratings. For example, column “1” represent average RMSE                straints such as allergens and nutrition.
to ru,(i t ,i b ) = 1. The yellow cells in this table show our baseline
RMSE over the package testing set.                                             ACKNOWLEDGMENTS
   All our adaptations outperform the MF-Package baseline (lower               We would like to thank Lembaga Pengelola Dana Pendidikan (LPDP),
RMSE values in the “All” column), and many of them outperform                  Departemen Keuangan Indonesia for awarding a scholarship to sup-
the Average Predictor. MF-Min-Cat has the best overall performance             port the studies of the lead author.
(the green cell in the “All” column). In our dataset, people overall
gave lower rating for packages than for individual items. Estimating           REFERENCES
the package rating as the minimum of the individual item rating                 [1] Sihem Amer-Yahia, Francesco Bonchi, Carlos Castillo, Esteban Feuerstein, Isabel
                                                                                    Mendez-Diaz, and Paula Zabala. 2014. Composite retrieval of diverse and com-
predictions therefore gives better results overall, but increased                   plementary bundles. IEEE Transactions on Knowledge and Data Engineering 26,
errors for highly rated packages that we would want to recommend.                   11 (2014), 2662–2675.
Matrix Factorization for Package Recommendations                                                                  ComplexRec 2017, August 31, 2017, Como, Italy.


[2] Michael W Berry, Murray Browne, Amy N Langville, V Paul Pauca, and Robert J Plemmons. 2007. Algorithms and applications for approximate nonnegative matrix factorization. Computational statistics & data analysis 52, 1 (2007), 155–173.
[3] Iván Cantador, Ignacio Fernández-Tobías, Shlomo Berkovsky, and Paolo Cremonesi. 2015. Cross-domain recommender systems. In Recommender Systems Handbook. Springer, 919–959.
[4] Moody Chu, Fasma Diele, Robert Plemmons, and Stefania Ragni. 2004. Optimality, computation, and interpretation of nonnegative matrix factorizations. In SIAM Journal on Matrix Analysis. Citeseer.
[5] Michael D Ekstrand, John T Riedl, and Joseph A Konstan. 2011. Collaborative filtering recommender systems. Foundations and Trends in Human-Computer Interaction 4, 2 (2011), 81–173.
[6] A Felfernig, S Gordea, D Jannach, E Teppan, and M Zanker. 2007. A short survey of recommendation technologies in travel and tourism. OEGAI journal 25, 7 (2007), 17–22.
[7] Tomoharu Iwata, Shinji Watanabe, and Hiroshi Sawada. 2011. Fashion coordinates recommender system using photographs from fashion magazines. In IJCAI, Vol. 22. Citeseer, 2262.
[8] George Karypis. 2001. Evaluation of item-based top-n recommendation algorithms. In Proceedings of the tenth international conference on Information and knowledge management. ACM, 247–254.
[9] Yehuda Koren. 2009. The Bellkor Solution to the Netflix Grand Prize. Netflix prize documentation 81 (2009).
[10] D. D. Lee and H. S. Seung. 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 6755 (1999), 788–791.
[11] D. D. Lee and H. S. Seung. 2001. Algorithms for non-negative matrix factorization.
[12] Bin Li, Qiang Yang, and Xiangyang Xue. 2009. Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction. In IJCAI, Vol. 9. 2052–2057.
[13] Qi Liu, Enhong Chen, Hui Xiong, Yong Ge, Zhongmou Li, and Xiang Wu. 2014. A cocktail approach for travel package recommendation. IEEE Transactions on Knowledge and Data Engineering 26, 2 (2014), 278–293.
[14] David G Lowe. 1999. Object recognition from local scale-invariant features. In Computer vision, 1999. The proceedings of the seventh IEEE international conference on, Vol. 2. IEEE, 1150–1157.
[15] Julian McAuley, Rahul Pandey, and Jure Leskovec. 2015. Inferring networks of substitutable and complementary products. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 785–794.
[16] Julian McAuley, Christopher Targett, Qinfeng Shi, and Anton Van Den Hengel. 2015. Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 43–52.
[17] Shinya Miura, Toshihiko Yamasaki, and Kiyoharu Aizawa. 2013. SNAPPER: fashion coordinate image retrieval system. In Signal-Image Technology & Internet-Based Systems (SITIS), 2013 International Conference on. IEEE, 784–789.
[18] Shaghayegh Sahebi and Peter Brusilovsky. 2013. Cross-domain collaborative recommendation in a cold-start context: The impact of user profile size on the quality of recommendation. In International Conference on User Modeling, Adaptation, and Personalization. Springer, 289–295.
[19] Edward Shen, Henry Lieberman, and Francis Lam. 2007. What am I gonna wear?: scenario-oriented recommendation. In Proceedings of the 12th international conference on Intelligent user interfaces. ACM, 365–368.
[20] Xiaoyuan Su and Taghi M Khoshgoftaar. 2009. A survey of collaborative filtering techniques. Advances in artificial intelligence 2009 (2009), 4.
[21] Gábor Takács, István Pilászy, Bottyán Németh, and Domonkos Tikk. 2008. Matrix Factorization and Neighbor Based Algorithms for the Netflix Prize Problem. In Proceedings of the 2008 ACM Conference on Recommender Systems (RecSys ’08). ACM, New York, NY, USA, 267–274. DOI:https://doi.org/10.1145/1454008.1454049
[22] Min Xie, Laks VS Lakshmanan, and Peter T Wood. 2010. Breaking out of the box of recommendations: from items to packages. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 151–158.
[23] Min Xie, Laks VS Lakshmanan, and Peter T Wood. 2011. Comprec-trip: A composite recommendation system for travel planning. In Data Engineering (ICDE), 2011 IEEE 27th International Conference on. IEEE, 1352–1355.
[24] Ke Zhou, Shuang-Hong Yang, and Hongyuan Zha. 2011. Functional matrix factorizations for cold-start recommendation. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 315–324.
[25] Tao Zhu, Patrick Harrington, Junjun Li, and Lei Tang. 2014. Bundle recommendation in ecommerce. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval. ACM, 657–666.
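
As an illustration of the Section 4 analysis, the following minimal Python sketch (not the authors' implementation) shows how a per-rating RMSE breakdown of the kind reported in Table 3 might be computed, and how combining "top" and "bottom" predictions by minimum differs from combining them by harmonic mean. The function names and the sample ratings are hypothetical, and the harmonic mean 2ab/(a+b) is assumed from the description of MF-Mul-All in Section 4; the numbers are invented for illustration and are not results from the paper.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def rmse_by_rating(predicted, actual):
    """Return the overall RMSE ("All") plus one RMSE per known rating value."""
    pairs = list(zip(predicted, actual))
    groups = defaultdict(list)
    for p, a in pairs:
        groups[a].append((p, a))
    report = {"All": rmse(pairs)}
    for rating in sorted(groups):
        report[rating] = rmse(groups[rating])
    return report


def combine_min(top, bottom):
    """Package estimate as the minimum of the two item predictions."""
    return min(top, bottom)


def combine_harmonic(top, bottom):
    """Package estimate as the harmonic mean (assumed form: 2ab / (a + b))."""
    return 2 * top * bottom / (top + bottom)


# Hypothetical predicted item ratings (top, bottom) and known package ratings.
item_predictions = [(4.0, 2.0), (3.0, 1.0), (5.0, 4.0), (2.0, 1.0)]
known_ratings = [2, 1, 5, 1]

min_report = rmse_by_rating(
    [combine_min(t, b) for t, b in item_predictions], known_ratings
)
harmonic_report = rmse_by_rating(
    [combine_harmonic(t, b) for t, b in item_predictions], known_ratings
)

for name, report in (("min", min_report), ("harmonic", harmonic_report)):
    print(name, {key: round(value, 3) for key, value in report.items()})
```

Because the harmonic mean of two positive ratings is never below their minimum, the harmonic combination can only raise the package estimate, which is consistent with the observation in Section 4 that it helps on highly rated packages at the cost of accuracy on low-rated ones.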