Style Recommendation for Fashion Items using Heterogeneous Information Network

Hanbit Lee
School of Computer Science and Engineering
Seoul National University
Seoul, South Korea
skcheon@europa.snu.ac.kr

Sang-goo Lee
School of Computer Science and Engineering
Seoul National University
Seoul, South Korea
sglee@europa.snu.ac.kr

ABSTRACT
In the midst of vast amounts of available fashion items, consumers today require more efficient recommendation services. A system that sorts out items that form a stylish ensemble with already selected or possessed items would provide them with greater convenience. In this paper, we propose a fashion item recommendation method that learns how fashion items are matched from a large ensemble database. We empirically show that the proposed method can explain the factors that affect item matching and recommend the most suitable items for a given set of items.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval

Keywords
Style recommendation, Clothing ensemble recommendation, Heterogeneous information network

Copyright is held by the author(s). RecSys 2015 Poster Proceedings, September 16-20, 2015, Vienna, Austria.

1. INTRODUCTION
Today, as massive amounts of fashion items are available in both online and offline markets, the need for efficient recommendation services has grown significantly. One of the most important factors in recommending a fashion item is how well the item combines with a set of other items to form a stylish ensemble. A number of works have addressed fashion item matching using web-scraped outfit combination datasets from sites such as Pinterest. However, they are mostly based on color matching and are not flexible enough to exploit other relevant features [1, 2].

In this paper, we propose a fashion item recommendation method that learns from a large ensemble database. The items, their attributes, and the ensembles are modeled as a heterogeneous information network that allows for flexible semantic analysis. We define meta-paths on the network as patterns of relationships between items with respect to attributes and ensembles. The relative importance of each meta-path in matching items is learned from the ensemble database. We show through experiments that our proposed method outperforms baseline algorithms.

2. DATA COLLECTION
We have collected 18,449 fashion items and 7,458 ensembles from an online shopping mall. Each ensemble contains about 2.5 items on average. The ensembles, which are presented by professional fashion coordinators of the shopping mall, consist of clothes, shoes, and fashion accessories as shown in Fig 1.

Figure 1: An example ensemble of fashion items

We extracted and refined 4 attributes (category, material, pattern, and color) from item descriptions and item images. Table 1 shows the value set of each attribute. Weighted multi-color vectors are extracted from the item images using a color extraction tool. The color vectors are then grouped into 3000 clusters using k-means clustering.

Table 1: The 4 attributes and the value set of each

Attribute   Value Set
Category    Jacket, Suit, Coat, Shirts, T-Shirts, Sweater, Cardigan, Vest, Jeans, Slacks, Cargo, Baggy
Pattern     Striped, Checkered, Twisted, Printed, Dotted, Floral, Camouflage, Paisley, Herringbone
Material    Cotton, Leather, Denim, Wool, Linen, Suede, Corduroy, Fur, Spandex
Color       3000 color clusters

3. LEARNING PATH WEIGHTS
Fig 2 shows the network schema for the fashion item ensemble dataset. There are 6 types of nodes, namely, item, category, pattern, material, color, and ensemble. Unlabeled edges represent direct associations between the nodes.

Figure 2: Network schema for fashion item ensemble (node types: item, category, pattern, material, color, ensemble)

We use the concept of meta-path [3], which can capture the factors that influence clothing matching on the given network. Two kinds of meta-paths are used:

item → X → item    (1)
item → X → item → ensemble → item → Y → item    (2)

where X, Y ∈ {Category, Pattern, Material, Color}, so a total of 4 + 16 = 20 meta-paths are used. (1) is based on the intuition that items which share the same attribute X would be matched together, and (2) is based on the intuition that items whose attributes are frequently matched together on the network would be matched. For example, with the "item → category → item → ensemble → item → category → item" path, an item in the 'Jeans' category would be matched with an item in the 'T-Shirts' category if the 'T-Shirts' category contains many items that have been matched to 'Jeans' items.

To learn the coefficient of each meta-path, we sample 2,000 ensembles from the 6,500 training ensembles (the remaining 958 of the 7,458 ensembles are held out for evaluation). Then, for each sampled ensemble, we randomly choose one item as the target item and use the rest as query items. We use normalized path count (NPC) [4] as the path-based feature and prepare a 20-dimensional feature vector for each pair of query set and candidate item as follows:

\[ f_{Q,c} = \left( \mathrm{NPC}_{p_1}(Q,c),\ \mathrm{NPC}_{p_2}(Q,c),\ \ldots,\ \mathrm{NPC}_{p_{20}}(Q,c) \right) \]

\[ \text{where}\quad \mathrm{NPC}_{p_i}(Q,c) = \frac{1}{|Q|} \sum_{q \in Q} \mathrm{NPC}_{p_i}(q,c) \]

Here NPC_{p_i}(q, c) is the normalized path count between q and c along meta-path p_i, Q is the set of query items, and c is the candidate item. The candidate items are sampled from the items that were released in the same month as the target item. The corresponding label is:

\[ l_{Q,c} = \begin{cases} 1, & \text{if } c = \text{target item} \\ 0, & \text{otherwise} \end{cases} \]

The coefficient of each meta-path is learned using logistic regression on the feature vector and label pairs (f_{Q,c}, l_{Q,c}).

Table 2 shows the important meta-paths and the corresponding coefficients. The negative coefficient of (a) means that items belonging to the same category are rarely matched, which is to be expected. In the case of (d), the positive coefficient indicates that categories matched frequently on the network are indeed important in item matching. The meta-paths for the color attribute ((c) & (h)) show a similar result to the meta-paths for the category attribute ((a) & (d)), while those for the pattern attribute ((b) & (e)) turn out to be the opposite. Also, we can infer from (f) and (g) that pattern and color are tightly related in styling.

Table 2: Important meta-paths and corresponding coefficients (i denotes item, c denotes category, p denotes pattern, l denotes color, and e denotes ensemble)

No.   Meta-path          Coefficient
(a)   i→c→i              -6.897
(b)   i→p→i               1.090
(c)   i→l→i              -3.041
(d)   i→c→i→e→i→c→i       2.088
(e)   i→p→i→e→i→p→i      -0.607
(f)   i→p→i→e→i→l→i       0.565
(g)   i→l→i→e→i→p→i       0.652
(h)   i→l→i→e→i→l→i       3.826

4. EVALUATION AND CONCLUSION
The effectiveness of the recommendation has been evaluated using the remaining 958 ensembles. As in the training stage, one item per ensemble is chosen as the target item and the remaining items are used as query items. The items scored closest to the query items by the trained regression model are recommended. Random selection (Random) and personalized PageRank (PPR) based recommendations are used as baseline methods. Table 3 shows the results, where performance is measured in terms of precision at k (k = 1, 3, 5; P@1, P@3, P@5) and mean reciprocal rank (MRR). The P@1 of PPR is lower than that of Random because PPR assigns higher scores to items near the query items. Consequently, items of the same category or color as the query items tend to be recommended, which, as coefficients (a) and (c) in Table 2 indicate, are rarely matched with them. Meanwhile, the meta-path based recommendation (Path) exploits the learned weights of the meta-paths, resulting in more effective recommendation.

Table 3: Result of each recommendation method

Method   P@1      P@3      P@5      MRR
Random   0.0715   -        -        -
PPR      0.0643   0.0602   0.0459   0.1983
Path     0.4004   0.2193   0.1621   0.5716

5. ACKNOWLEDGEMENTS
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIP) (No. 20110030812).

6. REFERENCES
[1] Manasi Vartak, Samuel Madden. Chic: a combination based recommendation system. SIGMOD, 2013.
[2] Qingqing Tu, Le Dong. An intelligent personalized fashion recommendation system. ICCCAS, 2010.
[3] Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, Tianyi Wu. PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks. VLDB, 2011.
[4] Yizhou Sun, Rick Barber, Manish Gupta, Charu C. Aggarwal, Jiawei Han. Co-author Relationship Prediction in Heterogeneous Bibliographic Networks. ASONAM, 2011.
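
As a supplementary illustration of the feature construction in Section 3, the following is a minimal sketch of how the 20 path-based features for a query set Q and a candidate item c might be computed on a toy network. It is not the authors' implementation. The helper names (one_hot, membership, co_occurrence, build_path_counts, npc, feature_vector) and the four toy items are hypothetical. Two points are assumptions: the text only cites [4] for NPC without spelling out the formula, so the normalization used here, 2·PC(q, c) / (PC(q, ·) + PC(·, c)), is one common form and may differ from what the authors used; and the item → ensemble → item step is assumed to connect two distinct items.

```python
from itertools import product

ATTRS = ["category", "pattern", "material", "color"]

def one_hot(labels):
    """Item-by-value 0/1 incidence matrix for one attribute."""
    vocab = sorted(set(labels))
    return [[1 if lab == v else 0 for v in vocab] for lab in labels]

def membership(n_items, ensembles):
    """Item-by-ensemble 0/1 incidence matrix."""
    return [[1 if i in e else 0 for e in ensembles] for i in range(n_items)]

def co_occurrence(incidence):
    """Number of item -> node -> item paths for every ordered item pair."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in incidence]
            for r1 in incidence]

def matmul(a, b):
    """Plain-Python matrix product; chains path counts along a meta-path."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def build_path_counts(attr_incidence, ens_incidence):
    """Path-count matrices for the 4 + 16 = 20 meta-paths of types (1) and (2)."""
    attr_pc = {x: co_occurrence(attr_incidence[x]) for x in ATTRS}
    ens_pc = co_occurrence(ens_incidence)
    for i in range(len(ens_pc)):
        ens_pc[i][i] = 0  # assumption: an item is not "matched" with itself
    paths = {(x,): attr_pc[x] for x in ATTRS}            # item -> X -> item
    for x, y in product(ATTRS, repeat=2):                # ... -> ensemble -> ...
        paths[(x, y)] = matmul(matmul(attr_pc[x], ens_pc), attr_pc[y])
    return paths

def npc(pc, q, c):
    """Assumed normalization: 2 * PC(q, c) / (PC(q, .) + PC(., c))."""
    denom = sum(pc[q]) + sum(row[c] for row in pc)
    return 2.0 * pc[q][c] / denom if denom else 0.0

def feature_vector(paths, query, candidate):
    """f_{Q,c}: per meta-path NPC averaged over the query items."""
    return [sum(npc(pc, q, candidate) for q in query) / len(query)
            for pc in paths.values()]

# Toy data (invented): items 0 and 1 appear together in one ensemble.
labels = {
    "category": ["Jeans", "T-Shirts", "Jeans", "T-Shirts"],
    "pattern": ["Striped", "Dotted", "Striped", "Floral"],
    "material": ["Denim", "Cotton", "Denim", "Cotton"],
    "color": ["c1", "c2", "c1", "c3"],
}
attr_incidence = {x: one_hot(labels[x]) for x in ATTRS}
ens_incidence = membership(4, [{0, 1}])
paths = build_path_counts(attr_incidence, ens_incidence)

# Query: Jeans item 2; candidate: T-Shirts item 3.
f = feature_vector(paths, query=[2], candidate=3)
```

In this toy network the direct category feature f[0] is 0 because items 2 and 3 belong to different categories, while f[4], the feature for item → category → item → ensemble → item → category → item, is positive because another 'Jeans' item has been matched with another 'T-Shirts' item. Vectors of this kind, paired with the 0/1 labels, would then be passed to a logistic regression.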