=Paper= {{Paper |id=Vol-1892/paper1 |storemode=property |title=Body Measure-aware Fashion Product Recommendations: Evaluating the Predictive Power of Body Scan Data |pdfUrl=https://ceur-ws.org/Vol-1892/paper1.pdf |volume=Vol-1892 |authors=Alexander Piazza,Jochen Süßmuth,Freimut Bodendorf |dblpUrl=https://dblp.org/rec/conf/recsys/PiazzaSB17 }} ==Body Measure-aware Fashion Product Recommendations: Evaluating the Predictive Power of Body Scan Data== https://ceur-ws.org/Vol-1892/paper1.pdf
           Body measure-aware fashion product recommendations:
             evaluating the predictive power of body scan data
                Alexander Piazza                                           Jochen Süßmuth                             Freimut Bodendorf
      University of Erlangen-Nuremberg                          University of Erlangen-Nuremberg                University of Erlangen-Nuremberg
       Institute of Information Systems                           Chair of Computer Graphics                     Institute of Information Systems
                Lange Gasse 20                                            Cauerstraße 11                                  Lange Gasse 20
               Nuremberg 90403                                           Erlangen 91058                                  Nuremberg 90403
           alexander.piazza@fau.de                                  jochen.suessmuth@fau.de                         freimut.bodendorf@fau.de

ABSTRACT
Fashion product consumers are faced with large and fast-changing product offerings. The fashion purchase decision process is complex, as the consumer has to consider various influencing factors such as current fashion trends, which fashion products fit their personality, and which products fit their physical appearance, such as hair color or body measures. Based on novel technologies, 3D body avatars can be reconstructed from 3D or 2D data. From these avatars, body measures can be determined. The objective of this research is to investigate the predictive performance of body measures extracted from a 3D body scanner for predicting fashion item preferences. To this end, item preferences and body scans from 200 users were collected. From the body scans, 11 body measures were extracted and integrated into a prediction model using Factorization Machines. The results from a cross-validation show that including body measurements significantly improves the prediction performance of the recommendation model, especially in new user scenarios, when no information about the fashion product preferences of the active user is known.

KEYWORDS
fashion product recommendation, body scan data, factorization machines, feature-driven recommendation

1    INTRODUCTION
Fashion products have the highest turnover of all product categories sold via e-commerce worldwide. Online fashion retailers offer their consumers large product ranges. For instance, the German online retailer Zalando states that it has 150,000 products from 1,500 brands in its assortment 1 .

Consumer behavior research indicates that an overly broad product offering can cause choice overload for the consumer, which leads to a delayed or no final purchase decision [7] and can decrease the consumers’ satisfaction with their purchase decision [17]. This effect has also been identified in online fashion purchase decisions [11]. To avoid choice overload, online fashion stores should limit the options for consumers such that they are shown sufficient product variety but can, at the same time, make informed decisions [11]. Fashion purchase decisions are especially complex, as multiple aspects of the user are influencing factors, such as personality, emotion, or general fashion trends, as well as physical appearance, such as height [9], skin color, or body type [12]. The user has to decide not only based on which fashion products she or he likes in general, but also, e.g., which colors fit his or her hair and skin color, and which cut of clothes emphasizes or covers certain body parts.

Usually, recommender systems are applied to filter relevant products for the visitors of online stores, leveraging techniques like collaborative filtering or content-based filtering. The central assumption of collaborative filtering is that users who have demonstrated similar preferences in the past will have similar preferences in the future [3]. Therefore, conventional recommender systems based on collaborative filtering only consider the product preferences per user in the form of ratings, views, or purchases, and consequently derive similarities between users from these data. However, due to the wide adoption of new technologies like social networks or mobile phones by users, as well as novel technologies for storing and processing large datasets, rich side information about users, items, and their interactions is available. Therefore, further research is needed to investigate mechanisms for integrating such rich side information into recommendation systems, as well as to assess the impact of such side information on the prediction quality [18].

Based on novel technologies, information about the users’ physical appearance can be acquired. For instance, 3D models of bodies or faces can be constructed with low-cost 3D scanners [21] [22], or reconstructed from 2D photos [8] [5]. User studies indicate that users perceive 3D scanners as useful for online shopping in general [20], as well as in the fashion product context [12], as long as data protection is ensured. However, according to a recent literature review of apparel product recommendation research, there is a research gap regarding the potential of body scan measurements to enrich consumer profiles in apparel recommendation [4].

The objective of this research is to investigate the predictive power of body measures for predicting fashion product preferences of individual users. To this end, a dataset containing fashion product preferences as well as body measurements extracted from body scans was collected, and offline experiments were conducted to compare the impact of these measurements on the prediction accuracy. In this paper, the data collection and the resulting dataset are first described in Section 2, and the evaluation approach is presented in Section 3. Finally, in Section 4, the results are presented and discussed, and planned next steps for further research are outlined.

1 https://www.zalando.de/presse zahlen-und-fakten/

ComplexRec 2017, Como, Italy.
2017. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. Published on CEUR-WS, Volume 1892.
ComplexRec 2017, August 31, 2017, Como, Italy.                                                                                                                             A. Piazza et al.


2 DATA COLLECTION
2.1 Data collection process
For data collection, volunteers were recruited from the School of Business at the University of Erlangen-Nuremberg. To obtain one rather large sample instead of two smaller ones, data collection was concentrated on only one gender. The decision was made to focus on female participants, as a higher variation in body shapes is expected and female participants are assumed to have stronger opinions regarding their fashion preferences. The data collection process consists of two steps. In the first step, participants were scanned using a low-cost body scanner, which creates three-dimensional polygonal mesh data. Based on the mesh data, 11 key body measurements per participant were extracted. In the second step, the participants filled out an online questionnaire, in which they stated their height and weight and indicated the color of their hair, eyes, and skin. They also classified their body shape into one of eight shapes. Finally, every participant indicated their preference towards 36 fashion products by answering, on a seven-point Likert scale, whether they would buy the displayed clothing or not (1 = do not want to buy it at all; 7 = would definitely buy it). For the set of fashion products, upper and lower apparel that tends to either accentuate or mask certain body features was selected.
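The data gathered per participant can be pictured as one record. The following Python sketch is only an illustration under stated assumptions: the class name, the field names, the zero-based body shape index, and the string encoding of the colors are hypothetical, and the measure names are taken from the axis labels of Figure 1. The paper does not state units or the storage format actually used.

```python
from dataclasses import dataclass
from typing import Dict, List

# Measure names follow the axis labels of Figure 1; identifiers are illustrative.
BODY_MEASURES = ("breast", "waist", "hip", "belly", "leg_out", "leg_in",
                 "back", "neckbreast", "arm", "back_width", "calves")
N_PRODUCTS = 36   # fashion products rated by every participant
N_SHAPES = 8      # self-assigned body shape classes


@dataclass
class ParticipantRecord:
    """Hypothetical container for one participant's scan and questionnaire data."""
    body_measures: Dict[str, float]  # eleven values extracted from the mesh
    height: float                    # self-reported; unit not stated in the paper
    weight: float                    # self-reported; unit not stated in the paper
    hair_color: str
    eye_color: str
    skin_color: str
    body_shape: int                  # assumed index 0..7 of the eight shapes
    ratings: List[int]               # one Likert rating (1..7) per product

    def __post_init__(self) -> None:
        # Reject records that do not match the described questionnaire layout.
        if set(self.body_measures) != set(BODY_MEASURES):
            raise ValueError("expected exactly the eleven body measures")
        if not 0 <= self.body_shape < N_SHAPES:
            raise ValueError("body shape must be one of eight classes")
        if len(self.ratings) != N_PRODUCTS:
            raise ValueError("expected one rating per product")
        if any(not 1 <= r <= 7 for r in self.ratings):
            raise ValueError("ratings must lie on the seven-point scale")
```

Validating the record at construction time keeps malformed questionnaires out of later processing steps; whether the original study applied such checks is not stated.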


2.2    Resulting data-set
In total, 200 persons participated in the survey and body scanning. The distribution of the eleven body measures is illustrated in Figure 1. Across the 200 valid scans, the average age of the participants is 22.35 years, with a standard deviation of 2.94 years.

Figure 1: Distribution of the eleven body measures extracted from the body scans (N=200). (Y-axis: measurement distribution; x-axis: body measures breast, waist, hip, belly, leg out, leg in, back, neckbreast, arm, back width, calves.)

Figure 2: Exemplary polygonal meshes resulting from the body scans.

3 EVALUATION
3.1 Evaluation protocol
For determining the prediction quality, the rating is converted from the seven-point Likert scale to a binary rating indicating whether a consumer would buy the product or not: ratings ≤ 5 are mapped to not interested (=0), and ratings > 5 to interested (=1). In the following, rating prediction is therefore treated as a classification problem. During the evaluation, the data is split into test and training sets using 10-fold cross-validation, as illustrated in Figure 4. Each user is randomly assigned to one of ten folds, each of which serves once as the test user-set while the remaining folds form the train user-set. Furthermore, to simulate the new user scenarios, the 36 items are randomly divided into a test item-set of 24 items and a train item-set of 12 items. The split is based on a random selection of items, which is illustrated in Figure 3, where the darker gray bars indicate the items belonging to the test item-set. In each iteration through the ten folds, the algorithm is provided with all 36 ratings of all users in the train user-set. Furthermore, to investigate the new user scenarios, the algorithm is provided with no, two, or five item ratings of each user in the test user-set, randomly selected from the train item-set. For all three scenarios, the ratings for the same 20 products of the test item-set are predicted.

For determining the prediction quality, the metrics precision (Equation 1), recall (Equation 2), and the F-measure (Equation 3) are used. In machine learning research, various approaches are used to aggregate the F-measures from the individual cross-validation folds, which leads to diverging results. In this research, we use the formulation of the F-measure suggested by Forman and Scholz (2010), which resulted in the least bias [2].
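The binarization, the random assignment of users to folds, and the aggregation of the F-measure over folds can be sketched in Python as follows. This is a minimal illustration, not the authors' code: all function names and the random seed are hypothetical, and pooling the true positive, false positive, and false negative counts over all folds before applying Equation 3 once is an assumed reading of the Forman and Scholz formulation, which the paper does not spell out.

```python
import random
from typing import Dict, List, Tuple


def binarize(rating: int) -> int:
    """Map a seven-point Likert rating to 0/1: ratings > 5 count as interested."""
    return 1 if rating > 5 else 0


def assign_folds(user_ids: List[int], n_folds: int = 10, seed: int = 0) -> Dict[int, int]:
    """Randomly assign every user to one of n_folds folds of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return {user: position % n_folds for position, user in enumerate(shuffled)}


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one fold."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def pooled_f_measure(fold_counts: List[Tuple[int, int, int]]) -> float:
    """Sum TP, FP and FN over all folds, then apply the F-measure formula once."""
    tp = sum(c[0] for c in fold_counts)
    fp = sum(c[1] for c in fold_counts)
    fn = sum(c[2] for c in fold_counts)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy usage with made-up labels for two folds (not data from the study).
folds = assign_folds(list(range(200)))
counts = [
    confusion_counts([1, 0, 1, 1], [1, 1, 0, 1]),  # TP=2, FP=1, FN=1
    confusion_counts([0, 1], [0, 1]),              # TP=1, FP=0, FN=0
]
print(pooled_f_measure(counts))  # 0.75
```

Pooling the counts avoids averaging per-fold F-values, which can be undefined or unstable when a fold contains few positive ratings.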


$$\text{precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{1}$$

$$\text{recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{2}$$

$$F = \frac{2 \times \text{True Positive}}{2 \times \text{True Positive} + \text{False Positive} + \text{False Negative}} \tag{3}$$

Figure 3: All 36 fashion products ordered by global preference. The products in darker gray color are selected for the test-set. (Y-axis: value; x-axis: items ordered by global preference.)

Figure 4: Illustration of the applied 10-fold cross-validation schema. (Rows: users, divided into Fold 1 to Fold 10 and marked as train user-set or test user-set; columns: items, divided into test item-set and train item-set.)

3.2        Factorization Machines
In recommender systems research, various approaches have been suggested to consider additional information besides the ratings of users per item. The additional information can be integrated via pre-filtering or post-filtering, where conventional recommendation algorithms are applied and the input data or the results are filtered by the additional information. Another approach is to build multidimensional models, also referred to as contextual models, where the additional information is directly integrated into the prediction model [1]. Recently, tensor decomposition approaches in particular have gained attention, enabling the direct modeling of multi-modal information. Previous research focused especially on the Tucker decomposition [19], Parallel Factor Analysis (PARAFAC) [6], and Pairwise Interaction Tensor Factorization (PITF) [16], which demonstrated high prediction quality. However, the main drawback of these methods is that their adaptation to non-categorical factors is difficult and error-prone [14].

As an alternative, Factorization Machines (FM) were introduced, which combine the advantage of tensor decomposition approaches, namely making predictions in highly sparse and multi-modal conditions, with the ability of support vector machines (SVM) to be a general predictor [13]. With FM, the user, item, and additional information can be modeled as feature vectors with categorical or continuous values. The target variable y can be a real-valued rating or a binary value [13]. FM have been demonstrated to be an effective approach to integrating contextual information, like mood, into movie recommender systems [15]. The key aspect of FM is that the interactions between the input variables are not estimated directly; instead, a low-rank approximation is used. Within this paper, the FM model shown in Equation 4 is considered, which models pairwise interactions between the input variables through the inner products $\langle v_i, v_j \rangle$ of the rows of the factor matrix $V \in \mathbb{R}^{n \times k}$. The variable $k \in \mathbb{N}_0^+$ represents the number of latent variables. Appropriate values of k have to be determined empirically. On the one hand, the value should be large enough that relevant interactions in the data can be captured. On the other hand, restricting the value of k, and therefore the expressiveness of the model, might lead to better generalization [13].

$$\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j \tag{4}$$

In this paper, the reference implementation of FM within the LibFM library is used [14]. For learning FM models, optimization methods based on Stochastic Gradient Descent (SGD), Alternating Least Squares (ALS), and Markov Chain Monte Carlo (MCMC) have been proposed and are implemented in LibFM. We decided to use MCMC optimization, as this approach has the fewest hyperparameters that have to be tuned.

4   RESULTS AND DISCUSSION
The resulting F-measure values are illustrated in Figure 5. The models were built with k ∈ {16, 32, 64, 128} latent variables. As mentioned in subsection 3.1, three new user scenarios are investigated, in which no, two, or five ratings of the user are known. In the first scenario, the models with the measurement information of the user perform considerably better than the models without this information. In this scenario, the best model without measurement information has an F-value of 0.40, whereas the best model with measurement information has an F-value of 0.49. This clear advantage shrinks in the second scenario, where two items are known, and vanishes in the last scenario, where five items are known. The precision of the models having body measurement information is higher in all cases, but at the same time, the recall is lower compared to


the model having no measure information. In total, the empirical results indicate that body measures possess significant predictive power in the context of apparel recommendation, especially in new user scenarios, where no previous product preference information about the user is available. In practice, such a situation could arise when consumers are scanned in a store for the first time.

Further research is needed to identify which specific body measures have predictive power and which rather introduce noise into the model. Of the eleven measures used, it can be expected that some, like the hip or belly measure, give more information to the model than, for example, the calves measure. Another possibility for integrating body scan information is to assign each scan to a distinctive body shape class [10], or to use the principal components from the morphable model approach [22]. In addition, the impact of hair and skin color nuances on the predictive performance will be investigated in further research.

Figure 5: Resulting predictive performance per new user scenario. (Panels: 0-items, 2-items, 5-items; y-axis: F-measure; x-axis: number of latent variables, 16 to 128; series: no measure, with measure.)

REFERENCES
 [1] Gediminas Adomavicius and Alexander Tuzhilin. 2008. Context-aware Recommender Systems. In Proceedings of the 2008 ACM Conference on Recommender Systems (RecSys ’08). ACM, New York, NY, USA, 335–336. DOI:http://dx.doi.org/10.1145/1454008.1454068
 [2] George Forman and Martin Scholz. 2010. Apples-to-apples in Cross-validation Studies: Pitfalls in Classifier Performance Measurement. SIGKDD Explor. Newsl. 12, 1 (2010), 49–57. DOI:http://dx.doi.org/10.1145/1882471.1882479
 [3] David Goldberg, David Nichols, Brian M. Oki, and Douglas Terry. 1992. Using Collaborative Filtering to Weave an Information Tapestry. Commun. ACM 35, 12 (1992), 61–70. DOI:http://dx.doi.org/10.1145/138859.138867
 [4] Congying Guan, Shengfeng Qin, Wessie Ling, and Guofu Ding. 2016. Apparel recommendation system evolution: an empirical review. International Journal of Clothing Science and Technology 28, 6 (2016), 854–879. DOI:http://dx.doi.org/10.1108/IJCST-09-2015-0100
 [5] Peng Guan, Alexander Weiss, Alexandru O. Balan, and Michael J. Black. 2009. Estimating human shape and pose from a single image. In Computer Vision, 2009 IEEE 12th International Conference on. 1381–1388.
 [6] Richard Harshman. 1970. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multi-modal factor analysis. UCLA Working Papers in Phonetics 16 (1970).
 [7] Sheena S. Iyengar and Mark R. Lepper. 2000. When choice is demotivating: Can one desire too much of a good thing? Journal of personality and social psychology 79, 6 (2000), 995.
 [8] Luo Jiang, Juyong Zhang, Bailin Deng, Hao Li, and Ligang Liu. 2017. 3D Face Reconstruction with Geometry Details from a Single Image. (2017). http://arxiv.org/pdf/1702.05619
 [9] Michelle R. Jones and Valerie L. Giddings. 2010. Tall women’s satisfaction with the fit and style of tall women’s clothing. Jnl of Fashion Mrkting and Mgt 14, 1 (2010), 58–71. DOI:http://dx.doi.org/10.1108/13612021011025438
[10] Jeong Yim Lee, Cynthia L. Istook, Yun Ja Nam, and Sun Mi Park. 2007. Comparison of body shape between USA and Korean women. Int Jnl of Clothing Sci & Tech 19, 5 (2007), 374–391. DOI:http://dx.doi.org/10.1108/09556220710819555
[11] Komal Nagar and Payal Gandotra. 2016. Exploring Choice Overload, Internet Shopping Anxiety, Variety Seeking and Online Shopping Adoption Relationship: Evidence from Online Fashion Stores. Global Business Review 17, 4 (2016), 851–869. DOI:http://dx.doi.org/10.1177/0972150916645682
[12] Jaekyung Park, Yunja Nam, Kueng-mi Choi, Yuri Lee, and Kyu-Hye Lee. 2009. Apparel consumers’ body type and their shopping characteristics. Jnl of Fashion Mrkting and Mgt 13, 3 (2009), 372–393. DOI:http://dx.doi.org/10.1108/13612020910974500
[13] Steffen Rendle. 2010. Factorization Machines. In Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM ’10). IEEE Computer Society, Washington, DC, USA, 995–1000. DOI:http://dx.doi.org/10.1109/ICDM.2010.127
[14] Steffen Rendle. 2012. Factorization Machines with libFM. ACM Trans. Intell. Syst. Technol. 3, 3 (2012), 57:1–57:22.
[15] Steffen Rendle, Zeno Gantner, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2011. Fast Context-aware Recommendations with Factorization Machines. In 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, July 24-28, 2011, Beijing, China. ACM, New York, NY.
[16] Steffen Rendle and Lars Schmidt-Thieme. 2010. Pairwise Interaction Tensor Factorization for Personalized Tag Recommendation. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM ’10). ACM, New York, NY, USA, 81–90. DOI:http://dx.doi.org/10.1145/1718487.1718498
[17] Barry Schwartz. 2016. The paradox of choice: Why more is less (revised ed.).
[18] Yue Shi, Martha Larson, and Alan Hanjalic. 2014. Collaborative Filtering Beyond the User-Item Matrix: A Survey of the State of the Art and Future Challenges. ACM Comput. Surv. 47, 1 (2014), 3:1–3:45. DOI:http://dx.doi.org/10.1145/2556270
[19] Ledyard R. Tucker. 1966. Some mathematical notes on three-mode factor analysis. Psychometrika 31, 3 (1966), 279–311. DOI:http://dx.doi.org/10.1007/BF02289464
[20] Christian Zagel and Jochen Süßmuth. 2013. Nutzenpotenziale maßgetreuer 3D Avatare aus Low-cost Bodyscannern. HMD : Praxis der Wirtschaftsinformatik 50 (2013), 48–57. DOI:http://dx.doi.org/10.1007/BF03342068
[21] Christian Zagel, Jochen Süßmuth, and Freimut Bodendorf. 2013. Automatische Rekonstruktion eines 3D Körpermodells aus Kinect Sensordaten. In Wirtschaftsinformatik Proceedings 2013. Paper 35. http://aisel.aisnet.org/wi2013/35
[22] Michael Zollhöfer, Michael Martinek, Günther Greiner, Marc Stamminger, and Jochen Süßmuth. 2011. Automatic Reconstruction of Personalized Avatars from 3D Face Scans. Computer Animation and Virtual Worlds (Proceedings of CASA 2011) 22, 2-3 (2011), 195–202.