Understanding how to Explain Package Recommendations in the Clothes Domain

Agung Toto Wibowo
Computing Science / Informatics Engineering
University of Aberdeen / Telkom University
wibowo.agung@abdn.ac.uk / agungtoto@telkomuniversity.ac.id

Advaith Siddharthan
Knowledge Media Institute
The Open University
advaith.siddharthan@open.ac.uk

Judith Masthoff
Computing Science / Information and Computing Science
University of Aberdeen / Utrecht University
j.f.m.masthoff@uu.nl / j.masthoff@abdn.ac.uk

Chenghua Lin
Computing Science
University of Aberdeen
chenghua.lin@abdn.ac.uk

ABSTRACT
Recommender system explanations have been widely studied in the context of recommending individual items to users. In this paper, we present and evaluate explanations for the more complex problem of package recommendation, where a combination of items that go well together is recommended as a package. We report the results of an empirical user study where participants try to select the most appropriate combination of a "top" (e.g. a shirt) and a "bottom" (e.g. a pair of trousers) for a hypothetical user based on one of 5 types of explanation communicating item-feature preferences and/or appropriateness of feature combinations. We found that the type of explanation significantly impacted decision time and resulted in the selection of different packages, but found no difference in how participants appraised the different explanation types.

KEYWORDS
Package Explanation; Package Recommendation; Clothes Domain

1 INTRODUCTION
Recommender systems have been widely studied in various domains (e.g. movies, music, books) using techniques such as collaborative filtering [3, 10], content-based filtering [9], and hybrid methods [11]. These techniques have been used in different tasks such as finding good items (top-N recommendations) [2], recommending a sequence [1, 7], or recommending a package (also referred to as a bundle) [4, 8].

To increase user acceptance of recommended items, recommender systems can provide explanations [5, 14], which also help users understand why items are being selected for them [6]. Explanations can provide transparency by exposing the reasoning and data behind a recommendation [5, 13], and can increase user scrutability, trust, effectiveness, persuasiveness, efficiency and satisfaction [13, 14].

Several explanation types have been investigated [14]. Explanations have used different inputs such as content data and user-item ratings [6]. In the movie domain, a comprehensive study has been conducted by presenting different explanation interfaces [5], such as group rating histograms, neighbour rating histograms or tables of neighbour ratings. Using user-item ratings as input, Hernando et al. [6] proposed tree explanations where the nodes represent items and the branches represent the distance between items.

Explanations can also take the form of social explanations. This type of explanation is usually delivered in the form of "users u1 and u2 also like the recommended item" [16].

Other explanations use item features. For example, in the movie domain explanations can use features such as the director and actors [12], or use tags (free text describing an item), as studied in tagsplanations [15]. In tagsplanations, tags were presented along with their item relevance and user preference. The tag relevance may reflect the tag's popularity, or the correlation between the tag and the item, whilst the user preference measures the user's sentiment towards the given tag, e.g. how much a user will like or dislike the "classic" tag.

All the explanations discussed above were used to explain recommendations of individual items [14]. Recently, a more complex task has been studied in the form of package recommendations [4, 17, 18], where combinations of items that work well together are recommended to a user. In real-world applications, package recommendations have advantages for both customers and sellers. For example, in the clothes domain, when a "top" (e.g. a shirt) and a "bottom" (e.g. a pair of trousers) are recommended together, the seller can boost their sales, while the customer can save on shipping costs and obtain clothes that go well together.

To the best of our knowledge, there is a lack of research on explanations for package recommendations. This type of explanation faces two main challenges: it must explain both the individual items in the combination and the appropriateness of combining them. In this paper, within the clothes domain, we investigate the impact of five different types of explanation built by combining three components, namely individual preferences, package appropriateness, and natural language descriptions of package appropriateness.

The remainder of this paper is organized as follows. Section 2 defines package recommendation and explanation in the clothes domain. Section 3 describes our motivation, participants and materials. Section 4 presents our results. Finally, Section 5 provides a discussion and suggests directions for future work.



[Figure 1: example explanation for a combination of a "top" and a "bottom". The individual-attribute component lists the top's attributes (blue colour, stripes pattern, work formality, having collar, sleeve length, shirt type) and the bottom's attributes (green colour, plain pattern, casual formality, cutting shapes, bottom length), each marked with a thumb up/down. The combination-thumb component marks the attribute pairings (blue top + green bottom; striped top + plain bottom; work formality top + casual bottom) with thumbs up/down. The natural language component reads: "There are several good aspects of this combination. Firstly, the blue of the top and the green of the bottom go well together, as they are colours that are side by side on the colour wheel. Secondly, a striped top goes well with a plain bottom. However, it is a bad idea to combine a work formality top with a casual bottom."]

Figure 1: Clothes explanation components for a combination of a "top" and a "bottom" (white cells). (a) Explanation using the top's and bottom's individual thumb up/down attributes (yellow cells, used in the IT and IT-CT explanation types; see Table 1). (b) Explanation of the top-bottom relations using thumbs up/down (green cell, used in CT and IT-CT). (c) Explanation of the top-bottom relations using a natural language description (red cell, used in CN and IT-CN).

2 CLOTHES PACKAGE EXPLANATIONS
In this paper, we follow the package definitions described in [19]. Consider a set of clothes consisting of two disjoint complementary sets: a set of "top" items and a set of "bottom" items. Each item in both the "top" and "bottom" sets is associated with a set of attributes (for example colour, pattern, formality and so on). Further, some of these items and/or their combinations (packages) have received ratings from one or more users, as individual ratings and/or package ratings. Our task is then to provide explanations for selected packages.
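To make this setting concrete, the data behind a package can be sketched as follows. This is an illustrative sketch only; the class and field names are our own and are not prescribed by [19].

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class Item:
    """A single "top" or "bottom" with its descriptive attributes."""
    item_id: str
    attributes: Dict[str, str]  # e.g. {"colour": "blue", "pattern": "stripes"}

@dataclass
class Package:
    """A combination of one top and one bottom."""
    top: Item
    bottom: Item

@dataclass
class UserRatings:
    """Ratings may exist for individual items and/or for whole packages."""
    item_ratings: Dict[str, int] = field(default_factory=dict)              # item_id -> 1..5
    package_ratings: Dict[Tuple[str, str], int] = field(default_factory=dict)  # (top_id, bottom_id) -> 1..5

# A package like the one shown in Figure 1.
shirt = Item("top_01", {"colour": "blue", "pattern": "stripes", "formality": "work"})
trousers = Item("bot_07", {"colour": "green", "pattern": "plain", "formality": "casual"})
package = Package(shirt, trousers)
```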
To highlight the importance of package explanations, consider a situation in which a user is looking for a complementary item for a t-shirt (the query) that he or she likes. The recommender engine might pair the user's query with other complementary items to form packages. These packages will be better served with explanations. Explanations are also useful in a situation where a user requests several different packages.

In the literature, different explanations have used different intermediary entities to show the relations between users and items [15]. Three commonly used intermediary entities are items, users, and features. In this paper, we use features (see Figure 1), as they are easier to detect and explain.

Using features, we designed three explanation components (see Figure 1). The first component (in the yellow boxes) uses individual attributes of both the "top" and the "bottom". We use colour, pattern, formality, collar, sleeve length and type (e.g. shirt, t-shirt or top) as features for the top, and colour, pattern, formality, cutting shape and length as features for the bottom. In the clothes domain these attributes are easy to extract and communicate, as they are visible in each image. The thumb up/down symbols indicate the user's preference (like/dislike) for each feature value. A recommender system might calculate these preferences by correlating each attribute with the user's individual ratings. The second component (in the green box) uses the relation between the top and the bottom in the form of appropriateness rules, which we adopted from [19], and presents this relation using thumb up/down symbols. Here, the thumb up/down represents the appropriateness or inappropriateness of combining attributes of the top and the bottom. In a real-world situation, a user can rely on intuition to easily judge whether this component correctly explains the relation between the "top" and the "bottom". Following [19], we use colour, pattern and formality as combination features. The third component (in the red box) uses the same appropriateness rules as the second component and describes the top-bottom combination in natural language. We used natural language to reduce misinterpretation of the provided explanations. All three explanation components can be system generated, but for this study they were manually generated, to ensure that our findings about how users appraise these components were not influenced by issues pertaining to implementation quality.
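For illustration, the individual-preference thumbs could be derived along the following lines. This is a simplified sketch under our own assumptions (a thumb is shown when the mean rating of the user's rated items carrying a feature value departs from the user's overall mean rating by some margin); the paper does not commit to a specific formula.

```python
from statistics import mean

def preference_thumbs(items, ratings, margin=0.25):
    """Derive a thumb up/down per feature value from a user's item ratings.

    items:   dict item_id -> dict of attribute -> value
    ratings: dict item_id -> rating (1..5), only for items the user rated
    Returns: dict (attribute, value) -> "up" or "down"
    """
    overall = mean(ratings.values())
    by_value = {}
    for item_id, rating in ratings.items():
        for attribute, value in items[item_id].items():
            by_value.setdefault((attribute, value), []).append(rating)
    thumbs = {}
    for key, rs in by_value.items():
        if mean(rs) >= overall + margin:
            thumbs[key] = "up"
        elif mean(rs) <= overall - margin:
            thumbs[key] = "down"
    return thumbs

# Example: a user who rates blue items highly gets a thumb up for ("colour", "blue").
items = {"t1": {"colour": "blue"}, "t2": {"colour": "green"}, "t3": {"colour": "blue"}}
ratings = {"t1": 5, "t2": 2, "t3": 4}
print(preference_thumbs(items, ratings))  # {('colour', 'blue'): 'up', ('colour', 'green'): 'down'}
```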
Table 1: Clothes Combination Explanation Types

  Explanation Type                          Indiv. Thumb   Comb. Thumb   Comb. Nat. Lang.
  Indiv. Thumb (IT)                              ✓              -               -
  Comb. Thumb (CT)                               -              ✓               -
  Comb. Nat. Lang. (CN)                          -              -               ✓
  Indiv. Thumb + Comb. Thumb (IT-CT)             ✓              ✓               -
  Indiv. Thumb + Comb. Nat. Lang. (IT-CN)        ✓              -               ✓

Using these components, we designed five different explanation types (see Table 1). We named each explanation type using the abbreviations of the components it involves. For example, IT-CN is the explanation type which uses the individual thumb component (yellow cells in Figure 1) and the combination natural language component (red cell in Figure 1). In this study, we did not use the CT and CN components together, as they present similar information.
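Both combination components can, in principle, be driven by the same appropriateness rules, roughly as sketched below. The rule tables and template wording here are illustrative only; the study used rules adopted from [19] and hand-written natural language.

```python
# Toy appropriateness rules over the combination features colour, pattern and formality.
# These tables are invented for illustration; they are not the rules from [19].
ADJACENT_COLOURS = {("blue", "green"), ("green", "blue"), ("red", "orange"), ("orange", "red")}
GOOD_PATTERNS = {("stripes", "plain"), ("plain", "stripes"), ("plain", "plain")}

def combination_aspects(top, bottom):
    """Return (feature, is_good, phrase) triples for a top/bottom pairing."""
    aspects = []
    aspects.append(("colour", (top["colour"], bottom["colour"]) in ADJACENT_COLOURS,
                    f"the {top['colour']} of the top and the {bottom['colour']} of the bottom"
                    " are side by side on the colour wheel"))
    aspects.append(("pattern", (top["pattern"], bottom["pattern"]) in GOOD_PATTERNS,
                    f"a {top['pattern']} top with a {bottom['pattern']} bottom"))
    aspects.append(("formality", top["formality"] == bottom["formality"],
                    f"a {top['formality']} top with a {bottom['formality']} bottom"))
    return aspects

def describe(aspects):
    """Render the rule outcomes as a short natural language description (CN-style)."""
    good = [phrase for _, ok, phrase in aspects if ok]
    bad = [phrase for _, ok, phrase in aspects if not ok]
    parts = []
    if good:
        parts.append("Good aspects of this combination: " + "; ".join(good) + ".")
    if bad:
        parts.append("However, it is a bad idea to combine " + "; ".join(bad) + ".")
    return " ".join(parts)

top = {"colour": "blue", "pattern": "stripes", "formality": "work"}
bottom = {"colour": "green", "pattern": "plain", "formality": "casual"}
print(describe(combination_aspects(top, bottom)))
```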
3 EVALUATION METHODOLOGY
The aim of our user study was to evaluate the different package recommendation explanation types (see Table 1) in the clothes domain on several aspects, such as effectiveness/persuasiveness, efficiency, transparency, trust, and satisfaction.

Table 2: Number of images shown in the preferences sheet.

                # of Top Ratings       # of Bottom Ratings
  Pseudo-user   1   2   3   4   5      1   2   3   4   5
  Mary          2   4   3   4   3      2   3   2   2   3
  Peter         3   3   2   4   3      3   2   1   3   3

3.1 Participants
In our study, 64 participants were recruited using convenience sampling at the University of Aberdeen. Participants came from 17 different countries (2 did not disclose their nationality). There were 38 male participants, 25 female, and 1 undisclosed. There were 10 participants aged 18-25, 45 aged 26-45, 3 aged 41-54, and 1 undisclosed. The study took place in an office environment, and was approved by the University's Ethics Board.

3.2 Materials
We created two pseudo-users, named "Mary" and "Peter", using real data from [19]. We showed participants images of "top" and "bottom" clothes together with the pseudo-user's real ratings for them. We selected the clothes at random, and slightly varied the number of clothes shown (see Table 2 for the number of images shown for each rating; e.g., the "3" in the far bottom-right cell is the number of "bottom" images rated "5" by Peter that were shown to participants).

We then selected six combinations of tops and bottoms, together with the explanations that the system would have generated for the pseudo-user for each explanation type (see Table 1). Table 3 shows, for each combination, how many positive and negative aspects were mentioned in the explanations for the "top", the "bottom" and the "combination" respectively. For example, Figure 1 shows the explanation components for Peter's combination C3, which contains 4 positive attributes for the "top" and the "bottom" each, and 2 positive appropriateness aspects for the "combination".

Table 3: Positive and negative aspects' frequency distribution in each combination

                             Top           Bottom        Combination
  Pseudo-user   Comb #    Pos.  Neg.     Pos.  Neg.     Pos.  Neg.
  Mary          C1         6     0        3     2        2     1
                C2         5     1        4     1        3     0
                C3         4     2        5     0        2     1
                C4         4     2        4     1        2     1
                C5         6     0        3     2        3     0
                C6         5     1        4     1        3     0
  Peter         C1         4     2        4     1        2     1
                C2         3     3        5     0        3     0
                C3         4     2        4     1        2     1
                C4         4     2        4     1        2     1
                C5         3     3        5     0        3     0
                C6         4     2        4     1        2     1

3.3 Experimental Design
We used a mixed design. Each participant considered the two pseudo-users, with a different explanation type for each pseudo-user. The explanation type used for the pseudo-users was counterbalanced, as was the order in which the pseudo-users were considered. Four groups of 16 participants each considered different pairs of explanation types: (1) CT and CN, (2) IT and IT-CT, (3) IT and IT-CN, and (4) IT-CT and IT-CN. This enabled a within-subject comparison between the explanation types in each pair, as well as between-subject comparisons for other pairs (such as CT versus IT-CT).

Independent variables:
   • Explanation type: 5 types (IT, CT, CN, IT-CN, IT-CT), which varied in the explanation of a package's individual items (2 values: included in graphical thumbs up/down form, or not included) and the explanation of the package's combination aspects (3 values: included in graphical thumbs up/down form, included in natural language form, or not included in either form). We excluded the explanation type without any explanation of either the items or the combination.
   • Pseudo-user: two values, Mary and Peter.

Dependent variables:
   • Five perceived qualities of an individual explanation type, each rated on a 7-point scale (see Procedure for details): (1) Effectiveness, (2) Trust, (3) Efficiency, (4) Transparency, (5) Overall quality of the explanation type.
   • Actual efficiency: speed of package selection t in seconds.
   • Package selected.
   • Comparative preference for two explanation types.

3.4 Procedure
Participants were told that the purpose of the study was to understand the effectiveness of different explanations about clothes combinations. After providing informed consent, the study was run using the following steps:
   Step 1. Participants provided demographic information (gender, age and nationality, with the option not to disclose).
   Step 2. Participants were given a pseudo-user's preferences for individual clothing items (see Section 3.2 and Table 2).
   Step 3. Participants selected a combination of clothes for this user out of the 6 combinations provided, with all combinations including an explanation of the same style (see Table 1; different participants saw different explanation styles). The decision time was recorded.
   Step 4. Participants answered six questions regarding the decision process and the explanations (Question 1 on a scale of 1 to 5, the others from 1 to 7):
   (1) What rating do you think the user will give to the combination you have chosen? (This question was only asked to enable posing the next question.)
   (2) How confident are you that your rating reflects the rating the user would have given? (This question was used to measure the impact of explanation style on effectiveness – whether the explanations help users make good decisions [Efk.]¹.)
   (3) Please rate how easy it was to decide how good combinations would be for the user. (This question was used to measure the impact of explanation style on perceived efficiency [Efc.].)

¹ One can argue whether this is in fact effectiveness or persuasiveness, as it only measures the extent to which the participant thinks the user will agree with them, which does not necessarily make the rating correct.

Table 4: Statistical comparison between two explanation types

                        Mean (StDev) for Expl. Type 1                                      Mean (StDev) for Expl. Type 2
  Type 1  Type 2   t (s)         Efk.      Efc.      Tra.      Trust     Sat.       t (s)          Efk.      Efc.      Tra.      Trust     Sat.      Pref.**
  CT      CN       77.4 (39.2)*  6.1 (0.9) 5.1 (1.4) 5.6 (1.2) 4.9 (1.1) 5.2 (1.6)  123.4 (55.3)*  5.8 (1.1) 4.9 (1.8) 5.4 (1.5) 4.9 (1.3) 5.0 (1.7) 4.0 (2.5)
  IT      IT-CT    114.6 (75.3)  5.6 (1.1) 4.1 (1.2) 5.3 (1.0) 5.1 (0.9) 5.4 (1.0)  140.1 (94.0)   5.3 (1.1) 4.2 (1.6) 5.4 (1.0) 4.9 (1.1) 5.5 (1.2) 3.9 (2.3)
  IT      IT-CN    91.6 (62.7)*  6.1 (0.9) 5.6 (1.1) 5.6 (0.9) 5.4 (0.6) 5.9 (0.9)  137.7 (69.5)*  5.7 (1.0) 5.1 (1.5) 5.8 (0.9) 5.1 (0.9) 6.0 (1.0) 4.9 (2.1)
  IT-CT   IT-CN    85.9 (42.2)*  5.5 (1.1) 4.4 (2.2) 5.0 (1.5) 4.9 (1.6) 5.6 (1.5)  146.9 (73.6)*  5.9 (0.7) 4.4 (1.6) 5.2 (1.2) 5.1 (1.1) 5.9 (0.9) 4.0 (2.4)
  Paired t-test, * significant at p < 0.05. ** User preference when comparing explanation types, from 1 (strongly preferred Type 1) to 7 (strongly preferred Type 2).


   (4) The system will in future recommend clothing combinations. To what extent do the explanations make you understand what the system will base its recommendations for a user on? (This question was used to measure the impact of different explanation types on perceived transparency [Tra.].)
   (5) To what extent would you trust the system to produce recommendations of clothing combinations for a user? (This question was used to measure the impact of different explanation types on user trust.)
   (6) Overall, how much do you like the explanations provided? (This question was used to measure satisfaction [Sat.].)

   Step 5. Steps 2-4 were repeated for the other pseudo-user with a different explanation style.
   Step 6. Participants rated on a scale of 1-7 their relative preference for the two explanation styles they had seen (with 1 meaning they strongly preferred one particular style, and 7 meaning they strongly preferred the other style). Participants were also asked to provide an optional additional comment regarding the two explanation styles they had seen.

4 RESULTS
Do explanation types impact perceived quality? Table 4 shows participants' ratings of perceived effectiveness (Efk.), efficiency (Efc.), transparency (Tra.), trust and satisfaction (Sat.), as well as their explicit comparative preference rating. No significant impact of the explanation type on these perceived quality measures was found, and participants overall seemed to appreciate all explanation types equally (with participants being clearly satisfied with all)². A post-hoc analysis showed that individual participants often did prefer one of the two explanations they saw (and also gave higher perceived quality ratings for that one), but varied in which type they preferred. So, overall, we did not find an impact of explanation type on perceived quality, but wonder whether the choice of explanation type may need to be adapted to the individual user.

² There was a significant effect with participants preferring IT-CN to IT (a z-test shows the mean to be significantly above 4, p=0.04); however, the perceived quality metrics are not significant and the trend on several is in the opposite direction.

Do explanation types impact decision speed? Table 4 shows the time (t) participants took to make their combination selection. Participants were significantly faster with CT than CN, with IT-CT than IT-CN, and with IT than IT-CN, implying that reading the natural language explanations slowed them down. However, there was no significant difference between IT and IT-CT (whilst there may seem to be a trend for IT to be faster, this does not hold up when combining the data from other rounds in which IT and IT-CT were used). So, adding the combination component did not actually make participants slower, as long as it used thumbs up/down rather than natural language. Combining the data from the multiple rounds, we also find that participants were significantly faster with CT than with IT-CT (t-test, p<0.01), so adding the individual explanation component slowed them down. Overall, participants were fastest with CT. So, explanation type clearly impacts decision speed, and from an actual efficiency point of view, CT performed best.
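The speed comparisons above are paired t-tests over within-subject decision times; a minimal sketch of such an analysis is shown below. The arrays are invented placeholder data, not the study's measurements.

```python
import numpy as np
from scipy import stats

# Placeholder decision times (seconds) for 16 participants who saw both CT and CN;
# these numbers are invented for illustration and are not the study data.
t_ct = np.array([60, 85, 70, 90, 55, 80, 75, 95, 65, 72, 88, 58, 77, 83, 69, 91])
t_cn = np.array([110, 140, 95, 150, 100, 130, 120, 160, 105, 115, 135, 98, 125, 145, 118, 155])

# Paired t-test: each participant contributes one decision time per explanation type.
t_stat, p_value = stats.ttest_rel(t_ct, t_cn)
print(f"CT vs CN: t = {t_stat:.2f}, p = {p_value:.4f}")

# The comparative preference ratings (1-7) were tested against the neutral midpoint of 4;
# a one-sample test on the mean approximates the z-test reported in footnote 2.
prefs = np.array([5, 6, 4, 5, 7, 4, 5, 6, 3, 5, 6, 5, 4, 6, 5, 7])  # invented ratings
print(stats.ttest_1samp(prefs, popmean=4))
```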
Table 5: Percentage distribution of selected combinations per explanation type

  Pseudo-user   Explanation Type    C1     C2     C3     C4     C5     C6
  Mary          IT                  0%    13%    81%     0%     0%     6%
                CT                 13%     0%    50%    25%     0%    13%
                CN                  0%    13%    75%     0%     0%    13%
                IT-CT               0%    19%    56%     6%    13%     6%
                IT-CN               6%    38%    38%    13%     0%     6%
  Peter         IT                  0%     6%    19%    31%    19%    25%
                CT                  0%    13%     0%    50%    25%    13%
                CN                  0%    13%    25%    38%    13%    13%
                IT-CT               0%    19%    13%    19%    13%    38%
                IT-CN               0%    50%     6%    19%    13%    13%

Do explanation types impact decisions made? Table 5 shows the percentage distribution of the selected combinations per explanation type; the highest percentage in each row marks the most selected combination. Interestingly, when we showed participants only one explanation component (IT, CT, or CN), the most selected combination was the same regardless of explanation type. However, when we showed both the individual and the combination components (IT-CT or IT-CN), participants' selections tended to change. For example, participants who received the IT explanation type for Mary tended to select C3. The distribution changed to C2 when participants received IT-CN, and changed slightly towards C2 and C5 when participants received IT-CT. Both C2 and C5 were combinations with 3 positive combination aspects, whilst C3 was a combination with 2 positive and 1 negative combination aspect (see Table 3). For Peter, the change is even more pronounced, with a shift from C4 towards C6 and C2. A post-hoc Chi-square test of independence was performed to examine the relation between explanation type (one component or two components) and the combination selected (only considering those combinations which were chosen most often, namely C2, C4 and C6 for Peter, and C2 and C3 for Mary). This test was statistically significant for both pseudo-users (p<0.05). So, the explanation type impacts the decisions people make, implying that explanations may influence either the effectiveness or the persuasiveness of package recommendations.
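The post-hoc test is a standard Chi-square test of independence on a contingency table of explanation condition (one vs. two components) against selected combination. A sketch of this computation is shown below, with counts reconstructed from Mary's percentages in Table 5 under the assumption of 16 participants per explanation type per pseudo-user; the authors' exact counts may differ.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Contingency table for Mary: rows = explanation condition (one component vs. two components),
# columns = selected combination (C2, C3). Counts are reconstructed from the percentages in
# Table 5 assuming 16 participants per explanation type; they are an approximation only.
one_component = [2 + 0 + 2, 13 + 8 + 12]   # IT, CT, CN   -> counts for C2, C3
two_components = [3 + 6, 9 + 6]            # IT-CT, IT-CN -> counts for C2, C3
table = np.array([one_component, two_components])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```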


Participants' additional feedback. At the end of our survey, we asked participants to provide optional additional comments, and two valuable issues for future improvement emerged. First, the explanations need to be presented in more detail; for example, in IT, the sleeve length information would be better served by stating short/long sleeve. Second, even though the explanation types involving CN took longer in the package selection process, some participants said this type of explanation helped them make their selection. This was expressed in comments such as: "explanation may be useful, however, too long to understand". To address this, another participant suggested delivering the CN component differently: "explanation should be more explicit, may be with bold letter for points that are stressed or colours".

5 CONCLUSION AND FUTURE WORK
This paper provides early insights into explanations for package recommendations. The type of explanation had a significant impact on decision time, but no difference in perceived quality was found (though there was some evidence that different people preferred different explanation types). We also found that different explanation types differently impacted the selection of packages.

This study was conducted in the clothes domain, so its generalisability to other domains such as travel services needs to be investigated. The study can also be extended by expanding the current package explanations to incorporate items the user liked (or disliked). We would also like to study package explanations more directly in a real-world situation, where the explanations are for real users for whom the system makes recommendations.

ACKNOWLEDGMENTS
We would like to thank Lembaga Pengelola Dana Pendidikan (LPDP), Departemen Keuangan Indonesia, for awarding a scholarship to support the studies of the lead author. We would also like to thank the participants who provided precious feedback during our study.

REFERENCES
[1] Da Cao, Liqiang Nie, Xiangnan He, Xiaochi Wei, Shunzhi Zhu, and Tat-Seng Chua. 2017. Embedding Factorization Models for Jointly Recommending Items and User Generated Lists. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 585–594.
[2] Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. 2010. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the Fourth ACM Conference on Recommender Systems. ACM, 39–46.
[3] Michael D. Ekstrand, John T. Riedl, and Joseph A. Konstan. 2011. Collaborative filtering recommender systems. Foundations and Trends in Human-Computer Interaction 4, 2 (2011), 81–173.
[4] A. Felfernig, S. Gordea, D. Jannach, E. Teppan, and M. Zanker. 2007. A short survey of recommendation technologies in travel and tourism. OEGAI Journal 25, 7 (2007), 17–22.
[5] Jonathan L. Herlocker, Joseph A. Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work. ACM, 241–250.
[6] Antonio Hernando, Jesús Bobadilla, Fernando Ortega, and Abraham Gutiérrez. 2013. Trees for explaining recommendations made through collaborative filtering. Information Sciences 239 (2013), 1–17.
[7] Marius Kaminskas and Francesco Ricci. 2012. Contextual music information retrieval and recommendation: State of the art and challenges. Computer Science Review 6, 2 (2012), 89–119.
[8] Qi Liu, Enhong Chen, Hui Xiong, Yong Ge, Zhongmou Li, and Xiang Wu. 2014. A cocktail approach for travel package recommendation. IEEE Transactions on Knowledge and Data Engineering 26, 2 (2014), 278–293.
[9] Pasquale Lops, Marco de Gemmis, and Giovanni Semeraro. 2011. Content-based Recommender Systems: State of the Art and Trends. In Recommender Systems Handbook. Springer, 73–105. http://dx.doi.org/10.1007/978-0-387-85820-3_3
[10] X. Ning, C. Desrosiers, and G. Karypis. 2015. A comprehensive survey of neighborhood-based recommendation methods. In Recommender Systems Handbook. Springer, 37–76.
[11] Xiaoyuan Su and Taghi M. Khoshgoftaar. 2009. A survey of collaborative filtering techniques. Advances in Artificial Intelligence 2009 (2009), 4.
[12] Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2008. Justified recommendations based on content and rating data. In WebKDD Workshop on Web Mining and Web Usage Analysis.
[13] Nava Tintarev and Judith Masthoff. 2012. Evaluating the effectiveness of explanations for recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 399–439.
[14] Nava Tintarev and Judith Masthoff. 2015. Explaining recommendations: Design and evaluation. In Recommender Systems Handbook. Springer, 353–382.
[15] Jesse Vig, Shilad Sen, and John Riedl. 2009. Tagsplanations: Explaining recommendations using tags. In Proceedings of the 14th International Conference on Intelligent User Interfaces. ACM, 47–56.
[16] Beidou Wang, Martin Ester, Jiajun Bu, and Deng Cai. 2014. Who Also Likes It? Generating the Most Persuasive Social Explanations in Recommender Systems. In AAAI. 173–179.
[17] A. T. Wibowo, A. Siddharthan, H. Anderson, A. Robinson, Nirwan Sharma, H. Bostock, A. Salisbury, R. Comont, and R. V. D. Wal. 2017. Bumblebee Friendly Planting Recommendations with Citizen Science Data. In Proceedings of the RecSys 2017 Workshop on Recommender Systems for Citizens, co-located with the 11th ACM Conference on Recommender Systems (RecSys 2017), Como, Italy, August 31, 2017.
[18] A. T. Wibowo, A. Siddharthan, C. Lin, and J. Masthoff. 2017. Matrix Factorization for Package Recommendations. In Proceedings of the RecSys 2017 Workshop on Recommendation in Complex Scenarios, co-located with the 11th ACM Conference on Recommender Systems (RecSys 2017), Como, Italy, August 31, 2017. 23–28. http://ceur-ws.org/Vol-1892/paper5.pdf
[19] Agung Toto Wibowo, Advaith Siddharthan, Judith Masthoff, and Chenghua Lin. 2018. Incorporating Constraints into Matrix Factorization for Clothes Package Recommendation. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization. ACM.