=Paper= {{Paper |id=Vol-2682/paper2 |storemode=property |title=Featuristic: An interactive hybrid system for generating explainable recommendations - beyond system accuracy |pdfUrl=https://ceur-ws.org/Vol-2682/paper2.pdf |volume=Vol-2682 |authors=Sidra Naveed,Jürgen Ziegler |dblpUrl=https://dblp.org/rec/conf/recsys/Naveed020 }} ==Featuristic: An interactive hybrid system for generating explainable recommendations - beyond system accuracy== https://ceur-ws.org/Vol-2682/paper2.pdf
    Featuristic: An interactive hybrid system for generating
   explainable recommendations – beyond system accuracy
                                          Sidra Naveed                                       Jürgen Ziegler
                                   University of Duisburg-Essen                       University of Duisburg-Essen
                                       Duisburg, Germany                                   Duisburg, Germany
                                    sidra.naveed@uni-due.de                            juergen.ziegler@uni-due.de


ABSTRACT
Hybrid recommender systems (RS) have been shown to improve system accuracy by combining the benefits of the collaborative filtering (CF) and content-based (CB) approaches. Recently, the increasing complexity of such algorithms has fueled a demand for researchers to focus more on user-oriented aspects such as explainability, user interaction, and control mechanisms. Even in cases where explanations are provided, the systems mostly fall short in explaining the connection between the recommended items and users' preferred features. Additionally, in most cases, rating or re-evaluating items is typically the only option for users to specify or manipulate their preferences. With the purpose of providing advanced explanations, we implemented a prototype system called Featuristic, applying a hybrid approach that uses content features in a CF approach and exploits feature-based similarities. Addressing important user-oriented aspects, we integrated interactive mechanisms into the system to improve both preference elicitation and preference manipulation. In addition, we integrated explanations for the recommendations into these interactive mechanisms. We evaluated our prototype system in two user studies to investigate the impact of the interactive explanations on the user-oriented aspects. The results showed that the Featuristic System with interactive explanations significantly improved users' perception of the system in terms of preference elicitation, explainability, and preference manipulation, compared to systems that provide non-interactive explanations.

Author Keywords
Hybrid Recommender System; Explanations; Interactive Recommending; User Experience

CCS Concepts
•Information systems → Recommender systems;

Copyright (c) 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
IntRS '20 - Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, September 26, 2020, Virtual Event

INTRODUCTION
Recommender systems (RS) based on collaborative filtering (CF) or content-based (CB) approaches have mainly focused on improving the accuracy of predictions, mostly using the ratings provided by users for items. Recently, with the increasing complexity of RS algorithms, user-oriented aspects have gained more attention from the research community. It has been shown that improving these aspects leads to a commensurate level of user satisfaction and user experience with the system [18, 33].

One of the aspects that may contribute to the actual user experience is the degree of control users have over the system and their preference profiles [16, 14, 18]. Yet, from a user's perspective, today's automated RS, such as the ones used by Amazon [19] or Netflix [3], provide limited ways to influence the recommendation generation process. Usually, the only means to actively influence the results is by rating or re-rating single items, which raises the risk of users being stuck in a "filter bubble" [6, 29, 42, 28]. This effect makes it difficult for users to explore new areas of potential interest and to adapt their preferences to situational needs and goals [25].

Another problem can be seen in the general lack of explainability in most current RS, which can negatively impact users' subjective assessment of the system and the overall user experience. For instance, a lack of explanations can make recommendations difficult to understand, which may hinder users in making their decisions [41, 22]. These aspects consequently affect the overall user experience with the system negatively [33]. Moreover, it is often unclear to users how their expressed preferences actually correspond to the system's representation of the user model, i.e., how manipulating the preference model affects the system's output [36, 46]. Hence, adding more interactivity to the system by letting users influence the recommendation process and their preference profiles is considered a possible solution in RS research to improve the system's explainability [16, 14, 18]. In this regard, only presenting users with matching recommendations is not very supportive, and it has been observed that users require additional information and interactive mechanisms to fully benefit from the system [43].

To address the limitations of state-of-the-art CF and CB approaches, a limited number of hybrid approaches exist that focus on user-oriented aspects and user experience beyond algorithmic accuracy [6, 20, 30]. But such approaches are still limited in terms of providing explanations, as the connection between the recommended items and the user's preferences for item features is not clearly explained to the user. Additionally, these systems rarely explore whether a combination of explanations with interaction tools has a positive influence on the user-oriented aspects.

In this paper, we present an interactive hybrid system called Featuristic in the domain of digital cameras that exploits content features in a CF approach. The recommendations and corresponding explanations are generated based on users that are similar to the current user in terms of shared feature-based preferences. The implemented approach is inspired by the approach proposed in [26]. We exploited multiple data sources to provide explainable recommendations rather than relying only on item ratings (CF approach) or item features (CB approach). We further integrated these advanced explanations with interactive mechanisms with the purpose of improving the proposed prototype system with respect to three main user-oriented aspects: 1) the preference elicitation process, 2) the explainability of recommendations, and 3) users' preference manipulation. In this regard, we aim to address the following research question:

RQ: Does the integration of hybrid-style explanations with interaction tools improve preference elicitation, explainability of recommendations, and preference manipulation for users, compared to a conventional filtering system with simple, non-interactive explanations?

To address the research question, we ran a user study in which we evaluated the Featuristic System with advanced interactive explanations against a conventional filtering approach with rather simple, non-interactive explanations. In a subsequent study we validated the results of our first study by isolating the effect of the underlying algorithms and focusing only on the effect of the interactive explanations on the user-oriented aspects. For this purpose, we compared two versions of our prototype system, with and without interactive explanations.

RELATED WORK
Among other user-oriented factors, increasing the transparency of an RS has been shown to improve perceived recommendation quality, decision support, trust, overall satisfaction, and acceptance of the recommendations [47, 33, 41, 22]. Several studies have investigated the aspect of transparency by comparing different explanation styles [4], combining different explanation styles [38], and considering factors like personalization [39, 40], tags [46], rankings [24], and natural language presentations [9]. However, current RS often fall short in explaining to users how the system generates recommendations and why it recommends certain items [35, 41].

In the context of CB approaches, for instance, item attributes can be used to textually explain the relevance of recommended items to the user's personal preferences, though this requires the availability of content data. The most common example of such explanations is Tagsplanation, where recommended movies are explained based on the user's preferred tags, showing how the movies are relevant to these tags [46]. Billsus et al. [5] proposed a news RS where the explanations are presented by means of textual keywords.

In conventional CF approaches, users and items are represented through vectors containing the item ratings. The algorithm tries to predict the missing ratings of the items, which have not yet been rated by the user, based on, for instance, the weighted average of the ratings provided by similar users (user-based CF) or of similar items (item-based CF). Explaining these predictions to users is sometimes very complicated, and they might be difficult for users to understand. Herlocker et al. recognized this problem and compared 21 different explanation interfaces for CF that convey how users with similar tastes rated recommended items [13]. Their study indicated that users preferred rating histograms over other explanation styles. Numerous attempts have been made to increase the transparency of RS through visual explanations such as flowcharts [15], Venn diagrams [30], graph-based representations [45], cluster maps [45], concentric circles [17, 27], paths among columns [6], and map visualizations [23, 10]. Approaches such as PeerChooser [27] and SmallWorlds [11] presented complex interactive visualizations with the aim of explaining the output of CF: similar users are displayed as connected nodes, where the distance between the nodes reflects the similarity between two users.

Hybrid approaches have emerged to benefit from both CF and CB approaches when generating recommendations and their corresponding explanations [7]. Some of these approaches combined ratings with content features [37, 12], and others have additionally taken social data into account [6, 30, 44, 34]. However, these systems rarely focus on making the recommendation process more transparent and explainable. In cases where they attempt to provide explanations, these explanations are mostly presented visually. A prominent example is TalkExplorer [44], which uses cluster maps to let the user explore the connections of conference talks to user bookmarks, user tags, and social data. SetFusion [30] is a hybrid system based on TalkExplorer that uses Venn diagrams instead of cluster maps.

The aspects of user control and interactivity have also been integrated into hybrid systems. A common example of such systems is TasteWeights [6], which exploits social, content, and expert data to provide interactive music recommendations. The system not only visually presents the relation between the user profile, data sources, and recommendations but also allows users to manipulate the recommendation process by changing weights associated with individual items and by expressing their relative trust in each context source. These interactions are dynamically reflected in the recommendation feedback in real time. In the same context, MyMovieMixer [21] is a hybrid approach that allows users to control their recommendation process. The system provides immediate feedback, highlighting the criteria used to generate the recommendations. MoodPlay [1] is another example that combines content- and mood-based data for recommending music. Recommendations and an avatar representing the user profile are displayed in a visualization, enabling the user to understand why certain songs are recommended by means of their position in the latent space, presenting the relation to different moods, and allowing the user to influence the recommendation process by moving the avatar [1]. While these works have attempted to increase transparency, user control, and interactive mechanisms, mostly through advanced visualizations,
they usually fall short of explaining the connections between the user's preference profile in terms of item features and the relevance of the recommended items to this profile. Additionally, users are provided with limited mechanisms to modify their preference profile or manipulate the recommendation process, mostly by rating or re-rating items. The current work aims to focus on the user-oriented aspects by combining advanced hybrid-style explanations with interaction mechanisms.

A HYBRID SYSTEM BASED ON FEATURE-BASED SIMILARITY
The following steps are used to implement the hybrid approach and are briefly discussed here: 1) creating a feature-based profile of the current user; 2) creating other users' feature-based profiles, implicitly predicted from their item ratings; 3) computing user-user similarities based on shared feature preferences; 4) generating recommendations and corresponding explanations from similar users' feature-based preferences.

Creating feature-based profile of the current user
In the first step, a feature-based profile of the current user is required for use in a feature-based CF approach. For this purpose, the user first selects a feature value and then specifies how important this value is for him/her on a five-point Likert scale (from "not important" = 0 to "very important" = 1). For binary features, e.g., WLAN, selecting a feature and assigning it an importance score adds this feature to the user vector. For continuous features such as Pixel Number, the user can select any range value and assign an importance score; how these values are mapped and saved in the user vector is discussed in the section "Similarity between users in terms of continuous features with range-value categories".

Additionally, we used knowledge-based data from a camera website (https://cameradecision.com/) to identify the sets of features that are important for the five most common photography modes, i.e., sports, landscape, filming, street, and portrait photography. The current user can explore any photography mode in terms of the pre-defined set of features associated with that mode. The user can select one of the photography modes, with an option to exclude any feature from the feature set for that mode, or can add the entire set of features directly into his/her preference profile as part of the mode. To increase control over the system and to enable users to adjust their profile at any time, both the feature values and the importance scores can be adjusted.

Predicting feature-weights for users using ordinal regression model
The second step is to compute the feature-based profiles of all other users by implicitly predicting them from item ratings. Several techniques have been proposed in the literature to predict feature weightings from item ratings, including the TF-IDF (Term Frequency-Inverse Document Frequency) method and the entropy-based feature-weighting method proposed in [8, 2]. On one hand, TF-IDF does not provide satisfactory results, as the items have features of mixed data types. On the other hand, the entropy-based feature-weighting method is also limited in computing the relevance between two continuous features with mutual information, due to the loss of information during the discretization required to transform non-nominal into nominal data [48].

To overcome the limitations of the entropy-based feature-weighting method, we applied an ordinal regression model, which can predict an ordinal dependent variable (i.e., item ratings on a five-point Likert scale) given one or more categorical or continuous independent variables (i.e., item features). The model is able to determine which of the features have a statistically significant effect on the item ratings. It also allows determining how much of the variation in item ratings can be explained by the item features, as well as the relative contribution of each feature in explaining this variance. The steps applied for the ordinal regression model are briefly described below.

Selecting specific features for the model
When constructing a regression model, it is important to identify the predictor variables (item features) that contribute significantly to the model. To do so, the correlations of the item features with the item ratings are computed on the overall ratings dataset by applying Spearman's rank-order correlation. The 15 features with the highest significant correlations with the ratings are further considered for the model.

Predicting ratings from features
In the next step, the PLUM procedure in SPSS is used to apply an ordinal regression model (the technical details and steps applied in SPSS for the PLUM procedure can be found at https://statistics.laerd.com/spss-tutorials/ordinal-regression-using-spss-statistics-2.php). The model was applied separately for each user in the dataset, taking into account only values that have a significant correlation with the user's ratings.

Interpreting the output
For each user, we want to determine which features have a statistically significant effect on the item ratings. For this purpose, the parameter-estimates table is used to interpret the results and to identify the features and feature values that have a statistically significant effect in predicting the item ratings, as well as the contribution of each feature value to this prediction.

Computing user-user similarity based on feature-preferences
The feature-based profile explicitly created for the current user and the profiles implicitly computed using ordinal regression for all other users are then used to identify peer users whose taste in item features is similar to that of the current user. As the camera features are of mixed data types, categorical and continuous, separate measures that take the data type into account have been considered for computing the similarity between two users, as explained below.

Similarity between users in terms of categorical features
To compute the similarity between two users in terms of categorical feature values and their corresponding weightings, we
applied the Mean Squared Error (MSE), which provides a quantitative score describing the degree of dissimilarity between two profiles.

Similarity between users in terms of continuous features with range-value categories
For continuous features with range values, traditional similarity measures fail to address the question of whether the partial presence of a range value should be treated as presence or absence of the feature. To address this issue, we computed the similarity between two user vectors in terms of continuous features with range-value categories in a two-step process.

1) Percentage similarity measure: For applying the regression model, the continuous values are categorized into fixed pre-defined bins, where each binned category gets a different weight for the respective user (section "Predicting feature-weights for users using ordinal regression model"). As the active user can select any customized range value that might not exactly correspond to these binned categories, we expressed the customized range selected by the active user as the percentage to which it is expressed in each binned category. If the range value is completely covered by a binned category, it is assigned a value of 1, and 0 if it is not covered at all. For a partially covered range value in a binned category, the percentage similarity is computed using the formula matching the applicable condition:

\[
i_{cu,f} =
\begin{cases}
\frac{v_j - v_i}{v_{max} - v_{min}} \cdot r_{cu,f} & \text{if } v_{min} < v_i < v_j < v_{max} \\[4pt]
\frac{v_j - v_{min}}{v_{max} - v_{min}} \cdot r_{cu,f} & \text{else if } v_i < v_{min} < v_j < v_{max} \\[4pt]
\frac{v_{max} - v_i}{v_{max} - v_{min}} \cdot r_{cu,f} & \text{else if } v_{min} < v_i < v_{max} < v_j
\end{cases}
\tag{1}
\]

Here [v_i, v_j] is the range selected by the current user (cu) and [v_min, v_max] are the minimum and maximum values of the binned category. To compute the importance weighting i_{cu,f} of each binned category for the current user, we multiplied the computed percentage similarity by the current user's feature-specific weight r_{cu,f} for the selected range.

2) Applying MSE on percentage similarity: Once the current user's range values are mapped to the percentage to which they are expressed in each binned category, the dissimilarity between the current user and another user in terms of categories defined by range values is computed by applying the MSE to these computed values.

Generating item recommendations
The final dissimilarity score between the active user and another user in terms of categorical and continuous features is computed by averaging the scores computed in the section "Computing user-user similarity based on feature-preferences". The 10 users with the lowest MSE scores are considered for the recommendation process. From these similar users' profiles, the highest-rated items are taken as the potential list of recommendations. However, to filter out from this list the items that not only match the active user's feature preferences (user-item similarity) but also match the feature requirements of the preferred photography mode (item-mode similarity), we applied post-filtering mechanisms in a three-step process to generate the final list of recommendations.

Gower's similarity measure for categorical features
To compute the similarities between the current user's preferred features and the potential items in terms of categorical features, we applied Gower's similarity measure, which takes the type of the variables into account. Details of the method can be found in [31]. Let the current user be defined by cu = {cu_f | f = 1, 2, ..., F} and the item by item = {item_f | f = 1, 2, ..., F}. The similarity between the two profiles is computed using Gower's similarity measure with the formula:

\[
S(cu, item) = \frac{\sum_{f=1}^{F} s_{(cu,item)f} \cdot \delta_{(cu,item)f}}{\sum_{f=1}^{F} \delta_{(cu,item)f}}
\tag{2}
\]

The coefficient δ_{(cu,item)f} determines whether the comparison can be made for the f-th feature between cu and item; it is equal to 1 if the comparison can be made between the two objects for feature f and 0 otherwise. s_{(cu,item)f} is the similarity coefficient that determines the contribution provided by the f-th feature between cu and item, where the way this coefficient is computed depends on the data type of the feature, i.e., categorical or numeric. For categorical features, i.e., nominal or ordinal, the coefficient gets the value 1 if both objects have observed the same state for feature f and 0 otherwise.

Linear modification of Gower's similarity measure for continuous features
The second step of the post-filtering process for item recommendations is to compute the similarity between the current user (cu) and the item in terms of continuous features, where cu has a range value and the item has one discrete value for feature f. In this case, Gower's coefficient of similarity s_{(cu,item)f} for numeric features fails to address the issue, as it assumes one distinct value for each object [31].

To deal with this limitation, we propose a linear modification of Gower's similarity coefficient s_{(cu,item)f} that computes a similarity score decreasing linearly with a feature value's distance from the user's desired range when it lies outside this range. The idea is to assign a similarity score to the feature of the item depending on how close its value is to the active user's selected range. Let v be the distinct value of feature f in an item, [v_i, v_j] the minimum and maximum of the range selected by the active user, and [v_min, v_max] the minimum and maximum values available in the dataset for feature f. The linear function for Gower's similarity coefficient s_{(cu,item)f} is then computed using the formula matching the applicable condition:

\[
s_{(cu,item)f} =
\begin{cases}
\frac{v - v_{min}}{v_i - v_{min}} & \text{if } v_{min} < v < v_i \\[4pt]
1 & \text{else if } v_i < v < v_j \\[4pt]
\frac{v_{max} - v}{v_{max} - v_j} & \text{else if } v_j < v < v_{max}
\end{cases}
\tag{3}
\]

The final user-item similarity score over all of the current user's selected features is then computed by inserting the values of the respective similarity coefficients s_{(cu,item)f} for categorical and
numeric features (computed in section "Gower’s similarity           System visually explains how users are similar to the current
measure for categorical features" and "Linear modification          user in terms of shared feature preferences (Figure 1d) and
of Gower’s similarity measure for continuous features") and         how recommendations are generated based on similar users’
δ(cu,item) f in equation 3 and the top 10 items are then selected   feature-based profiles.
for recommendation.                                                 Most of the current RS do not provide any insight into the
                                                                    distribution of the feature-values in the feature-space or even
FEATURISTIC: PROTOTYPE AND INTERACTION POSSI-                       the availability of the offered items distributed over the feature-
BILITIES                                                            space. This might be useful for users to detect relevant features
To implement the prototype system based on the method de-           and to inform their own decision by thoroughly narrowing
scribed in section 3, we collected our own explicit item-ratings    down the list of items based on the item-features. In Featuristic
data set. For this purpose, we conducted an online study on         System, this aspect is integrated by showing the distribution
Amazon Mechanical Turk (AMT) 3 users by providing them              of feature-values selected by similar users (Figure 1d). Then,
with 60 digital cameras where each camera was described in          the recommended items are mapped on top of this distribution
terms of a list of 90-95 features extracted from a website with     (Figure 1e). This visually explains how the recommended
editorial product reviews 4 . Each participant was asked to         items are generated from similar users’ feature-based profiles,
evaluate at least 20 cameras in terms of five-star rating based     as most of the recommended items lie within most preferred
on the available features, which resulted a total of 5765 ratings   feature-values by similar users.
on 60 cameras by 150 users. The implemented prototype sys-
tem called Featuristic is shown in Figure 1, which extends the      As in the Featuristic System, users can indicate their prefer-
conventional CF and CB approaches in terms of three main            ences for one of the five photography modes – the approach
aspects as described below:                                         also considers the features-set for the selected mode in com-
                                                                    puting similar users. For each item, the suitability score for
Preference Elicitation                                              each mode is computed and can be explored by clicking on
Conventional CF or CB approaches, mostly elicit users’ pref-        the "suitability for other modes" which opens a bar chart in
erences for items in terms of rating or re-rating single items.     a pop-up window (Figure 2b). Clicking on any bar would
The filtering process of such approaches often assumes that all     expand the window with explanation of how the scores are
features are equally important for users and does not take that     computed in terms of one-to-one comparison of features of
aspect into account. In the Featuristic System, we elicit the       items with the required features of the mode.
new user’s preferences for item-features by explicitly asking
the user to select the preferred feature-values and indicates the   Manipulation of Preferences
importance of the feature-value using the importance slider         In most conventional CF approaches, the only way for users
(Figure 1a). This enables users to specify their preferences        to indicate or modify their preferences is by (re)rating items.
more precisely, especially in high-risk domains, e.g., digital      In case of the filtering systems, users can specify their pref-
cameras, where the features of items play a vital role in users’    erences by selecting the desired value or value-range for a
decision-making processes. The system further assists users         specific attribute of the items. In complex domains e.g., digital
in indicating their preferences more clearly especially, when       cameras where users mostly lack precise knowledge of the do-
users have limited domain knowledge, their preferences are          main, providing explanations can be considered an important
not defined, or they are unaware of the context in which the        factor. On the other hand, providing interactivity and direct
camera can be used. This is done by providing users with an         manipulation within an explanation might offer users a flexible
option to indicate their preferences for one of the five most       and comprehensible way to manipulate their preferences.
common photography modes (Figure 1b). The system pro-
vides users with features-set along with the suggested values,      In this respect, the Featuristic System integrates sliders (for
explaining why these features with certain values are important     continuous features) and toggle buttons (for binary features)
for a particular mode (Figure 2a).                                  with the explanations (Figure 1g), to facilitate the direct ma-
                                                                    nipulation of preferences from the system provided explana-
Explainable Recommendations
                                                                    tions. The interactive explanations are further combined with
                                                                    recommendations – visually showing the location of the rec-
Current CF or CB approaches fail to explain the connection
                                                                    ommended items distributed over the feature-space (Figure
between recommended items and the user’s preferences of
                                                                    1e). The system allows the users to manipulate their prefer-
item-features. This is addressed in the Featuristic System by
                                                                    ences directly from the explanations, by either changing the
showing a table that compares the features of each recom-
                                                                    feature-value or feature-rating – which results in dynamically
mended item with the user’s preferred features (Figure 1c).
                                                                    updating recommendations.
Additionally, it is mostly unclear to users how their expressed
feature preferences actually correspond to the system’s rep-        EMPIRICAL EVALUATION
resentation of their preference models. Even in cases, when         To investigate the impact of the explanation method developed
the explanations are provided, the rationale behind recom-          when integrated with interaction mechanisms, on user oriented
mendations is mostly not explained to users. The Featuristic        aspects, we designed a user study. Accordingly, we formulated
3 https://www.mturk.com/                                            the hypotheses with respect to user-oriented aspects focusing
4 https://www.test.de/                                              on, preference elicitation (H1), explainable recommendations
Figure 1: Screenshot of the Featuristic system. Filtering Area for selecting features (a) and choosing modes (b); One-to-one
comparison of the recommended item with the user’s selected feature-values (c); Graphical explanations showing the comparison
of the current user’s shared feature preferences with similar users (d); Recommended items mapped on top of the similar users’
selection (e); Sliders to modify importance of feature-value (f); Sliders and toggle buttons to modify feature-value (g).




Figure 2: (a) shows the list of features along with an explanation of why these features are required for the photography mode, (b)
shows the suitability scores of all modes for the recommended item based on the available features in the item.

(H2a and H2b), preference modification (H3a and H3b), and           • H3a: More direct manipulation of user preferences
user experience (H4).
                                                                    • H3b: More controllable manipulation of user preferences
Hypotheses:                                                         • H4: An improved user experience
Integrating the feature-based CF style explanations with inter-
action tools when compare to a conventional filtering system,       User Study 1
leads to:                                                           To address our hypotheses, we conducted an online crowd-
                                                                    sourced study via Prolific5 . In this study, the Featuristic Sys-
• H1: More concrete preference elicitation
                                                                    tem that provides advanced interactive explanations is com-
                                                                    pared with the conventional Filtering System that only provides
• H2a: Better explained recommendations
                                                                    simple and non-interactive explanations.
• H2b: More comprehensible recommendations                          5 https://www.prolific.co/
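As a concrete illustration of the recommendation method, the linear modification of Gower's similarity coefficient for continuous features (equation 3) can be sketched in Python as below. The function and variable names are ours, not the paper's, and the handling of boundary values (closed intervals) and of values outside [v_min, v_max] is an assumption, since the text states only strict inequalities.

```python
def range_similarity(v, v_i, v_j, v_min, v_max):
    """Linear modification of Gower's similarity for one continuous feature.

    v              : the item's value for feature f
    [v_i, v_j]     : range selected by the active user
    [v_min, v_max] : range of the feature over the whole dataset
    Returns a score in [0, 1]: 1 inside the selected range, decaying
    linearly to 0 toward the dataset boundaries (equation 3).
    """
    if v_i <= v <= v_j:            # inside the selected range
        return 1.0
    if v_min <= v < v_i:           # below the range: rise linearly from v_min
        return (v - v_min) / (v_i - v_min)
    if v_j < v <= v_max:           # above the range: fall linearly toward v_max;
        # equivalent to (v_j / (v_max - v_j) + 1) + (-v / (v_max - v_j))
        return (v_max - v) / (v_max - v_j)
    return 0.0                     # outside the dataset bounds (assumption)

# Hypothetical example: zoom range 4-8x selected, dataset spans 1-20x
print(range_similarity(6, 4, 8, 1, 20))   # inside the range -> 1.0
print(range_similarity(14, 4, 8, 1, 20))  # halfway between 8 and 20 -> 0.5
```

Both outputs follow directly from the piecewise definition: values inside the selected range score 1, and values between the range and the dataset bounds decay linearly to 0.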
                                 Table 1: Self-Created items used for the constructs during the user study.

 Construct                Self-Created Items


 Preference Elicitation   - The system allows me to indicate my preferences for the camera-features efficiently.
                          - The system allows me to indicate my preferences for the camera-features precisely.
                          - The system allows me to specify how important the specific camera-features are to me.
 Understandability        - The information provided for the recommended cameras is easy to understand.
                          - Overall, I find it difficult to understand the information provided for the recommended cameras.
 Decision Support         - The information provided helps me decide quickly.
                          - Overall, I find it difficult to decide which camera to select.

 Direct Manipulation      - Seeing other users' feature-selection helps me in modifying my preferred features.
                          - I am able to determine suitable feature-values for my selection.
                          - I am confident in modifying my selected feature-values.
                          - I am able to directly compare features present in given recommendations with features that other users have selected.
                          - I am able to directly see the recommended cameras that lie within my feature selection.


Table 2: Mean values and standard deviations for the subjective system assessment of the two conditions. Significant differences
are marked by *. Higher values (highlighted in bold) indicate better results.

                                                   User Study 1 (df=54)                                Follow-up User Study (df=36)


           Construct                 Featuristic    Conventional Filtering                   Featuristic   Non-Interactive Featuristic

                                      M     SD       M            SD               p          M     SD      M             SD               p
   Preference Elicitation (H1)       3.90   0.65    3.67         0.85           .032*        3.80   0.85   4.02          0.64            .006*

      Transparency (H2a)             4.26   0.50    3.82         0.89           .003*        3.89   0.80   3.89          0.85            >.999
 Information Sufficiency (H2a)       3.27   0.87    2.94         0.85           .019*        4.02   0.56   3.10          1.01            <.001*

    Understandability (H2b)          3.43   0.99    3.64         0.91            .069        3.52   0.95   3.48          1.04             .760

    Decision Support (H2b)           3.10   1.03    3.15         1.04            .734        3.18   1.00   3.33          1.00             .260

   Direct Manipulation (H3a)         3.83   0.52    3.67         0.58           .034*        4.05   0.47   3.76          0.60            .016*
      User Control (H3b)             3.93   0.74    3.93         0.70           >.999        4.40   0.36   3.97          0.81            .003*


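The per-construct p-values in Table 2 come from one-way repeated-measures ANOVAs over two within-subject conditions; with exactly two conditions, such an ANOVA is equivalent to a paired t-test, with F = t². A minimal, stdlib-only sketch of that paired comparison, using invented per-participant scores rather than the study's raw data:

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired t statistic for two within-subject conditions.

    With two conditions, a one-way repeated-measures ANOVA on the same
    data yields F = t**2 with (1, n-1) degrees of freedom.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample standard deviation
    return mean_d / (sd_d / math.sqrt(n))   # t with n-1 degrees of freedom

# Invented construct scores for six participants (NOT the study's data)
featuristic = [4.2, 3.8, 4.0, 3.6, 4.4, 3.9]
filtering   = [3.9, 3.6, 3.8, 3.7, 4.0, 3.5]
t = paired_t(featuristic, filtering)
print(round(t, 3), round(t**2, 3))  # t and the equivalent RM-ANOVA F = t**2
```

The design choice of a within-subject comparison is what makes the paired test appropriate here: each participant rated both systems, so the variance of the per-participant differences, not of the raw scores, drives significance.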

Method
The study was conducted in a within-subject design, where participants were presented with two prototype systems in a counter-balanced order:

• Featuristic System: The interface design of the system is depicted in Figure 1. The interaction possibilities are further described in the section "Featuristic: Prototype and Interaction possibilities".

• Conventional Filtering System: The system allowed participants to indicate preferences in terms of features by simply selecting feature-values. The system then generates recommendations and explanations showing only the comparison of recommended items with the participants' selected features and values (Figure 3A).

In each of the two resulting conditions, participants were provided with the same task scenario. In the system, they were first asked to indicate their preferences in terms of features according to the task scenario. The system then generates recommendations and corresponding explanations. Participants were required to explore the system recommendations and each of the presented explanations and functionalities, in order to understand the rationale behind the recommendations and their explanations, and to select the camera(s) that match their preferences according to the task scenario. After interacting with each system, they were asked to evaluate it by answering a series of questions.

Participants and Questionnaire. A total of 55 Prolific users were recruited online (19 females), with ages ranging from 18-54 years (M = 28, SD = 8.7). The study completion time was approximately 15-20 minutes. To address our hypotheses, we mostly used self-created items to evaluate both systems in terms of the three aspects mentioned above; the items are shown in Table 1. For Preference Elicitation, we used the self-created items. The aspect of Explainable Recommendations was measured in terms of two sub-aspects, i.e., Explainability (H2a) and Comprehensibility (H2b). For Explainability, we used the items related to Transparency and Information Sufficiency from [32]. For Comprehensibility, we used our self-created items related to Understandability and Decision Support. Furthermore, the aspect of Preference Modification was measured in terms of self-created items specifically related to the interactive mechanisms allowing participants to directly manipulate their preferences (H3a). Additionally, we used items for User Control (H3b) taken from [32]. All questionnaire items were rated on a 1-5 Likert response scale. To test our fourth hypothesis, we used the short version of the User Experience Questionnaire (UEQ) (7-point bipolar scale ranging from -3 to 3). For qualitative feedback, we provided open-ended questions asking participants about their likes and dislikes for both systems in terms of the information provided on the interfaces.

Results
Hypothesis 1. To test our first hypothesis, we conducted a one-way repeated measures ANOVA (α = 0.05), revealing that Featuristic performed significantly better than the Conventional Filtering System for Preference Elicitation. Therefore, we can accept our H1, indicating that Featuristic leads to more concrete preference elicitation (Table 2).

Hypotheses 2a and 2b. To test H2a, which refers to the aspect of Explainability measured in terms of two sub-aspects, i.e., Transparency and Information Sufficiency, we applied a one-way repeated measures MANOVA (α = 0.05). The results showed significant differences between the two systems in terms of the two aggregated variables (F(2, 54) = 5.59, p < .006, Wilks' λ = 0.826). Univariate test results further revealed that for both Transparency and Information Sufficiency, the Featuristic system performed significantly better than the Filtering system, indicating that Featuristic leads to better explained recommendations. Therefore, we can accept H2a.

However, in terms of Comprehensibility (H2b), which is measured in terms of two sub-aspects, i.e., Understandability and Decision Support, we found no significant differences between the two systems (F(2, 53) = 1.93, p < .15, Wilks' λ = 0.932). Therefore, Hypothesis 2b can not be accepted.

Hypotheses 3a and 3b. With respect to Direct Manipulation of Preferences (H3a), the result of a one-way repeated measures ANOVA showed a statistically significant difference between the two systems, with the Featuristic system performing significantly better than the Filtering system, as can be seen in Table 2. The result shows that the Featuristic system leads to more direct manipulation of user preferences, thus accepting our Hypothesis 3a.

On the other hand, with respect to User Control, we found no significant difference between the two systems (F(1, 54) = 0.00, p < 1.00, Wilks' λ = 1.00); surprisingly, both systems were perceived equally in terms of User Control. Therefore, Hypothesis 3b can not be accepted.

Hypothesis 4. To evaluate the systems with respect to User Experience, we analyzed the different sub-scales of the UEQ and found no significant differences between the two systems. The Featuristic System received the following scores: 0.66 for Pragmatic Quality (Bad), 0.38 for Hedonic Quality (Bad), and 0.53 Overall (Bad). The Filtering System received the scores: 0.99 for Pragmatic Quality (Below Average), 0.15 for Hedonic Quality (Bad), and 0.58 Overall (Below Average). Thus, we can not accept this hypothesis.

Moreover, participants indicated their likes/dislikes for each system. When asked about the Filtering System, the majority of participants liked the system because of its simple and clean design, which made the system and its functionalities easy to understand and use. In comparison to the Featuristic System, some participants disliked that the Filtering System does not allow indicating the importance of feature-values. For some participants, the reason for not liking the Filtering System was that it does not show the graphs of features or does not include reviews from other people. On the other hand, when asked about their likes and dislikes for the Featuristic System, the majority of participants liked the system because it was clear, precise, intuitive, and innovative. Many participants liked the graph comparisons, where one participant indicated that "The graphs feel like I have a more accurate decision", and another stated: "The graphs and the bar diagrams are innovative which is useful for more focused and serious buyers". Others also liked the option of selecting the importance of feature-values. Even though the majority liked various functionalities of the Featuristic System, for some participants the interface was quite complex, with lots of information. One participant wrote that "There is a lot of information for a novice". For some participants, the graphs were also difficult to understand.

Discussion.
The results show that the Featuristic System significantly improved the Preference Elicitation of users compared to the Filtering System (H1). This might be due to the Featuristic System's ability to let users not only select features and their values but also indicate the importance of each individual feature-value. This might have made the preference indication more precise and efficient for users, compared to conventional CF and CB systems, where it is mostly assumed that all features are equally important to users. This is also reflected in participants' qualitative feedback. For example, one participant stated that "I like specifying how important a feature was and not only if I wanted it or not" and another wrote "I like being able to select how important a feature is with the sliding bar".

Additionally, we investigated the second main aspect of the Featuristic System, i.e., Explainable Recommendations, which is further measured in terms of two sub-aspects: Explainability (H2a) and Comprehensibility (H2b). With respect to Explainability (H2a), the Featuristic System was perceived significantly better than the Filtering System. This indicates that the more advanced explanations in the Featuristic System made the recommendations more transparent and explainable for users, which is supported by the participants' qualitative feedback. One participant indicated that "It gave you the information and segregation of data in an easy to read
                     Figure 3: (A) Filtering System, (B) Featuristic System without interactive explanations.

(graphical) format.", while another stated: "You can see at a glance whether or not a specific camera has these features". For others, it was useful to compare their choices with other users, where one participant wrote "I love the fact that I had to compare my choices with recommendations of others". Another participant wrote "I like the fact that the system brings other users' choice for me and also gave me detailed information about my search."

Additionally, we found no significant differences between the two systems in terms of Comprehensibility (H2b) for the aggregated variables, where the Filtering System showed slightly better results. This might be explained by the fact that the two systems were quite different in terms of the functionalities and the level of information provided. On the one hand, the Filtering System provides rather simple and non-interactive explanations; on the other hand, the Featuristic System is more complex in terms of the interactive functionalities and advanced explanations it provides. This may have led to the Filtering System being perceived as more comprehensible by users. This is also depicted in participants' qualitative feedback about the Filtering System, where they found the system much simpler, cleaner, and easier to understand compared to the Featuristic System. For some of the participants, the Featuristic System provided too much information, which was rather complex for them to comprehend.

With respect to Direct Manipulation of Preferences (H3a), the Featuristic System was perceived significantly better than the Filtering System, suggesting that integrating the interactive mechanisms with our explanations allowed users to directly manipulate their preferences through these explanations. Surprisingly, in terms of User Control (H3b), both systems were perceived to be of equal quality. In the Filtering System, the only way provided to users to control the system's output is by selecting features or re-adjusting feature-values. It has been shown that such user control mechanisms are easy to use compared to mechanisms that allow users to indicate relative preferences [14] (e.g., the feature-rating slider in Featuristic). In such cases, it is sometimes not clear to users whether having the slider in the middle position has the same meaning as having the slider at the maximum level. This might have made the interpretation of such control mechanisms complicated for users in the Featuristic System, and hence it was not perceived better by users compared to the Filtering System.

Additionally, for User Experience (H4), we found no significant differences between the two systems. This might indicate that, regardless of the more advanced explanations with interactive mechanisms provided in Featuristic compared to the much simpler explanations of the Filtering System, participants perceived both systems to be of similar quality in terms of user experience. On the other hand, this might also be explained under the assumption that participants differ in their domain knowledge and their ability to perceive and understand the system-provided information and functionalities; for some participants it might be easy to understand the information and its functionalities, and for others too complicated. As one of the participants stated about the Featuristic System: "The information was easy to understand for me, but I can imagine less technical people would find information and graphics confusing."

Follow-up User Study
To verify that integrating the developed explanation method with interaction tools has a positive impact on user-oriented aspects independent of the type of underlying algorithm, we conducted a follow-up user study. In this study, we isolated the underlying algorithm by focusing only on the type of explanations provided. For this, we compared two versions of the Featuristic System that apply the same underlying hybrid approach; the only difference is the interactive versus non-interactive explanations provided by the systems.

Method
The study was conducted via Prolific in a within-subject design and follows the same procedure and design as the first study described in section "Featuristic: Prototype and Interaction possibilities". We again tested the same hypotheses described in section "Hypotheses:", but this time isolating the type of explanation as the independent variable. We created two versions of the Featuristic System, described below:

• Featuristic System: The interface design and its interaction possibilities are described in "User Study 1" and shown in Figure 1.

• Featuristic System without interactive explanations: The prototype is similar to the one shown in Figure 1. The only major difference is that the user is not provided with the functionality to modify or critique their selected feature-values or ratings through the graphical explanations of recommendations (see Figure 3B).

Participants and Questionnaire. A total of 37 Prolific users were recruited online (15 females), with ages ranging from 18-50 years (M = 24.86, SD = 6.9). The study completion time was approximately 15-20 minutes. To address our hypotheses, we used the same questionnaire items as in the first user study.

Results
To compare the two versions of the Featuristic system, we applied a one-way repeated measures MANOVA; the results can be seen in Table 2. With respect to Preference Elicitation (H1), the results showed a significant difference, where the Non-Interactive version of the system was perceived significantly better than the Interactive version. Therefore, we have to reject our H1.

For Explainability of recommendations (H2a), which is measured in terms of Transparency and Information Sufficiency, we found significant differences between the two systems for the aggregated variables (F(2, 35) = 16.30, p < .001, Wilks' λ = 0.518). However, the univariate test results showed a significant difference only in terms of Information Sufficiency, where the Interactive Featuristic performed better. Overall, we can accept our H2a.

Regarding Comprehensibility (H2b), which is measured in terms of Understandability and Decision Support, we found both systems being perceived equally in terms of Comprehensibility and User Experience. However, qualitative feedback showed that most of the participants liked the interactive functionality of the Featuristic System. One participant stated: "In my opinion, this system is more clear and clean than the other one. Although they look almost the same, I feel this one can be a bit more efficient. It is very helpful and intuitive".

CONCLUSION AND OUTLOOK
In this paper, we showcased the possibility of integrating our proposed feature-based CF style explanations with interaction tools, through a prototype system called Featuristic. To study the impact from a user perspective in terms of Preference Elicitation, Explainable Recommendations, Preference Manipulation, and User Experience, we first compared our Featuristic System with a Conventional Filtering System that only provides simple and non-interactive explanations. The results showed that the Featuristic System is perceived significantly better than the Conventional Filtering System with respect to the aspects of Preference Elicitation, Explainability, and Preference Manipulation. However, we found no significant differences between the two systems in terms of User Experience and Comprehensibility, which might be due to the complex structure of the explanations and the system design, as stated by many participants in their qualitative feedback.

We further conducted a follow-up user study to verify that the results from the first study are independent of the underlying algorithms. For this, we compared two versions of the Featuristic System, isolating the underlying algorithm and focusing only on the type of explanations provided, i.e., interactive and non-interactive explanations. The results showed that the Interactive version of Featuristic performed
no significant differences between two systems. Overall, we         significantly better than the non-interactive version in terms
can not accept our H2b.                                             of Explainability, User Control, and Direct Manipulation.
Additionally, in terms of Direct Manipulation and User Con-         To summarize, the current work clearly showed the positive
trol, we again found significant differences between two sys-       impact of integrating advanced explanations with interaction
tems, where the Interactive Featuristic performed significantly     tools to improve the user-oriented aspects, especially in com-
better than the Non-interactive system. Therefore, we can ac-       plex product domains. However, the current work has some
cept H3a and H3b. However, with respect to UEQ, we found            limitation in terms of the complex system design which could
no significant differences between the two systems, which           further be simplified for improving the overall User Experi-
leads to rejecting the H4.                                          ence. Additionally, factors like user’s cognitive effort and user
                                                                    experience with the product domain, might also impact the
Discussion                                                          user perception of the system with respect to user-oriented
The results of the follow-up study showed, that in terms of         aspects, and thus requires further investigation in future work.
Explainability, User Control, and Direct Manipulation, the
Interactive version of Featuristic performed significantly bet-
ter than the Non-interactive version. This clearly shows the
positive impact of integrating interactive mechanisms with
explanations, on these aspects. The results are similar to re-
sults of the first user study for most of the factors, where the
Interactive Featuristic performed better. This verifies, that
our advanced explanations showed positive impact on user-
oriented aspects, independent of the underlying algorithms.
The insignificant differences in terms of Comprehensibility
and User Experience, might be due to the fact that both sys-
tems provided same functionalities and level of explanations.
The only difference is with respect to the interactivity and non-
interactivity of explanations. This might explain the reason for
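The repeated-measures comparison reported in the Results can be sketched numerically. With only two within-subject conditions, a one-way repeated-measures MANOVA on p dependent variables reduces to a one-sample Hotelling's T² test on the per-participant difference scores, from which the reported Wilks' λ and F(p, n − p) follow directly. The sketch below is a minimal NumPy implementation under that equivalence; the function name and the synthetic ratings are hypothetical stand-ins for the actual questionnaire data, not the study's analysis code.

```python
import numpy as np

def hotelling_t2_paired(cond_a, cond_b):
    """One-sample Hotelling's T^2 on difference scores.

    For two within-subject conditions this is equivalent to a
    one-way repeated-measures MANOVA. cond_a, cond_b are
    (n_participants, p_variables) score matrices.
    Returns (T2, F, df, Wilks' lambda).
    """
    d = cond_a - cond_b                      # per-participant differences
    n, p = d.shape
    d_bar = d.mean(axis=0)                   # mean difference vector
    S = np.cov(d, rowvar=False)              # sample covariance (n-1 denominator)
    t2 = n * d_bar @ np.linalg.solve(S, d_bar)
    f_stat = (n - p) / (p * (n - 1)) * t2    # F with df = (p, n - p)
    wilks = 1.0 / (1.0 + t2 / (n - 1))       # Wilks' lambda for this design
    return t2, f_stat, (p, n - p), wilks

# Hypothetical data: 37 participants rating Transparency and
# Information Sufficiency under both system versions.
rng = np.random.default_rng(0)
interactive = rng.normal([4.0, 4.2], 0.8, size=(37, 2))
non_interactive = rng.normal([3.5, 3.4], 0.8, size=(37, 2))
t2, f_stat, df, wilks = hotelling_t2_paired(interactive, non_interactive)
print(f"F{df} = {f_stat:.2f}, Wilks' lambda = {wilks:.3f}")
```

As a sanity check, the statistics reported above are mutually consistent under these identities: with n = 37 and p = 2, both λ = 0.518 and F(2, 35) = 16.30 correspond to T² ≈ 33.5.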
REFERENCES
 [1] Ivana Andjelkovic, Denis Parra, and John O'Donovan. 2016. Moodplay: Interactive mood-based music discovery and recommendation. In Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization. ACM, 275–279.
 [2] Manuel J Barranco and Luis Martínez. 2010. A method for weighting multi-valued features in content-based filtering. In International conference on industrial, engineering and other applications of applied intelligent systems. Springer, 409–418.
 [3] James Bennett, Stan Lanning, and others. 2007. The netflix prize. In Proceedings of KDD cup and workshop, Vol. 2007. New York, 35.
 [4] Mustafa Bilgic and Raymond J Mooney. 2005. Explaining recommendations: Satisfaction vs. promotion. In Beyond Personalization Workshop, IUI, Vol. 5. 153.
 [5] Daniel Billsus and Michael J Pazzani. 1999. A personal news agent that talks, learns and explains. In Proceedings of the third annual conference on Autonomous Agents. Citeseer, 268–275.
 [6] Svetlin Bostandjiev, John O'Donovan, and Tobias Höllerer. 2012. TasteWeights: a visual interactive hybrid recommender system. In Proceedings of the sixth ACM conference on Recommender systems. ACM, 35–42.
 [7] Robin Burke. 2007. Hybrid web recommender systems. In The adaptive web. Springer, 377–408.
 [8] Jorge Castro, Rosa M. Rodriguez, and Manuel J. Barranco. 2014. Weighting of features in content-based filtering with entropy and dependence measures. International journal of computational intelligence systems 7, 1 (2014), 80–89.
 [9] Shuo Chang, F Maxwell Harper, and Loren Gilbert Terveen. 2016. Crowd-based personalized natural language explanations for recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 175–182.
[10] Emden Gansner, Yifan Hu, Stephen Kobourov, and Chris Volinsky. 2009. Putting recommendations on the map: visualizing clusters and relations. In Proceedings of the third ACM conference on Recommender systems. ACM, 345–348.
[11] Brynjar Gretarsson, John O'Donovan, Svetlin Bostandjiev, Christopher Hall, and Tobias Höllerer. 2010. Smallworlds: visualizing social recommendations. In Computer Graphics Forum, Vol. 29. Wiley Online Library, 833–842.
[12] Xiangnan He, Tao Chen, Min-Yen Kan, and Xiao Chen. 2015. Trirank: Review-aware explainable recommendation by modeling aspects. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. 1661–1670.
[13] Jonathan L Herlocker, Joseph A Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM conference on Computer supported cooperative work. ACM, 241–250.
[14] Dietmar Jannach, Sidra Naveed, and Michael Jugovac. 2016. User control in recommender systems: Overview and interaction challenges. In International Conference on Electronic Commerce and Web Technologies. Springer, 21–33.
[15] Yucheng Jin, Karsten Seipp, Erik Duval, and Katrien Verbert. 2016. Go with the flow: effects of transparency and user control on targeted advertising using flow charts. In Proceedings of the International Working Conference on Advanced Visual Interfaces. ACM, 68–75.
[16] Michael Jugovac and Dietmar Jannach. 2017. Interacting with recommenders—overview and research directions. ACM Transactions on Interactive Intelligent Systems (TiiS) 7, 3 (2017), 10.
[17] Antti Kangasrääsiö, Dorota Glowacka, and Samuel Kaski. 2015. Improving controllability and predictability of interactive recommendation interfaces for exploratory search. In Proceedings of the 20th international conference on intelligent user interfaces. ACM, 247–251.
[18] Joseph A Konstan and John Riedl. 2012. Recommender systems: from algorithms to user experience. User modeling and user-adapted interaction 22, 1-2 (2012), 101–123.
[19] Greg Linden, Brent Smith, and Jeremy York. 2003. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet computing 7, 1 (2003), 76–80.
[20] Benedikt Loepp, Tim Donkers, Timm Kleemann, and Jürgen Ziegler. 2019. Interactive recommending with tag-enhanced matrix factorization (TagMF). International Journal of Human-Computer Studies 121 (2019), 21–41.
[21] Benedikt Loepp, Katja Herrmanny, and Jürgen Ziegler. 2015. Blended recommending: Integrating interactive information filtering and algorithmic recommender techniques. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. ACM, 975–984.
[22] Martijn Millecamp, Sidra Naveed, Katrien Verbert, and Jürgen Ziegler. 2019. To Explain or Not to Explain: the Effects of Personal Characteristics When Explaining Feature-based Recommendations in Different Domains. In CEUR workshop proceedings. CEUR.
[23] Afshin Moin. 2014. A unified approach to collaborative data visualization. In Proceedings of the 29th Annual ACM Symposium on Applied Computing. ACM, 280–286.
[24] Khalil Muhammad, Aonghus Lawlor, and Barry Smyth. 2016. On the use of opinionated explanations to rank and justify recommendations. In The Twenty-Ninth International Flairs Conference.
[25] Sayooran Nagulendra and Julita Vassileva. 2014. Understanding and controlling the filter bubble through interactive visualization: a user study. In Proceedings of the 25th ACM conference on Hypertext and social media. 107–115.
[26] Sidra Naveed and Jürgen Ziegler. 2019. Feature-Driven Interactive Recommendations and Explanations with Collaborative Filtering Approach. In ComplexRec@RecSys. 10–15.
[27] John O'Donovan, Barry Smyth, Brynjar Gretarsson, Svetlin Bostandjiev, and Tobias Höllerer. 2008. PeerChooser: visual interactive recommendation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1085–1088.
[28] Eli Pariser. 2011. The filter bubble: What the Internet is hiding from you. Penguin UK.
[29] Denis Parra and Peter Brusilovsky. 2015. User-controllable personalization: A case study with SetFusion. International Journal of Human-Computer Studies 78 (2015), 43–67.
[30] Denis Parra, Peter Brusilovsky, and Christoph Trattner. 2014. See what you want to see: visual user-driven approach for hybrid recommendation. In Proceedings of the 19th international conference on Intelligent User Interfaces. ACM, 235–240.
[31] János Podani. 1999. Extending Gower's general coefficient of similarity to ordinal characters. Taxon 48, 2 (1999), 331–340.
[32] Pearl Pu, Li Chen, and Rong Hu. 2011. A user-centric evaluation framework for recommender systems. In Proceedings of the fifth ACM conference on Recommender systems. ACM, 157–164.
[33] Pearl Pu, Li Chen, and Rong Hu. 2012. Evaluating recommender systems from the user's perspective: survey of the state of the art. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 317–355.
[34] Amit Sharma and Dan Cosley. 2013. Do social explanations work?: studying and modeling the effects of social explanations in recommender systems. In Proceedings of the 22nd international conference on World Wide Web. ACM, 1133–1144.
[35] Rashmi Sinha and Kirsten Swearingen. 2002. The role of transparency in recommender systems. In CHI'02 extended abstracts on Human factors in computing systems. ACM, 830–831.
[36] Kirsten Swearingen and Rashmi Sinha. 2001. Beyond algorithms: An HCI perspective on recommender systems. In ACM SIGIR 2001 Workshop on Recommender Systems, Vol. 13. Citeseer, 1–11.
[37] Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2008. Providing justifications in recommender systems. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 38, 6 (2008), 1262–1272.
[38] Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2009. MoviExplain: a recommender system with explanations. RecSys 9 (2009), 317–320.
[39] Nava Tintarev and Judith Masthoff. 2007. Effective explanations of recommendations: user-centered design. In Proceedings of the 2007 ACM conference on Recommender systems. ACM, 153–156.
[40] Nava Tintarev and Judith Masthoff. 2012. Evaluating the effectiveness of explanations for recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 399–439.
[41] Nava Tintarev and Judith Masthoff. 2015. Explaining recommendations: Design and evaluation. In Recommender systems handbook. Springer, 353–382.
[42] Chun-Hua Tsai and Peter Brusilovsky. 2017. Providing Control and Transparency in a Social Recommender System for Academic Conferences. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 313–317.
[43] Chun-Hua Tsai and Peter Brusilovsky. 2018. Beyond the ranked list: User-driven exploration and diversification of social recommendation. In 23rd International Conference on Intelligent User Interfaces. ACM, 239–250.
[44] Katrien Verbert, Denis Parra, and Peter Brusilovsky. 2014. The effect of different set-based visualizations on user exploration of recommendations. In CEUR Workshop Proceedings, Vol. 1253. University of Pittsburgh, 37–44.
[45] Katrien Verbert, Denis Parra, Peter Brusilovsky, and Erik Duval. 2013. Visualizing recommendations to support exploration, transparency and controllability. In Proceedings of the 2013 international conference on Intelligent user interfaces. ACM, 351–362.
[46] Jesse Vig, Shilad Sen, and John Riedl. 2009. Tagsplanations: explaining recommendations using tags. In Proceedings of the 14th international conference on Intelligent user interfaces. ACM, 47–56.
[47] Bo Xiao and Izak Benbasat. 2007. E-commerce product recommendation agents: use, characteristics, and impact. MIS quarterly 31, 1 (2007), 137–209.
[48] Kai Zeng, Kun She, and Xinzheng Niu. 2014. Feature selection with neighborhood entropy-based cooperative game theory. Computational intelligence and neuroscience 2014 (2014), 11.