Causality based explanations in multi-stakeholder recommendations

Willème Verdeaux, Clément Moreau, Nicolas Labroche, Patrick Marcel
University of Tours, France
firstname.lastname@etu.univ-tours.fr, firstname.lastname@univ-tours.fr

ABSTRACT
This paper introduces two novel contributions in the context of multi-stakeholder recommender systems. First, we present a simple and intuitive method for multi-objective and multi-stakeholder recommendation that relies on the preferences of each stakeholder. The second contribution is the adaptation of Halpern and Pearl's definition of causality to introduce definitions of causes and explanations for multi-stakeholder recommender systems. Experiments conducted on real data study the tractability of such an explanation approach in the context of recommender systems.

© 2020 Copyright for this paper by its author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen, Denmark) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Recommender systems solve the problem of predicting the value of a user-item relationship to propose the most appropriate list of items to each user, based on personalization and contextualization mechanisms [3, 19]. Traditional approaches like content-based systems, k-NN systems or latent factor models can generate recommendations for which simple, limited explanations can be derived (e.g., "other customers also bought that item"). However, modern multi-objective, multi-stakeholder recommender systems [18, 21] make the generation of explanations more challenging.

In this paper, we introduce two novel contributions in this regard. First, we introduce a simple and intuitive method for multi-objective and multi-stakeholder recommendation that relies on the preferences of each stakeholder. Following previous works on multi-stakeholder recommenders [1, 9, 15], we express the multi-stakeholder recommendation problem with an additive gain-based formulation. We compute a profile for each stakeholder, and define the gain brought by the recommendation using the classical NDCG measure, to confront the profile with the ranks of the recommended items.

The second contribution is the adaptation of Halpern and Pearl's definition of causality [12], and more precisely its adaptation in the context of databases as developed in [14], to introduce definitions of causes and explanations for multi-stakeholder recommender systems. This framework allows us to explain both the recommendations given and each individual stakeholder profile, in terms of the items rated by this stakeholder.

The outline of the paper is as follows. The next section presents related work and Section 3 motivates our approach with an example. We present our first contribution, the multi-stakeholder recommender, in Section 4. Section 5 introduces our second contribution, namely explanation generation. Section 6 presents preliminary tests and Section 7 concludes by discussing future work.

2 RELATED WORK
2.1 Multi-stakeholder recommender systems
As stated by [1], "multi-stakeholder recommendation has emerged as a unifying framework for describing and understanding recommendation settings where the end user is not the sole focus". Practically, a multi-stakeholder recommender produces an aggregated recommendation that maximizes the utility of several stakeholders at once, with possibly conflicting objectives. Burke et al. [9] describe the utility space to explore as a tuple where each component is the utility of a stakeholder, computed as the sum of item utilities for this stakeholder. Other works consider more complex models with multiple objectives per stakeholder [15].

This extension of the traditional recommendation problem makes it possible to model different stakeholder types, such as consumers, providers or the system [2, 15], and to take into account new evaluation metrics such as novelty or fairness among stakeholders (to counter the popularity bias, as in [9]). Several approaches have been proposed in recent years [1-3], and [1] introduces a first effort to structure this novel domain by proposing a classification of multi-stakeholder systems depending on: (1) the types of stakeholders involved, (2) whether they are active or passive, and (3) whether the recommendation is neutral or personalized for each stakeholder.

As recalled in [15], and similarly to business rules, multi-stakeholder approaches [9, 18] are most of the time a posteriori methods applied on top of existing traditional recommender systems. For example, [18] proposes an approach that takes as input a user × item matrix provided by a traditional recommender system, and then finds an optimal binary assignment of items to providers to enforce provider-related constraints and enhance the distribution of recommendations across retailers. Similarly to these approaches, in our proposal we also rely on a traditional recommender system to compute a profile for each stakeholder.

Finally, in [15] the adaptation to the multi-objective, multi-stakeholder setting is done as a post-processing step, after a first ranking of items is already obtained for each consumer. The proposed approach aims to re-rank it such that the new ranking remains close to the initial one and optimizes the commission perceived by the "system" stakeholder. Interestingly, the proposed approach computes a ranking of items, instead of predicting scores, to produce the recommendations. To combine all conflicting stakeholder objectives, [15] introduces a new learning-to-(re-)rank optimization approach based on the kernel version of the Kendall tau correlation metric. A similar approach that learns to rank is also used in [21] to determine ratings that would be unknown to the multi-stakeholder recommender.

In our proposal, we consider a more straightforward rank-based approach that involves Normalized Discounted Cumulative Gain (NDCG) to measure to which extent the ranking preferences of a stakeholder are respected. The solution space is explored with a simple simulated annealing. Similarly to [21], we use a recommender system to predict the user-item scores that are missing in the data, to construct a complete ranking over the set of items as our stakeholder profile.

2.2 Explanations
Explaining recommendations. Explainable recommendation refers to personalized recommendation algorithms that "not only provide the user with recommendations, but also make the user aware why such items are recommended" [20]. Gedikli et al. [10] evaluate different explanation types and propose a set of guidelines for designing and selecting suitable explanations for recommender systems.

Various forms of explanations have been explored. For instance, in [7], the authors introduce the problem of meaningful rating interpretation (MRI) in the context of recommendation systems, and as such provide two main heuristics for explaining collaborative ratings.

Figure 1: Overview of a user's profile explanation
Explanations are here provided as a subset of reviewers presented to the user and described by metadata attributes shared among all reviewers in the group. Groups are built under constraints to minimize a description error while maximizing the coverage of ratings. Similarly, meaningful differences are provided by determining the most unbalanced subsets of reviewers in terms of positive/negative ratings, and are meant to highlight controversial items, i.e., items on which groups of users consistently disagree.

Interestingly, it turns out that, to our knowledge, explanations based on the notion of causality have so far not attracted attention in the recommender system community.

Explanation and causality. Causality has been studied and defined algorithmically by Judea Pearl, in a highly influential body of work primarily using the concepts of counterfactuals and degree of responsibility [16]. At its core, causality is defined in terms of intervention: an input is said to be a cause if we can affect the output by changing just the value of that input, while keeping all others unchanged. Importantly, causality can only be established in a controlled physical experiment and cannot be derived solely from data, while an explanation only requires that a change in the input affects the output: the more it affects the output, the better the explanation [16].

Recently, many authors in the database community (e.g., [14, 17]) adapted these notions to explain the results of database queries. In this context, a major challenge in finding explanations is the difficulty of defining and computing interventions in real time for complex datasets and queries. Noticeably, in both [7] and [17], data cube operations are used to compute explanations, hence benefiting from built-in database optimizations. We leave as future work an in-depth study of optimization mechanisms for computing explanations in our context of multi-stakeholder recommendations.

The approach we present in this paper, depicted in Figure 1, is particularly inspired by that of Meliou et al. [14], who borrowed the following important concepts from Pearl and Halpern's theory of causality:
• Partitioning of variables into exogenous and endogenous: exogenous variables define a context determined by external, unconcerned factors, deemed not to be possible causes, while endogenous variables are the ones judged to affect the outcome and are thus potential causes. In their work, Meliou et al. propose to partition a database instance into exogenous and endogenous tuples [14], and to look for explanations among the endogenous tuples only. We follow a similar approach in this paper, partitioning the set of events into exogenous and endogenous events.
• Contingencies: a piece of information (a tuple in [14]) t is a cause for the observed outcome only if there is a hypothetical setting of the other endogenous variables under which the addition/removal of t causes the observed outcome to change. Therefore, in order to check that t is a cause of an outcome based on a given dataset, one has to find a set of data (called a contingency) to remove from (or add to) the dataset, such that t immediately affects the outcome in the new state of the dataset. In theory, in order to compute the contingency one has to iterate over the subsets of the dataset, meaning that checking causality is NP-complete in general [8]. In the context of this paper, the considered outcome is a set of recommended items and the dataset is the set of ratings used by the recommendation algorithm.
• Responsibility measures the degree of causality as a function of the size of the smallest contingency set. In applications involving large datasets, it is critical to rank the candidate causes by their responsibility, because the outcome may have large lineages and large numbers of candidate causes. In theory, in order to compute the responsibility one has to iterate over all contingency sets, meaning that computing responsibility is hard in general [5].

3 MOTIVATING EXAMPLE

Table 1: Toy example of user purchases

      i1    i2     i3     i4    i5
u1    1     2, 2   5, 6
u2    11           3            7
u3          2             5

Figure 1 gives the intuition of our approach. Assume a set of events relating users to items (e.g., purchases), expressing implicit user preferences.
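Before detailing the example, the notions of contingency and responsibility recalled above can be made concrete with a brute-force sketch over toy purchase events. This is our own illustration, not the paper's implementation: the event encoding, the outcome function and the helper names are assumptions, and the subset enumeration is exponential, as noted above.

```python
from itertools import combinations

def min_contingency_size(t, events, outcome):
    """Size of the smallest contingency set Gamma for event t, or None if t
    is not an actual cause: after removing Gamma (which alone must leave the
    outcome unchanged), additionally removing t changes the outcome."""
    all_events = frozenset(events)
    baseline = outcome(all_events)
    others = [x for x in all_events if x != t]
    for size in range(len(others) + 1):          # exponential in |events|
        for gamma in combinations(others, size):
            kept = all_events - frozenset(gamma)
            if outcome(kept) == baseline and outcome(kept - {t}) != baseline:
                return size
    return None

def responsibility(t, events, outcome):
    """1 / (1 + min |Gamma|), and 0 when t is not a cause at all."""
    size = min_contingency_size(t, events, outcome)
    return 0.0 if size is None else 1.0 / (1.0 + size)

# Toy outcome: rank two items by total purchased quantity, ties broken
# alphabetically; an event is a (user, item, quantity) triple.
ITEMS = ("a", "b")

def outcome(events):
    totals = {i: 0 for i in ITEMS}
    for _, item, qty in events:
        totals[item] += qty
    return tuple(sorted(ITEMS, key=lambda i: (-totals[i], i)))

E = frozenset({("u1", "a", 2), ("u1", "b", 3), ("u2", "b", 3)})
# Removing one b-purchase alone leaves the ranking (b, a) unchanged; it only
# becomes counterfactual once the other b-purchase (a contingency of size 1)
# is removed as well, hence a responsibility of 1 / (1 + 1) = 0.5.
print(responsibility(("u1", "b", 3), E, outcome))  # 0.5
```

In this sketch, an event with responsibility 1 is a counterfactual cause (empty contingency), while events that never affect the outcome get responsibility 0.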
From this set, a profile σu1 can be inferred for a given user u1, e.g., by predicting scores for unseen items. To explain this profile to u1, one could look for the events that are the most responsible for it. To do so, we compute interventions on the set of events, by removing some events from it. Each intervention leads to another profile σ′u1 for u1. If the two profiles σu1 and σ′u1 are different, this means that the removed events have, to a certain degree, a responsibility in σu1. As there can be many such events, instead of presenting them as explanations to u1, we summarize them using pattern extraction, and present the extracted patterns. We next illustrate this approach with a detailed example.

Consider Table 1, which details the consumption events of three users. An event corresponds to a quantity purchased, and a user can purchase different quantities of the same item at different moments. For instance, user u1 initially purchased 5 units of item i3 and then another 6 units of i3 subsequently. Assume we have two categories of stakeholders: that of the users ui, and that of a provider p. Each purchased item ij is described by two properties, price and popularity (see Table 2).

Table 2: Item description for the toy example

            i1    i2   i3    i4    i5
price       med   med  med   high  high
popularity  high  med  high  high  high

Stakeholder preferences can be modeled following a strategy particular to each stakeholder category, and computed using the set of events. For instance, users' preferences should be such that the most purchased items are preferred. The provider's preferences could be that the same number of each item is purchased. For instance, according to Table 1, the preferences of user u1 are that i3 is preferred over i2, which in turn is preferred over i1, which in turn is preferred over both i4 and i5. The preferences of stakeholder p are that i4 is preferred (since it sold less) over i2, in turn preferred over i5, in turn preferred over i1 and then over i3.

Assume now that we have a way of constructing a user profile from each user's preferences, so that the profile corresponds to a total order of the items consistent with that user's preferences. For instance, a simple baseline approach, where the prediction is the sum of consumptions by item, would allow us to deduce that the profile σu1 of u1 ranks all items as follows: σu1 = [i3, i2, i1, i5, i4].

Now, assume that we want to issue a recommendation achieving a compromise between all stakeholders. A multi-stakeholder, multi-objective recommendation algorithm can be used to compute this recommendation from the profiles. This recommendation can take the form of a total ranking of the items, proposed to all stakeholders, noted σ∗. In our example, suppose this recommendation is σ∗ = [i4, i3, i1, i2, i5].

As stakeholders' profiles and the recommendation σ∗ have the same form, both can be explained in the same way. We illustrate here how profiles can be explained. Suppose we would like to explain σu1, considered as the outcome of the process that computes the profile. We then intervene on the set of events by removing those of the events responsible for the current profile. For instance, we can see that removing any of the events in entries (u1, i1), (u1, i2) or (u1, i3) would leave σu1 unchanged, while removing the event in entry (u2, i5) would alter σu1. Therefore, the event in entry (u2, i5) can be said to be a direct cause, or counterfactual cause, for σu1. Note that both purchases made by u1 and purchases not made by u1 can be counterfactual causes for u1's profile; not every event is one, though: event (u3, i4), for instance, is not a counterfactual cause.

Note also that events pertaining to preferences can also be counterfactual causes. Consider indeed user u2, whose profile is σu2 = [i1, i5, i3, i2, i4]. It can be seen that events (u2, i1) and (u2, i5) are counterfactual causes for σu2, as is (u3, i2).

Understanding the responsibility of non-counterfactual causes in the profile can be done by trying to remove more than one event, and measuring the responsibility as a function of the number of events to remove (called the contingency set) before the non-counterfactual cause becomes a counterfactual cause. For instance, removing both events in entry (u1, i3) changes σu1, meaning that both events in that entry have a responsibility in the profile, but with a lesser degree than that of (u2, i5) (which does not need another event to be removed to be counterfactual).

Explaining a profile σui means presenting to ui the events most responsible for their profile. Consider for instance user u2. The events most responsible for u2's profile are the counterfactual causes (u2, i1), (u2, i5) and (u3, i2). However, since the number of causes can be large, presenting the events themselves may not be user-friendly. A better way is to summarize them using patterns extracted from the properties of the items concerned by the events. Indeed, consistently with earlier literature on deriving explanations using causality theory [17], we propose to present explanations under the form of predicates P = v, where P is a property and v is a value. In our example, suppose we use frequent itemset mining [11] to extract properties appearing more than two thirds of the time in items i1, i2, i5. Two frequent itemsets can be extracted from these counterfactual events' properties: (i) price=med and (ii) popularity=high.

4 MULTI-STAKEHOLDER RECOMMENDATION
4.1 Stakeholders and their profiles
We formalize the problem as follows. We assume a set O of n objects, where each object is described by a list of properties. Let F be a set of property names and V be a set of property values; the function properties : O × F → V denotes the value of a given property for a given object. We assume a set S of stakeholders and a set R of stakeholder roles, like "user", "provider", "system owner", etc. The role of a stakeholder is given by a surjective function role : S → R. Finally, we have a set E of events, where an event is a 4-tuple ⟨s, o, r, t⟩ with s a stakeholder in S, o an object in O, r ∈ ℝ a numerical value representing e.g. a rating or an amount, and t a timestamp.

Preferences. Each stakeholder s is associated with their preferences, expressed as a weak order (a partial order that is negatively transitive) over the set O, noted ⪯s. This preference relation induces a partition of the set O.

Example 4.1. Consider the example of the previous section. The preferences of u1 can be deduced from Table 1. They correspond to the following weak order over the set of consumption events: i3 ⪯u1 i2 ⪯u1 i1 ⪯u1 {i4, i5}.

Profile. The profile of a stakeholder s ∈ S consists of a total order σs over the objects of O that is consistent with the preference relation ⪯s, i.e., σs is obtained by composing ⪯s with a total order ≺ using priority composition [6], where the total order ≺ is expressed in a quantitative way through a function utility : S × O → ℝ. These preferences can be noted by a vector ⟨utility(s, o1), ..., utility(s, on)⟩, where the oi's are the objects of O and the utilities are pairwise different, or alternatively by a permutation σs of the objects in O. We note P the set of permutations of size n. Given a permutation σs, we note σs(o) = i the rank i given by s to object o, and σs[i] = o the object ranked i by s.

Example 4.2. Assume that it is predicted, based on u1's preferences, that u1 is more likely to prefer i5 to i4.
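The baseline profile of the running example (sum of consumptions per item) can be sketched as follows. This is our own illustration: breaking ties by overall consumption is a hypothetical stand-in for the predicted preference of i5 over i4 in Example 4.2, not necessarily the predictor used in the paper.

```python
# Events from Table 1 as (stakeholder, item, quantity) triples
# (timestamps omitted for brevity).
EVENTS = [
    ("u1", "i1", 1), ("u1", "i2", 2), ("u1", "i2", 2),
    ("u1", "i3", 5), ("u1", "i3", 6),
    ("u2", "i1", 11), ("u2", "i3", 3), ("u2", "i5", 7),
    ("u3", "i2", 2), ("u3", "i4", 5),
]
ITEMS = ["i1", "i2", "i3", "i4", "i5"]

def totals(events, user=None):
    """Sum of purchased quantities per item, restricted to one user if given."""
    t = {i: 0 for i in ITEMS}
    for s, o, r in events:
        if user is None or s == user:
            t[o] += r
    return t

def baseline_profile(events, user):
    """Rank items by the user's own consumption; break ties by overall
    consumption (our stand-in for the prediction step of Example 4.2)."""
    own, overall = totals(events, user), totals(events)
    return sorted(ITEMS, key=lambda i: (-own[i], -overall[i]))

print(baseline_profile(EVENTS, "u1"))  # ['i3', 'i2', 'i1', 'i5', 'i4']
```

With this tie-break, the sketch recovers both profiles used in the example: σu1 = [i3, i2, i1, i5, i4] and σu2 = [i1, i5, i3, i2, i4].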
The profile of u1 therefore consists of the total order σu1 = [i3, i2, i1, i5, i4].

4.2 Recommendation objective
Our recommendation objective is to find the optimal permutation σ∗ in the sense of a compromise between a subset S of all the stakeholders of S. Given a subset S ⊆ S, the recommendation σ∗ for S is given by the function R : 2^S → P:

    R(S) = σ∗ = argmax_{σ ∈ P} Σ_{s ∈ S} αs · Q(σ, s)    (1)

where αs is the weight of stakeholder s, such that Σ_{s ∈ S} αs = 1, and Q : P × S → ℝ+ is a quality function measuring how good a permutation σ is for stakeholder s.

In this work, we have chosen the classical NDCG to measure the quality of the solution. In other words, the quality is measured in terms of the gain brought by σ∗ for the stakeholder s with respect to their preference. Equation (1) becomes:

    σ∗ = argmax_{σ ∈ P} Σ_{s ∈ S} αs · DCG(s, σ) / IDCG(s)    (2)

where:

    DCG(s, σ) = Σ_{r=1}^{n} utility(s, σ[r]) / log2(1 + r)    (3)

    IDCG(s) = DCG(s, σs)    (4)

This formulation makes the recommendation problem an instance of the well-studied rank aggregation problem, which aims at aggregating the total order preferences of multiple agents [13]. In particular, our problem is similar to the Kemeny optimal aggregation problem, which determines the consensus ranking minimizing the number of pairwise disagreements (the Kendall tau distance) with the rankings of each agent. This problem is known to be NP-hard even when there are only a few rankings to aggregate. Because of this complexity, stochastic or evolutionary algorithms are often proposed as an efficient way to explore the space of possible rankings and thus solve what is known as the Mallows problem [4]. In this work, we follow a similar idea and use simulated annealing to find the consensus ranking based on all stakeholders' rankings.

5 EXPLAINING PROFILES AND RECOMMENDATIONS
Our framework is inspired by that of Meliou et al. for explaining query answers in relational databases [14].

5.1 Profile explanations
Following Meliou et al. [14], it makes sense to partition the set of all events E into endogenous and exogenous events. However, in this preliminary work, and without loss of generality, we consider all events as endogenous, and in what follows only the set of all events E is used in the definitions. Let E be a set of (endogenous) events and s a stakeholder with preference ⪯s; we note E |= σs if the profile σs is obtained using E and ⪯s.

Counterfactual cause. Given a stakeholder s with preference relation ⪯s and profile σs, and a set of events E, an event e = ⟨s, o, r, t⟩ of E is a counterfactual cause in E for σs if E |= σs and E − {e} |= σ′s with σ′s ≠ σs. In other words, removing event e from E causes the profile of s to change.

Actual cause. Given a stakeholder s with preference relation ⪯s and profile σs, and a set of events E, an event e = ⟨s, o, r, t⟩ of E is an actual cause in E for σs if there exists a set Γ ⊆ E, called a contingency for e, such that e is a counterfactual cause for σs in E − Γ. An event is an actual cause if one can find a contingency under which it becomes a counterfactual cause.

Responsibility. If an event e is a cause, the responsibility of e for σs is ρe = 1 / (1 + min |Γ|), where Γ ranges over all the contingency sets for e. The responsibility is a function of the minimal number of events to remove from E before the event becomes counterfactual.

Causality problem. Compute the set C ⊆ E of actual causes for σs.

Responsibility problem. For each actual cause e ∈ C, compute its responsibility ρe.

Explanation. Given a set C of causes for σs, consider the set OC = {o ∈ O | ∃e = ⟨s, o, r, t⟩ ∈ C}. An explanation for σs is a set of frequent patterns extracted from {⟨properties(o, f1), ..., properties(o, f|F|)⟩ | o ∈ OC, fi ∈ F}.

5.2 Explanations for σ∗
In this case, we look for what causes the differences between σ∗ and σs in the profiles (and hence the preferences) of the stakeholders other than s. Intuitively, we want to explain to stakeholder s why the compromise σ∗ is not more favorable to them. Precisely, an event e is a counterfactual cause for σ∗ if, when we remove e from E, σs does not change but there is some change in σs′ for some stakeholder s′ ≠ s, and σ∗ improves the score Q for s. Formally:
• E |= σ∗
• for all stakeholders (including s) si ∈ S, E |= σsi
• E − {e} |= σs
• E − {e} ̸|= σ∗, but E − {e} |= σ′∗ with σ′∗ ≠ σ∗
• Q_{E−{e}}(σ′∗, s) > Q_E(σ∗, s)

Actual causes, responsibility and explanations are defined accordingly.

6 WORK COUNCIL USE CASE
The recommender system and the explanation framework are implemented in Python 3.6. User profiles are computed from raw data using the Surprise scikit¹. The simulated annealing used to explore the space of permutations is an in-house development.

We experimented on a real use case, with purchase data from Kalidea-Up, a company that provides a service platform to works councils. We considered two stakeholder roles: "workers", who can purchase discounted services from works councils, and "system", representing the Kalidea-Up system. The available purchase data cover the 2014-2018 period, gathering over 5,840 workers, 540 services and 168,965 transactions, where each transaction corresponds to an event as defined in Section 4.

¹http://surpriselib.com/

6.1 Testing the recommender system
These purchases are used to build profiles for the workers, using Surprise's off-the-shelf k-NN prediction algorithm, with baseline predictions to ensure a total ranking of the services. The profile for the system is a ranking computed based on the subsidies collected from the works councils for the services. As such, system preferences may not coincide with those of the users, who may prefer discounts on other types of services.
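The recommendation objective of Equation (2), optimized in our experiments with simulated annealing, can be sketched as follows. This is a minimal illustration under assumptions: the random swap mutation and the geometric cooling with ratio 0.996 follow the setup described in this section, while the utilities, weights and initial temperature are made up.

```python
import math
import random

def quality(sigma, utility):
    """NDCG of permutation sigma for one stakeholder (Eqs. 3-4): utility
    maps each object to this stakeholder's utility; the ideal ranking
    orders objects by decreasing utility."""
    def dcg(order):
        return sum(utility[o] / math.log2(1 + r)
                   for r, o in enumerate(order, start=1))
    ideal = sorted(utility, key=utility.get, reverse=True)
    return dcg(sigma) / dcg(ideal)

def objective(sigma, utilities, weights):
    """Weighted sum of per-stakeholder NDCG (Eqs. 1-2)."""
    return sum(weights[s] * quality(sigma, utilities[s]) for s in utilities)

def simulated_annealing(utilities, weights, iters=500, ratio=0.996, seed=0):
    """Search the space of permutations with random pairwise swaps."""
    rng = random.Random(seed)
    sigma = list(next(iter(utilities.values())))  # some initial permutation
    q = objective(sigma, utilities, weights)
    best, best_q, temp = sigma[:], q, 1.0
    for _ in range(iters):
        cand = sigma[:]
        i, j = rng.sample(range(len(cand)), 2)
        cand[i], cand[j] = cand[j], cand[i]  # mutation: swap two elements
        cq = objective(cand, utilities, weights)
        # Accept improvements always, and degradations with a probability
        # that shrinks as the temperature decays geometrically.
        if cq >= q or rng.random() < math.exp((cq - q) / temp):
            sigma, q = cand, cq
            if q > best_q:
                best, best_q = sigma[:], q
        temp *= ratio
    return best, best_q

# Two stakeholders with opposite tastes and equal weights: the consensus
# ranking trades off their two NDCG scores.
utilities = {"user": {"a": 3, "b": 2, "c": 1},
             "system": {"a": 1, "b": 2, "c": 3}}
sigma_star, score = simulated_annealing(utilities, {"user": 0.5, "system": 0.5})
```

Since each per-stakeholder NDCG is at most 1, the aggregated score is bounded by 1, which is why the experiments below read a global score close to 1 as a good consensus.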
Simulated annealing. The objective function optimized by the simulated annealing is given in Equation (2). In order to crawl the space of possible permutations to determine the best σ∗, each new candidate permutation is generated by a random mutation that swaps 2 elements of the previous permutation at random. A temperature parameter allows retaining a solution that is not optimal after a mutation. In order to force convergence, a geometric progression with common ratio 0.996 is applied to the temperature. The maximum number of iterations for simulated annealing was set to 500 in our experiments.

We consider the following settings in our experiments:
• as in [18], we first observe to which extent the addition of new stakeholders decreases the quality of the recommendation for each stakeholder: we vary the number of stakeholders from 2 (1 recipient and the system, as represented by Kalidea-Up) to 10 (9 recipients and the system);
• for each of the previous settings, we vary the distribution of the weights αs of Equation (2) over the stakeholders s ∈ S. The settings for αs are: a uniform distribution, meaning that all stakeholders have the same weight in the final decision; an 80% weight given to all workers, meaning that the system only accounts for 20%; or, finally, the symmetric situation where 80% of the weight is set on the system.

For these settings, we expect the following results:
• if our approach performs well, it should be able to produce a good-quality consensus aggregation of all stakeholders' preferences. In this case, the overall quality, as measured by our objective function, should be as close to 1 as possible. This would reflect that our choice of a simple simulated annealing algorithm is a fair enough solution in our context of multi-objective optimization;
• we should observe that the decrease for each stakeholder is moderate and comparable to the one observed in other works such as [18], indicating that our approach manages to accommodate all stakeholders simultaneously and with approximately the same effectiveness for each of them.

Table 3: Mean NDCG over 3 runs for the setting with 2 stakeholders (1 user and the system). "Uniform", "80% system" and "80% user" refer to the 3 settings for the weights α.

             NDCG user     NDCG system   NDCG global
Uniform      0.79 ± 0.05   0.85 ± 0.03   0.82 ± 0.01
80% system   0.63 ± 0.07   0.89 ± 0.04   0.76 ± 0.04
80% user     0.95 ± 0.01   0.56 ± 0.04   0.75 ± 0.02

Figure 2: Average mean NDCG scores for all users, for the system, and globally, for 3 distinct settings with 2, 5 and 10 stakeholders.

Results. As can be seen in Table 3, our approach obtains satisfactory optimization results in terms of NDCG, with approximately 0.82 as the global optimization score when the weights α are uniform. As expected, our approach proves efficient when biasing the convergence towards one stakeholder or the other by adjusting the weights accordingly. Interestingly, when favoring one stakeholder, the global NDCG decreases, reflecting that both stakeholders had contradictory profiles.

To provide the reader with a range of acceptable NDCG values, it can be noticed that [18] achieved very high NDCG scores (above 0.95) on a tweaked MovieLens dataset where providers have been set manually for each movie. We reproduced this dataset, and on the aforementioned 2-stakeholder settings, by tuning the system preferences based on fairness, we achieved an NDCG of 0.96, which shows that simulated annealing paired with NDCG is a fair choice for the problem of multi-stakeholder optimization.

6.2 Testing explanation generation
For these tests, we used Surprise's off-the-shelf baseline-only predictor to compute profiles. The advantage of this predictor is to be simple, fast and yet quite accurate. Our experiments aim to answer 2 main questions:
(1) Is the approach presented in Section 5 feasible in the context of multi-stakeholder recommendation algorithms? In this case, there are possibly 2^|E| contingency sets to evaluate to determine the responsibility of each event of E. For this reason, we propose to first limit our experiments to small sets E ranging from 8 to 14 events, where all workers (resp. services) are pairwise different, resulting in a very sparse workers × services matrix. Computation times are reported in Figure 3.
(2) How is the responsibility distributed, and can this measure be used in the context of a recommender system to discriminate between events to explain a recommendation? Are all events equally discriminant regarding this measure? This paper answers this question by computing, for each event, the minimum size of its contingency set min |Γ| as defined in Section 5.1. Then, for all events, and depending on the size of |E|, a distribution of min |Γ| is computed. The results are presented in Figure 4.

Figure 3: Execution and training times for different sizes of the events set |E|. For |E| = 8 to 13, measured execution times were approximately 0.88, 1.84, 3.86, 8.25, 17.68 and 37.65 seconds, and training times approximately 0.23, 0.68, 1.59, 3.40, 7.08 and 14.44 seconds.

Figure 4: Distribution of min |Γ| for events sets |E| = 8 (left) and |E| = 14 (right).

The machine configuration for the tests is a Core i7-6500U CPU @ 2.50GHz, with 8 GB of RAM, running Ubuntu 18.04.1 LTS. As can be seen from Figure 3, the causal explanation approach expectedly exhibits an exponential time complexity behavior, which opens a new research direction: avoiding the exploration of all contingency sets to make this approach tractable. Interestingly, when the size of |E| increases, the training time of the multi-stakeholder recommender system tends to be negligible compared to the explanation of the recommendation. This clearly indicates that causal explanation is even more challenging in terms of optimization than our quite resource-intensive recommender system.

Then, considering Figure 4, it can be seen that in both scenarios a majority of events are (close to) counterfactual (min |Γ| ≈ 0) and directly impact the recommendation process, while a minority of events are noticeably less "responsible", as shown by the two-mode distribution. This was expected, as the event test sets have been built in a way that makes the workers × services matrix very sparse.

7 DISCUSSION
This paper introduces ongoing work on explaining recommendations in a multi-stakeholder context. Our immediate future work includes the optimization of the computation of responsibilities and the evaluation of the approach on the explanation of σ∗. Our short-term goal is mainly a complete validation of the current framework, including a user study to validate the usefulness of explanations to the different stakeholders. Our long-term goal is to define a thorough framework for explanation generation in a multi-stakeholder recommendation context. Such a framework should cover: (i) The clear distinction between endogenous and exogenous causes. In Halpern and Pearl's framework, exogenous variables define a context determined by external, unconcerned factors, deemed not to be possible causes, while endogenous variables are the ones judged to affect the outcome and are thus potential causes. In the present work, we considered all events as endogenous causes. (ii) The modeling of different types of causes. In the present work, we addressed the problem of modeling why-so causes, i.e., explaining why the recommendation is what it is. Other forms of explanations can be addressed, like for instance why-not-so, i.e., why the recommendation is not like this. (iii) The distinction between explanations of profiles and of recommendations. In the present work, all explanations are generated through interventions over the set of events. Recommendation explanations may be more intuitively understood if we allow interventions directly on the stakeholders' profiles (e.g., permuting the ranks of two objects). (iv) The modification of the responsibility measure to include the assessment of the perturbations on the ranking brought by causes.

REFERENCES
[1] H. Abdollahpouri, G. Adomavicius, R. Burke, I. Guy, D. Jannach, T. Kamishima, J. Krasnodebski, and L. Pizzato. Beyond personalization: Research directions in multistakeholder recommendation. CoRR, abs/1905.01986, 2019.
[2] H. Abdollahpouri, R. Burke, and B. Mobasher. Recommender systems as multistakeholder environments. In UMAP, pages 347–348, 2017.
[3] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng., 17(6):734–749, 2005.
[4] J. A. Aledo, J. A. Gámez, and D. Molina. Tackling the rank aggregation problem with evolutionary algorithms. Applied Mathematics and Computation, 222:632–644, 2013.
[5] H. Chockler and J. Y. Halpern. Responsibility and blame: A structural-model approach. J. Artif. Intell. Res., 22:93–115, 2004.
[6] J. Chomicki. Preference formulas in relational queries. ACM Trans. Database Syst., 28(4):427–466, 2003.
[7] M. Das, S. Amer-Yahia, G. Das, and C. Yu. MRI: Meaningful interpretations of collaborative ratings. PVLDB, 4(11):1063–1074, 2011.
[8] T. Eiter and T. Lukasiewicz. Complexity results for structure-based causality. In IJCAI, pages 35–42, 2001.
[9] R. D. Burke et al. Towards multi-stakeholder utility evaluation of recommender systems. In UMAP, 2016.
[10] F. Gedikli, D. Jannach, and M. Ge. How should I explain? A comparison of different explanation types for recommender systems. Int. J. Hum.-Comput. Stud., 72(4):367–382, 2014.
[11] A. Giacometti, D. H. Li, P. Marcel, and A. Soulet. 20 years of pattern mining: a bibliometric survey. SIGKDD Explorations, 15(1):41–50, 2013.
[12] J. Y. Halpern and J. Pearl. Causes and explanations: A structural-model approach: Part 1: Causes. In UAI, pages 194–202, 2001.
[13] S. Lin. Rank aggregation methods. WIREs Computational Statistics, 2(5):555–570, 2010.
[14] A. Meliou, W. Gatterbauer, K. F. Moore, and D. Suciu. The complexity of causality and responsibility for query answers and non-answers. PVLDB, 4(1):34–45, 2010.
[15] P. Nguyen, J. Dines, and J. Krasnodebski. A multi-objective learning to re-rank approach to optimize online marketplaces for multiple stakeholders. CoRR, abs/1708.00651, 2017.
[16] J. Pearl. Causal inference in statistics: An overview. Statist. Surv., 3:96–146, 2009.
[17] S. Roy and D. Suciu. A formal approach to finding explanations for database queries. In SIGMOD, pages 1579–1590, 2014.
[18] Ö. Sürer, R. Burke, and E. C. Malthouse. Multistakeholder recommendation with provider constraints. In RecSys, pages 54–62, 2018.
[19] R. W. White. Interactions with Search Systems. Cambridge University Press, 2016.
[20] Y. Zhang and X. Chen. Explainable recommendation: A survey and new perspectives. Under revision, available on arXiv.org, pages 1–88, 2019.
[21] Y. Zheng and A. Pu. Utility-based multi-stakeholder recommendations by multi-objective optimization. In WI, pages 128–135, 2018.
A challenge is then to investigate how to map such interventions to interventions over the set of objects.
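To make the global optimization score of Section 6 concrete, the following sketch aggregates per-stakeholder NDCG values with weights α as a weighted sum. It is a minimal illustration, not the paper's implementation: the relevance vectors, the two-stakeholder setup, and the names `global_score`, `consumer`, and `provider` are assumptions introduced here.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(ranked_rels):
    """NDCG: DCG of the proposed ranking divided by the DCG of the ideal one."""
    ideal = dcg(sorted(ranked_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal > 0 else 0.0

def global_score(per_stakeholder_rels, alphas):
    """Weighted sum of per-stakeholder NDCG values (alphas sum to 1)."""
    return sum(a * ndcg(r) for a, r in zip(alphas, per_stakeholder_rels))

# One shared ranking, rated from each stakeholder's point of view
# (hypothetical relevance values):
consumer = [3, 2, 3, 0, 1]  # relevance of the ranked items for the consumer
provider = [1, 3, 0, 2, 3]  # relevance of the same items for the provider

uniform = global_score([consumer, provider], [0.5, 0.5])  # uniform weights
biased = global_score([consumer, provider], [0.9, 0.1])   # favor the consumer
```

With conflicting profiles such as these, shifting weight onto one stakeholder raises the score towards that stakeholder's own NDCG, mirroring the behavior reported for Table 3.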
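The BaselineOnly predictor used in Section 6.2 estimates a rating as r̂(u, i) = μ + b_u + b_i, i.e., the global mean plus user and item biases. Here is a minimal sketch of that model, fitting the biases by simple averaging (Surprise itself fits them by ALS or SGD); the toy ratings and the function names are assumptions for illustration.

```python
from collections import defaultdict

def fit_baseline(ratings):
    """ratings: {(user, item): r}. Returns (mu, b_u, b_i) fit by averaging."""
    mu = sum(ratings.values()) / len(ratings)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for (u, _), r in ratings.items():
        by_user[u].append(r - mu)            # user deviation from the mean
    b_u = {u: sum(v) / len(v) for u, v in by_user.items()}
    for (u, i), r in ratings.items():
        by_item[i].append(r - mu - b_u[u])   # residual item deviation
    b_i = {i: sum(v) / len(v) for i, v in by_item.items()}
    return mu, b_u, b_i

def predict(mu, b_u, b_i, u, i):
    """Baseline estimate; unknown users or items fall back to a zero bias."""
    return mu + b_u.get(u, 0.0) + b_i.get(i, 0.0)

# Toy worker x service ratings (hypothetical):
ratings = {("u1", "s1"): 5.0, ("u1", "s2"): 3.0, ("u2", "s1"): 4.0}
mu, b_u, b_i = fit_baseline(ratings)
```

Such a bias-only predictor is cheap to train, which is consistent with the observation that training time becomes negligible next to explanation time.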
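To illustrate the counterfactual machinery behind Figure 4, the sketch below brute-forces a minimal contingency set Γ and the resulting degree of responsibility 1/(1 + min |Γ|) in the style of Chockler and Halpern [5]. The event encoding and the majority-vote outcome function are toy assumptions, and the subset enumeration is precisely the exponential exploration of all contingency sets that the paper identifies as the obstacle to tractability.

```python
from itertools import combinations

def min_contingency(events, event, outcome):
    """Smallest Gamma (excluding `event`) such that removing Gamma alone keeps
    the outcome, while removing Gamma and `event` changes it. Brute force over
    all subsets: exponential in |E|, as observed in Figure 3."""
    others = [e for e in events if e != event]
    baseline = outcome(set(events))
    for size in range(len(others) + 1):           # smallest subsets first
        for gamma in combinations(others, size):
            kept = set(events) - set(gamma)
            if outcome(kept) == baseline and outcome(kept - {event}) != baseline:
                return set(gamma)
    return None  # `event` is not a cause of the outcome

def responsibility(events, event, outcome):
    """Degree of responsibility 1 / (1 + min |Gamma|), 0 for non-causes."""
    gamma = min_contingency(events, event, outcome)
    return 0.0 if gamma is None else 1.0 / (1.0 + len(gamma))

# Toy setting: each event supports an item; the "recommendation" is the item
# supported by most events (ties broken alphabetically, deterministically).
events = [("e1", "A"), ("e2", "A"), ("e3", "B")]

def outcome(kept):
    counts = {}
    for _, item in kept:
        counts[item] = counts.get(item, 0) + 1
    return max(sorted(counts), key=counts.get) if counts else None
```

Here ("e1", "A") is a cause of recommending A with contingency set {("e2", "A")}, hence responsibility 1/2, whereas ("e3", "B") is not a cause at all; events with min |Γ| ≈ 0 are the (close to) counterfactual ones that dominate the distributions in Figure 4.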