<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Juggler: Multi-Stakeholder Ranking with Meta-Learning∗</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tiago Cunha</string-name>
          <aff>Expedia Group, Switzerland</aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioannis Partalas</string-name>
          <aff>Expedia Group, Switzerland</aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Phong Nguyen</string-name>
          <aff>Expedia Group, Switzerland</aff>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Online marketplaces must optimize recommendations with regard to multiple objectives in order to fulfill the expectations of a variety of stakeholders. This problem is typically addressed using Pareto Theory, which explores multiple objectives in a domain and identifies the objective vectors which yield the best performance. However, such an approach is computationally expensive and commonly available only through domain-specific solutions, which is not ideal for online marketplaces and their ever-changing business dynamics. We tackle these limitations by proposing a Meta-Learning framework for the Multi-Stakeholder recommendation problem, which dynamically predicts the ideal settings for how business rules should be mingled into the final recommendations. The framework is designed to be generic enough to be leveraged in any item ranking domain and requires only the definition of a policy, i.e. a set of multi-objective metrics the meta-model should optimize for. The model finds the mapping between the search context and the corresponding best objective vectors. This way, the model is able to predict in real time the best solution for any unforeseen search, and therefore to adapt the recommendations at the search level. We show that under this framework, the range of models one is able to build depends only on how many policies can be defined, thus offering a virtually unlimited way to address multi-objective problems. The experimental results showcase the generalization abilities of this framework and its highly predictive performance. Furthermore, the simulation results confirm the ability to approximate a policy's expectation in most cases and hint at the potential to use this framework in many other item recommendation problems.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        In recent years, a surge in electronic marketplaces has been observed as a consequence of the unprecedented growth of e-Commerce, which brings together sellers and buyers on a common platform. Examples of marketplaces are online travel agencies like Expedia Group, or Amazon in the retail domain. Typically in these scenarios, the platforms provide recommendation systems that help match users and suppliers. While traditional recommendation systems optimize for customer utility, marketplaces need to take into account multiple stakeholders. This problem, also known as the Multi-Stakeholder recommendation problem [
        <xref ref-type="bibr" rid="ref2 ref28 ref44">2, 28, 44</xref>
        ], is commonly addressed as a multi-objective optimization problem, where the collection of stakeholders’ goals is optimized simultaneously. An extensive collection of works is available, with examples ranging from revenue [
        <xref ref-type="bibr" rid="ref27 ref31 ref6">6, 27, 31</xref>
        ] to fairness [
        <xref ref-type="bibr" rid="ref17 ref28 ref4 ref9">4, 9, 17, 28</xref>
        ].
      </p>
      <p>Despite progress in the domain, existing solutions mainly focus on few, domain-dependent objectives, which makes it hard to transfer knowledge to other domains. Our framework addresses both limitations by providing a way to consider any and as many objectives as the practitioner desires. The practitioner only needs to define a meaningful policy to optimize for, meaning a collection of relevant stakeholder objectives. We show that, depending on the policy, the model commonly yields predictions aligned with the policy defined, therefore approaching stakeholders’ expectations. With such a flexible framework, it is possible to define as many and as different policies as the business demands, leaving the practitioner with the sole responsibility of exploring policies, rather than developing new custom-built algorithms.</p>
      <p>
        To accomplish such a generic framework, one needs to revisit how multi-objective problems are tackled. Typical approaches try to find the Pareto front [
        <xref ref-type="bibr" rid="ref18 ref26 ref41 ref45">18, 26, 41, 45</xref>
        ], which refers to the set of non-dominated solutions in a domain, i.e. objective vectors for which maximizing one objective is only possible by being detrimental to the others. Although effective with few objectives, this poses substantial problems when considering many: 1) as the number of objectives increases, almost all solutions become non-dominated and 2) the number of solutions required to approximate the Pareto front increases exponentially with the number of objectives [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Such constraints impose substantial restrictions in real-time inference: 1) computational requirements make it extremely difficult to find tailored solutions for each problem instance, forcing reliance on sub-optimal global solutions, and 2) even if the best solutions are efficiently found, the practitioner is still required to make a manual decision on which solution from the Pareto front to use [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ].
      </p>
      <p>∗Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Presented at the MORS workshop held in conjunction with the 15th ACM Conference on Recommender Systems (RecSys), 2021, in Amsterdam, Netherlands.</p>
      <p>
        To address these issues, we argue that the ideal approach is to forego the expensive computation of the Pareto front and to focus on a set of diverse and meaningful solutions. A predictive model is then charged with deciding which is the most appropriate solution, depending on the context in which a search occurs. This prediction is then used to dynamically adjust the multi-objective solution. To do so, we draw inspiration from the research area of Meta-Learning for algorithm selection [
        <xref ref-type="bibr" rid="ref39 ref8">8, 39</xref>
        ], where one learns about the learning process in order to predict the best algorithm for a new dataset [
        <xref ref-type="bibr" rid="ref24 ref36 ref37 ref38 ref40">24, 36–38, 40</xref>
        ]. Likewise, we leverage the Meta-Learning framework to predict the best objective vector amongst a set of diverse and meaningful policy-dependent solutions for any new search. Thus, in this work we make the following contributions: 1) we propose a general-purpose framework for Multi-Stakeholder recommendations leveraging Meta-Learning and 2) we validate the procedure in the hospitality domain using Brand Expedia data.
      </p>
      <p>This document is organized as follows: Section 2 presents a summary of the relevant literature; next, Section 3 introduces the proposed framework, while Section 4 shows the experimental setup used to validate it. Section 5 presents and discusses our findings, and Section 6 highlights our conclusions and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2 RELATED WORK</title>
    </sec>
    <sec id="sec-3">
      <title>2.1 Multi-Stakeholder Recommendations</title>
      <p>
        We define a stakeholder in the recommendation space as any individual or group that can affect or be affected by the delivery of recommendations to users [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In the context of online marketplaces, they typically refer to three groups:
consumers (customers/users which have a need to be met by the marketplace and which receive the recommendations
to fulfill it), providers (entities which provide goods/services to the customer through the marketplace) and the system
(the platform that matches consumers to providers).
      </p>
      <p>
        In Multi-Stakeholder recommendations, each stakeholder has a different expectation that the platform has to meet, or at least satisfy; this scenario is more realistic than the basic task of optimizing for customer utility common in academic research [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Since this problem is not trivial, much research has been devoted to the topic [
        <xref ref-type="bibr" rid="ref28 ref44">28, 44</xref>
        ]. More recently, there has been a surge of proposed approaches in industrial settings, mainly due to e-Commerce growth: balancing semantic match and job-seeking intent features [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ], balancing hotel relevance and compensation [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], and balancing customer relevance and advertising revenue in music platforms [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ].
      </p>
      </p>
      <p>
        The Multi-Stakeholder problem can be formulated through the multi-objective optimization setup, where all stakeholders’ goals are considered as one single super-set of objectives to optimize for. Formally, the problem is defined as [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]: given m &gt; 1 objective functions f_1 : X → R, . . . , f_m : X → R, which map a decision space X into R, a multi-objective problem is defined by:
min f_1(x), . . . , min f_m(x), x ∈ X    (1)
      </p>
      <p>
        Traditionally, the multi-objective problem is tackled with Pareto Theory methods, which aim to find the Pareto front, i.e. the set of non-dominated vectors for a particular problem. Such a set describes all vectors for which it is only possible to perform better in one dimension if accompanied by a decrease in performance in another. Such a front is ideal to uncover the relationship between objectives and, consequently, the best solution. The quintessential notion in Pareto Theory is dominance, i.e. how to compare two instances to establish that one is better than the other. It is defined as [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]: given two vectors in the objective space y^(1) ∈ R^m and y^(2) ∈ R^m, the point y^(1) Pareto dominates y^(2) if and only if: ∀i ∈ {1, . . . , m} : y_i^(1) ≤ y_i^(2), ∃ j ∈ {1, . . . , m} : y_j^(1) &lt; y_j^(2). More plainly, it states the first vector is not worse than the second vector in each of the objectives and better in at least one [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ].
      </p>
      </p>
      <p>
        The typical approach to solve the multi-objective problem consists of two steps: to simplify the objective function
and to apply an optimization algorithm to find the solution. To simplify the problem, practitioners use scalarization
functions, which are mathematical constructs used to convert multiple objective functions into a single one. This way,
the problem can be solved using standard optimization algorithms [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Then, to find the solution, practitioners usually
employ evolutionary algorithms, as these are capable of finding well-distributed solutions along the Pareto front [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
Such algorithms take advantage of natural evolution concepts of selection, mutation and combination to explore the
objective space until a near-optimal solution is found. For more details, see [
        <xref ref-type="bibr" rid="ref26 ref41 ref45">26, 41, 45</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Meta-learning</title>
      <p>
        The No Free Lunch Theorem [
        <xref ref-type="bibr" rid="ref43">43</xref>
        ] refutes the existence of a so-called super-algorithm: a single best, universal learning
algorithm able to obtain the best possible performance for every instance of a given task [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ]. The justification lies in
the observation that if all possible data distributions are equally likely, any pair of learning algorithms will, on average,
have the same performance. Therefore, for any algorithm, superior performance over one group of task instances is
compensated with inferior performance over another.
      </p>
      <p>
        Researchers have since focused on understanding each algorithm’s behavior/bias in order to ascertain when it will be most successful. Meta-Learning is one of the existing tools to tackle this problem, which focuses on
using Machine Learning to understand Machine Learning algorithms (and their configurations) in order to improve
their results in future applications [
        <xref ref-type="bibr" rid="ref32 ref36">32, 36</xref>
        ]. These are commonly addressed by learning meta-models which find the
relationships between data characteristics and learning performance [
        <xref ref-type="bibr" rid="ref36 ref38">36, 38</xref>
        ]. This algorithm selection problem has first
been formalized by Rice [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ], and it states that given:
• the problem space P, representing the set of problem instances for which predictions will be made;
• the feature space ℱ, containing measurable characteristics for each instance of P;
• the algorithm space A, the set of all available algorithms for solving the problem;
• the performance space Y, which maps algorithms to performance metrics,
the problem can be stated as: for a given problem instance x ∈ P, with features f(x) ∈ ℱ, find the selection mapping S(f(x)) into the algorithm space A, such that the selected algorithm α ∈ A maximizes the performance mapping y(α(x)) ∈ Y [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ].
      </p>
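      <p>Rice’s formulation can be illustrated with a minimal meta-model sketch. The nearest-centroid learner, the meta-features and the algorithm labels below are hypothetical stand-ins, not the method used in this paper.</p>
      <p>
```python
import numpy as np

# Sketch of Rice's algorithm-selection framework: a meta-model maps instance
# features f(x) to the algorithm expected to perform best. All data is toy data.

class NearestCentroidMetaModel:
    """Selects an algorithm by matching meta-features to per-label centroids."""

    def fit(self, meta_features, best_algorithm):
        self.labels = sorted(set(best_algorithm))
        self.centroids = {
            label: np.mean(
                [f for f, a in zip(meta_features, best_algorithm) if a == label],
                axis=0,
            )
            for label in self.labels
        }
        return self

    def predict(self, features):
        # Pick the label whose centroid lies closest in meta-feature space.
        return min(
            self.labels,
            key=lambda label: np.linalg.norm(np.asarray(features) - self.centroids[label]),
        )

# Toy meta-dataset: two meta-features per instance and the observed best algorithm.
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = ["algoA", "algoA", "algoB", "algoB"]
model = NearestCentroidMetaModel().fit(X, y)
print(model.predict([0.85, 0.95]))  # algoB
```
      </p>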
      <p>
        Using Meta-Learning and Recommender Systems simultaneously is not a new topic: there are works that focus on
the selection of the best recommendation algorithm for a new dataset [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref16 ref5">5, 12–14, 16</xref>
        ], selection of the best algorithm for
each user [
        <xref ref-type="bibr" rid="ref10 ref11 ref33">10, 11, 33</xref>
        ] and even how to use Recommender Systems to tackle algorithm selection tasks [
        <xref ref-type="bibr" rid="ref15 ref30">15, 30</xref>
        ]. However, to the best of our knowledge, this is the first documented solution to tackle the Multi-Stakeholder recommendation problem using Meta-Learning.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3 JUGGLER</title>
      <p>We focus on the problem of item ranking in online marketplaces, namely how to rank multiple items given the customer’s query or based on the customer profile. In this setting, a ranking σ is defined by sorting items i ∈ I in decreasing order of their score s(i).</p>
      <p>
        In this work, we define our objective function as a sum of all stakeholders’ objectives, building on previous work [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. The score s for any item i in this context is given as:
      </p>
      <p>s(i) = Σ_{j=1}^{n} a_j(i)    (2)</p>
      <p>where a_j refers to each of n adjustments, i.e. functions which score each item with regard to some specific objective(s). Usually these refer to mathematical formulas enforcing business rules, but they can be more complex: for instance, the predictions of other Machine Learning models. Thus, this formulation is flexible enough to allow any number and nature of adjustments, which is advantageous in ever-changing online platforms.</p>
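      <p>The adjustment-based scoring of Equation 2 can be sketched as follows; the two adjustments and the item attributes are illustrative stand-ins, not the platform’s actual business rules.</p>
      <p>
```python
# Equation 2 sketch: an item's score is the sum of independent adjustment
# functions, each encoding one objective. Adjustment names are made up.

def utility(item):
    return item["relevance"]

def compensation(item):
    return item["margin"]

ADJUSTMENTS = [utility, compensation]

def score(item):
    # s(i) = sum over j of a_j(i); adjustments of any number/nature plug in here.
    return sum(adj(item) for adj in ADJUSTMENTS)

print(score({"relevance": 0.5, "margin": 0.25}))  # 0.75
```
      </p>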
      <p>
        This formulation offers a straightforward way to address the multi-objective optimization problem, namely through
a linear scalarization [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], i.e. by introducing an individual weight per adjustment. This is achieved by modifying
Equation 2 accordingly:
      </p>
      <p>
        s(i) = Σ_{j=1}^{n} w_j × a_j(i)    (3)
where each parameter w_j identifies a specific adjustment’s weight. From such a definition, it is possible to define a ranking of items, simply by sorting the items for a given user in decreasing order. Thus, the ranking σ over a subset of items I_q ⊆ I referring to a search q is the sequence of items given by: σ = (i_1, . . . , i_k), with s(i_1) ≥ · · · ≥ s(i_k), ∀i ∈ I_q. Now, inspired by Equation 1, we can state our objective function to optimize the weighted item scores in a way that all objectives are maximized:
max f_1(σ), . . . , max f_m(σ)    (4)
      </p>
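      <p>The weighted scoring of Equation 3 and the induced ranking can be sketched as follows; adjustments, items and weights are illustrative.</p>
      <p>
```python
# Equations 3-4 sketch: each adjustment receives a weight w_j and a ranking is
# obtained by sorting items by decreasing weighted score. Different objective
# vectors therefore yield different rankings for the same items.

def weighted_score(item, weights, adjustments):
    # s(i) = sum over j of w_j * a_j(i)
    return sum(w * adj(item) for w, adj in zip(weights, adjustments))

def rank(items, weights, adjustments):
    return sorted(items, key=lambda it: weighted_score(it, weights, adjustments), reverse=True)

adjustments = [lambda it: it["relevance"], lambda it: it["margin"]]
items = [
    {"id": "h1", "relevance": 0.9, "margin": 0.1},
    {"id": "h2", "relevance": 0.6, "margin": 0.8},
]
print([it["id"] for it in rank(items, [1.0, 0.0], adjustments)])  # ['h1', 'h2']
print([it["id"] for it in rank(items, [0.2, 1.0], adjustments)])  # ['h2', 'h1']
```
      </p>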
    </sec>
    <sec id="sec-6">
      <title>3.1 Framework overview</title>
      <p>The main hypothesis in this paper is that Multi-Stakeholder recommendations can be addressed through Meta-Learning. We take inspiration from the algorithm selection task and adapt it to our problem by considering problem instances x as individual searches q and replacing the algorithm α by the objective vector w. This parallelism will become evident in Section 3.3, when we explain how we convert objective vectors into a discrete set of meta-labels.</p>
      <p>This way, the Meta-Learning problem addressed is: for a given search q ∈ P, with features f(q) ∈ ℱ, the meta-model aims to find the selection mapping S(f(q)) → W, such that the selected objective vector w ∈ W maximizes the performance mapping y(w(q)) ∈ Y. Figure 1 presents an overview of the training and inference workflows in the proposed Juggler framework, in order to illustrate how it can be leveraged in online marketplaces.</p>
      <p>In training, we leverage historical searches to build the meta-examples, i.e. a set of points in the meta-feature space,
with an associated label. Each point refers to a search and it is defined by the meta-feature extraction process; the
labels are constructed via simulations, by assessing which is the ideal multi-objective vector calibration for each search.
      </p>
      <p>[Figure 1: Overview of the Juggler framework. Training data flow: historical searches → perform simulations and extract meta-features → meta-examples → model fitting → meta-model. Inference data flow: new search → extract meta-features → meta-model inference → apply weights → re-rank items.]</p>
      <sec id="sec-6-9">
        <title>Model inference</title>
        <p>With such meta-examples, the meta-model is fitted using a standard Machine Learning procedure. When placed into
production, it is ready to perform real-time inference.</p>
        <p>The inference step then involves simply submitting a new search to the same meta-feature extraction process used in training, in order to place the new search in the meta-feature space. Then, the meta-model predicts the best multi-objective vector for this particular search, based on patterns found in the simulations employed in training. The prediction is used to re-rank the items, following the predefined scoring system.</p>
        <p>In the remainder of this section, we discuss how we perform simulations (by introducing the concept of a policy and how policies are used to define meta-labels) and how we characterize searches through meta-features.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>3.2 Meta-performance: defining policies</title>
      <p>We now describe the set of metrics that we want to optimize with Meta-Learning. As established in the Multi-Stakeholder literature, there are many objectives one can optimize for at any given time: it then becomes a question of defining which matter most in a specific use case. Common solutions focus on few objectives, highly dependent on the problem at hand. We address this limitation by defining the concept of a policy.</p>
      <p>
        A policy Γ refers to a set of multi-objective metrics. Such metrics are already naturally defined in each multi-objective function f_j in Equation 4, i.e. Γ = {f_1, . . . , f_m}. In our setup there is no constraint on which metrics, nor how many, one chooses to optimize for, as long as a valid set of metrics is provided. Following theoretical considerations on ranking metrics [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ], we consider a metric to be valid if it possesses the distinguishability property, i.e. it can distinguish two different rankings. We require only that its output be a continuous scalar value, with higher values meaning better.
      </p>
      <p>
        Many widely accepted ranking metrics today follow these constraints: for instance, Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision and Mean Reciprocal Rank [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. These, however, are not able to capture all the dynamics of the multi-objective recommendation problem, for instance in terms of fairness or how well aligned the outcome is with a specific business objective. To address this gap, we propose two different metric classes for rankings, to be used in combination or separately in our framework: Ranking Correlations and Ranking Fairness. The goal is to provide a skeleton that enables optimizing for objectives for which there are no known metrics yet.
      </p>
      <p>Ranking Correlations. This metric class offers a way to understand how well a specific ranking performs on a given attribute when compared against the ideal setting. For instance, how correlated a given search’s ranking is with the increasing price-aware sorting of the same items tells us how close we are to a ranking with cheaper properties in the top positions. Similarly, such an operation can be used for any item attribute from which a ranking of items can be derived. Since we operate in a ranking setup, we take advantage of Kendall’s τ ranking correlation to measure correlations to the ideal scenario. Formally, the correlation between a ranking σ and the ideal ranking σ*, given by a ranking function over a particular attribute, is calculated by first assessing how many concordant C and discordant D pairs there are when comparing pairwise items in both rankings. Kendall’s τ is thus given by:</p>
      <p>τ = (C − D) / (n(n − 1)/2)    (5)</p>
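      <p>Equation 5 can be computed directly from sign products over item pairs; a minimal sketch with made-up attribute values follows.</p>
      <p>
```python
import numpy as np

# Kendall's tau sketch: tau = (C - D) / (n(n - 1)/2), where C and D count
# concordant and discordant pairs between two value sequences.

def kendall_tau(x, y):
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    c_minus_d = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # the sign product is +1 for a concordant pair, -1 for a discordant one
            c_minus_d += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return c_minus_d / (n * (n - 1) / 2)

positions = [1, 2, 3, 4]    # displayed rank positions
prices = [10, 30, 20, 40]   # attribute values of the items, in displayed order
print(round(kendall_tau(positions, prices), 3))  # 0.667: close to price-sorted ideal
```
      </p>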
      <p>
        Ranking Fairness. One type of objective that has recently received substantial attention in the literature is fairness and diversity [
        <xref ref-type="bibr" rid="ref17 ref19 ref2 ref9">2, 9, 17, 19</xref>
        ]. Such approaches try to ensure the distribution of predicted items per class is respected regardless of the context in which the prediction occurs. For instance, we could optimize for a ranking to show all existing property types in the destination region, while trying not to hurt the relevance of predictions. To that effect, we build on a Group Fairness metric [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], originally designed to ensure fairness of item popularity, and generalize it to ensure fairness of any item attribute. Consider an attribute which can be divided into c classes, with each group being represented by G_j, j ∈ {1, . . . , c}. Then, one can calculate the Group Fairness metric by considering the frequency of items in each class:
      </p>
      <p>GF(σ) = Σ_{j=1}^{c} √|σ_j|, ∀σ_j ∈ σ    (6)</p>
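      <p>A hedged reconstruction of the computation, assuming Equation 6 sums the square roots of the per-class item counts in the ranking; the class labels are illustrative.</p>
      <p>
```python
import math
from collections import Counter

# Generalized Group Fairness sketch: GF(sigma) = sum over j of sqrt(|sigma_j|),
# where sigma_j are the ranked items of attribute class j. The concave square
# root rewards spreading the ranking across classes.

def group_fairness(item_classes):
    counts = Counter(item_classes)
    return sum(math.sqrt(c) for c in counts.values())

balanced = ["hotel", "apartment", "hotel", "apartment"]
skewed = ["hotel", "hotel", "hotel", "hotel"]
print(group_fairness(balanced))  # about 2.828, i.e. sqrt(2) + sqrt(2)
print(group_fairness(skewed))    # 2.0, i.e. sqrt(4): lower, hence less fair
```
      </p>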
      <p>With such metric classes, one can define a variety of metrics, which in turn means it is trivial to define multiple policies. One key advantage of this formulation is that multiple meta-models can be inferred simply by switching the policy. This is particularly noteworthy since it means any policy can be used, even though it might at times lead to sub-optimal performance: it is up to the practitioner to decide which are the meaningful objectives to optimize for and to confirm through simulations and A/B testing whether the expected outcome is met. Alternatively, such a policy could be defined by product or analytics teams, based on domain knowledge and business dynamics.</p>
    </sec>
    <sec id="sec-8">
      <title>3.3 Meta-labels: choosing weights</title>
      <p>To find the objective vectors w to be used as meta-labels is no trivial task and depends heavily on the policy Γ: it is unlikely that two policies should share exactly the same objective vectors. To that end, we define a procedure which aims to explore a constrained objective space in order to select a few good candidates. These candidates constitute all available options the model has to dynamically weight the adjustments at inference time, thus reducing the problem from finding ideal weights to predicting the best candidate. The procedure to find the meta-labels follows these steps:
(1) To constrain each objective into a range: each adjustment’s weight w_j must be constrained within a lower and an upper boundary, i.e. [min(w_j), max(w_j)]. The boundaries should be defined in such a way that the adjustment is allowed to vary only so much from the original value (i.e. w_j = 1) - otherwise, it may lead to unforeseen negative effects on the overall ranking. Such a range should be tuned through experimentation and/or domain knowledge.
(2) To partition the objective space into sections: instead of exploring all possible combinations in the feasible region, we focus on the most interesting sections: the extreme ends of the bounded objective space (to explore the objective space in search of meaningful changes in ranking) or closer to the default values (to find the best fit for the current default setup). Thus, each section is defined as a polygon, covering 1/3 of each dimension, i.e. (max(w_j) − min(w_j))/3. The outer polygons start from the extreme ends of the objective range, while the balanced section is centered on the origin point. An example of the partition of an objective space for 2 objectives is shown in Figure 2. Notice this procedure generates 5 meta-labels when using 2 objectives, which is a good trade-off for the classification algorithm we aim to employ.</p>
      <p>Fig. 2. Meta-label sections in an objective space for 2 objective functions (each axis divided into low/medium/high thirds). Notice it identifies 5 sections (shaded in blue), which will translate into the same number of meta-labels.</p>
      <p>
        (3) To discretise the space into representative candidates: we sample some examples from each section to serve
as candidates as, once again, there is not necessarily any change in ranking by considering 2 contiguous objective
vectors. To do so, we employ a grid-search approach, similarly to another procedure in the literature [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ].
(4) To create solutions using candidates: a solution S refers to a set of objective vectors, containing exactly one candidate per section, i.e. S = {c_1, . . . , c_5}, with c_k referring to the selected objective vector in each of the 5 available sections. The set of all solutions 𝒮 refers to all combinations of candidates across sections.
(5) To evaluate solutions: all solutions are evaluated on policy Γ over all searches. First, each objective vector in each solution is submitted to Equation 3 in order to score and, afterwards, rank each item within a search. Then, each candidate is evaluated individually on each objective f_j with regard to each search, which enables calculating its average ranking across all objectives.
(6) To identify the best solution per policy: the best solution is simply the one which achieves the lowest average rank across all objectives. In the presence of ties, the candidate closer to the default weights is selected, since there is no gain in deviating from the default behaviour. Algorithm 1 summarizes the evaluation procedure explained in the previous two points.
[Algorithm 1: for each solution and search, score items, rank items, evaluate each candidate, score the solution in the search, aggregate over all searches, and select the best solution.]
      </p>
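      <p>Steps (1)-(3) of the procedure can be sketched for 2 objectives as follows; the weight bounds and the grid size are illustrative assumptions, not the tuned values used in the paper.</p>
      <p>
```python
import itertools

# Meta-label space sketch: bound each weight, carve the 2-D objective space into
# 5 sections (4 outer corners plus a balanced center, each covering 1/3 of each
# dimension), then grid-sample candidate objective vectors inside each section.

def sections(lo, hi):
    span = (hi - lo) / 3.0
    corners = [(lo, lo), (lo, hi - span), (hi - span, lo), (hi - span, hi - span)]
    center = (lo + hi) / 2.0 - span / 2.0
    boxes = [((x, x + span), (y, y + span)) for (x, y) in corners]
    boxes.append(((center, center + span), (center, center + span)))
    return boxes

def grid_candidates(section, steps=2):
    (x0, x1), (y0, y1) = section
    xs = [x0 + k * (x1 - x0) / (steps - 1) for k in range(steps)]
    ys = [y0 + k * (y1 - y0) / (steps - 1) for k in range(steps)]
    return list(itertools.product(xs, ys))

secs = sections(0.5, 1.5)
print(len(secs))                      # 5 sections, i.e. 5 meta-labels
print(len(grid_candidates(secs[0])))  # 4 sampled candidates in that section
```
      </p>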
    </sec>
    <sec id="sec-9">
      <title>3.4 Meta-features: characterizing context</title>
      <p>One must now define the feature space ℱ, which allows describing the context in which each search occurs. The Meta-Learning literature presents several approaches to describe a problem but, as far as we are aware, there are no standard solutions to describe rankings. We propose two meta-feature classes, similarly to what was done for metrics in Section 3.2, which can be used on any ranking: Ranking histograms and Ranking comparisons.</p>
      <p>
        Ranking histograms. Taking advantage of histograms in Meta-Learning is not new [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], and it is appreciated since histograms are capable of summarizing a given feature f using a single summary function, i.e. the ratio of elements per bucket. It requires, however, a significant amount of data in order to extract meaningful patterns; to do so, we focus on a specific ranking threshold t to extract data from, large enough to be meaningful. Thus,
mf_H = Histogram({f(i)}), ∀f ∈ F, ∀i ∈ σ : rank(i) &lt; t    (7)
      </p>
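      <p>Equation 7 can be sketched as follows, assuming the histogram is computed over a property feature of the items in the top-t positions; the feature values and t are illustrative.</p>
      <p>
```python
import numpy as np

# Ranking-histogram meta-features sketch: restrict a ranking to its top-t items,
# histogram one of their features, and use the bucket ratios as meta-features.

def histogram_meta_features(feature_values, t, bins=3, value_range=(0.0, 1.0)):
    top = np.asarray(feature_values[:t], dtype=float)  # items with rank below t
    counts, _ = np.histogram(top, bins=bins, range=value_range)
    return counts / float(len(top))  # ratio of elements per bucket

prices_normalized = [0.1, 0.2, 0.9, 0.4, 0.95, 0.5, 0.3]  # in displayed order
print(histogram_meta_features(prices_normalized, t=4))  # ratios 0.5, 0.25, 0.25
```
      </p>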
      <p>Ranking comparisons. Here, we shift the attention from item features to item scores: we achieve this by comparing the same item set in a search under different scoring mechanisms, i.e. using subsets of the adjustments from Equations 2 and 3. Thus, considering two rankings σ_1 and σ_2 of the same items, constructed with different scoring functions, we employ multiple ranking comparison functions to extract summary statistics of their similarity. Examples of such metrics include correlation metrics such as Kendall’s τ (see Equation 5), Cosine similarity, Pearson correlation, etc. Formally, for a set of ranking comparison functions Ω, we have:</p>
      <p>mf_C = {ω(σ_1, σ_2)}, ∀ω ∈ Ω    (8)</p>
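      <p>Equation 8 can be sketched by applying a small set Ω of comparison functions to the two score vectors; the scores and the particular choice of functions are illustrative.</p>
      <p>
```python
import numpy as np

# Ranking-comparison meta-features sketch: the same items scored by two
# different mechanisms, compared with several similarity functions.

def cosine(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

OMEGA = {"cosine": cosine, "pearson": pearson}

def comparison_meta_features(scores_1, scores_2):
    return {name: fn(scores_1, scores_2) for name, fn in OMEGA.items()}

utility_scores = [0.9, 0.7, 0.4, 0.2]   # e.g. utility-only scoring
blended_scores = [0.8, 0.75, 0.5, 0.1]  # e.g. utility plus other adjustments
features = comparison_meta_features(utility_scores, blended_scores)
print(round(features["pearson"], 3))
```
      </p>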
      <p>The full set of meta-features is then ℱ = mf_H ∪ mf_C, which may translate into a set of hundreds or even thousands of meta-features, depending on how many different parameters are used to instantiate the meta-feature classes. To tackle this issue, the procedure assumes the application of a feature selection process, which shall find the best set of meta-features per policy. We make no assumptions on the feature selection procedure, but focus particularly on correlation-based feature selection.</p>
    </sec>
    <sec id="sec-10">
      <title>4 EXPERIMENTAL STUDY</title>
      <p>
        (ℎ) = utility(ℎ) + compensation(ℎ) + . . . (9)
where utility refers to the relevance each item holds to meet the customer’s needs and compensation refers to an
adjustment responsible to penalize lower margin properties [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. Notice the formula is incomplete, meaning there are
many hidden adjustments in place to account for many other marketplace objectives. We account for all adjustments in
the results presented in this document, which makes the problem both more realistic and more dificult to solve. Notice,
for instance, that otherwise we could not use the same weight on both adjustments, as it would always yield the same
ranking outcome, rending some combinations worthless.
The data used for meta-model fitting refers to a sample of 6 million searches from 2019. The data is split into
training/tuning/validation subsets by randomly sampling searches according to the 70/10/20 rule, respectively.
      </p>
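The 70/10/20 split can be sketched as follows; the seed and function name are illustrative:

```python
import random

def split_searches(search_ids, seed=42):
    """Randomly split search ids into train/tuning/validation (70/10/20)."""
    ids = list(search_ids)
    random.Random(seed).shuffle(ids)       # shuffle reproducibly
    n = len(ids)
    n_train, n_tune = int(0.7 * n), int(0.1 * n)
    return (ids[:n_train],                 # 70% training
            ids[n_train:n_train + n_tune], # 10% tuning
            ids[n_train + n_tune:])        # 20% validation
```

Splitting at the search level (rather than the property level) keeps all properties of one search in the same subset.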
      <p>Each search is a collection of properties, and it has both search-specific attributes and property-specific attributes.
The search is also accompanied by the utility scores per property and has all features required to compute the remaining
adjustments. The complete list of attributes can be organized as follows:
• Search features: id, destination (identifier, country and type), check-in/out derived signals (day of week, week of
year, month), booking window and length of stay. These require no pre-processing, thus becoming meta-features
by default.
• Property features: price, margin, star rating, guest rating, distance to center, production statistics, etc. We use
these attributes as the features f in the MF_hist meta-features, with threshold t = 30. No other thresholds have been used, for
simplicity.
• Property scores: utility, compensation and all their variations are calculated for all properties. Then, the following
Ω functions are used to compare rankings: Kendall’s τ, Pearson’s correlation and Cosine similarity.</p>
      <p>Afterwards, models are simulated using a more recent data sample of 100,000 searches, covering the period from
January 2020 until March 2021. The features are exactly the same as before.</p>
    </sec>
    <sec id="sec-11">
      <title>4.2 Policies</title>
      <p>Following the convention from Equation 3, we study the Multi-Stakeholder problem through the following rule:
score(ℎ) = α × utility(ℎ) + β × compensation(ℎ) + . . . (10)
defining the variables α and β, responsible for weighting the utility and compensation scores, respectively.</p>
      <p>We take advantage of NDCG as a measure of customer utility, given it is a standard choice in Recommender Systems.
Furthermore, we explore Ranking Correlations and Ranking Fairness through Kendall’s τ and Group Fairness metrics,
presented in Equations 5 and 6. These will be used to further explore the rankings produced during simulations, in an
attempt to understand how well we perform on specific business objectives.</p>
      <p>In Ranking Correlations, we focus on specific attributes to create the ideal rankings: lowest price, highest margin,
lowest distance, highest guest rating and highest number of reviews. In order to ensure all objectives are optimized, the
dimensions which aim for lower values - price and distance - are ranked in increasing order. For simplicity, we refer to
these as τ_price, τ_margin, τ_dist, τ_rating and τ_reviews, respectively.</p>
      <p>Group Fairness focuses on specific marketplace components we wish to improve upon by offering a more balanced
exposure of each category, namely: markets within the destination, property types, property recency in the marketplace (in
years), property production statistics and branded versus non-branded properties. These will be denoted as
GF_market, GF_type, GF_recency, GF_production and GF_brand henceforth.</p>
      <p>The objectives can thus be assigned to the following stakeholders:
• Customers: associated with conversion and property quality metrics such as NDCG, τ_price, τ_dist, τ_rating and
τ_reviews.
• Providers: which aim mostly to have a fair exposure rate, regardless of their characteristics, are represented by
all Group Fairness metrics, i.e. GF_market, GF_type, GF_recency, GF_production and GF_brand.
• System: which focuses mainly on conversion and revenue - namely through NDCG and τ_margin - although it
extends its interest also to Group Fairness metrics to maintain marketplace health.</p>
      <p>To make sure we design policies that address multi-objective problems in a meaningful way, it is important
to focus on competing objectives, ideally addressing all stakeholders’ objectives. One way to ascertain such competition
is to ensure low correlation between objectives. The correlation matrix for all objectives proposed is shown in Figure 3.
With the exception of Group Fairness objectives, all other candidates have low correlation amongst themselves.
• Policy II: aims to find the best ranking in terms of customer relevance and marketplace fairness, thus tending to
customer and partner interests, i.e. NDCG, GF_market, GF_type, GF_recency, GF_production, GF_brand. In this case, we
forego the System’s own interests, attempting to strengthen the relationship between Consumers and Producers.
• Policy III: focuses simply on the metrics which are directly optimized by the utility and compensation
adjustments, i.e. NDCG, τ_margin. The goal is to understand whether optimizing a few metrics directly correlated with
the adjustments is an easier task than optimizing for a larger set of objectives with a diverse set of goals.
• Policy IV: focuses mostly on customer relevance, with no direct metric aiming for the compensation adjustment, i.e.
NDCG, τ_price, τ_rating. Here we attempt to observe how the process behaves when we disregard both System
and Producer objectives.
• Policy V: aims to optimize only the partners’ fairness objectives, i.e. GF_market, GF_type, GF_recency, GF_production, GF_brand.
Here, we disregard any objective regarding Customer and System, thus trying to understand how much
fairness can be improved using the proposed approach and what its effects are on all stakeholders.
• Policy VI: tries to find the best overall result for all objectives, thus testing the outcome of using as many objectives
as desired, i.e. NDCG, τ_price, τ_margin, τ_dist, τ_rating, τ_reviews, GF_market, GF_type, GF_recency, GF_production, GF_brand.
• Policy VII: aims to find a good balance between conversion and fairness, while making sure the results are price
sensitive, i.e. NDCG, τ_price and a subset of the Group Fairness metrics. The goal is to understand how the framework
behaves without any direct optimization for System objectives.</p>
    </sec>
    <sec id="sec-12">
      <title>4.3 Meta-labels</title>
      <p>Following the procedure from Section 3.3, we have set the following ranges: α ∈ [0.8, 1.5] and β ∈ [0.3, 1.5]. The
decision has been made purely from domain knowledge and our wish not to deviate considerably from the default
settings. Furthermore, considering there are 2 adjustments, the sections defined follow the example presented in Figure 2.
Table 1 presents the respective weight ranges per section, with increments of 0.3 and 0.4 for α and β, respectively.</p>
      <p>Each section is explored with a grid of 0.1 × 0.1, leading to 12 candidate points per section. The best solution per
policy is presented in Table 2, using the notation α/β.</p>
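The per-section exploration can be sketched as below; the section bounds and the `policy_score` aggregate are placeholders, whereas the paper evaluates the full set of policy metrics at each grid point:

```python
def grid(lo, hi, step=0.1):
    """Enumerate candidate weights on a fixed-step grid, inclusive of both ends."""
    n = int(round((hi - lo) / step)) + 1
    return [round(lo + i * step, 2) for i in range(n)]

def best_weights_in_section(u_range, c_range, policy_score, step=0.1):
    """Exhaustively score every (utility weight, compensation weight)
    candidate inside one section and keep the policy's best point."""
    candidates = [(u, c) for u in grid(*u_range, step) for c in grid(*c_range, step)]
    return max(candidates, key=lambda wc: policy_score(*wc))
```

The winning grid point of each section becomes that section’s meta-label for the policy.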
      <p>
        The results show there are indeed different candidate preferences depending on the policy, which validates the
procedure employed. However, a couple of interesting results must be highlighted: 1) α = 0.8 is always used for the
low/low and low/high sections, and 2) the medium/medium section always assigns α = 1, but with a different β. This shows
a couple of things: the lowest end of the utility range could be even lower, to allow further discrimination amongst
policies (we refrain from doing so in order to ensure a minimum customer relevance score), and the default setting
α = β = 1 is not always the ideal scenario across policies - it seems to work for utility but not for compensation.
The meta-models are fitted using lightGBM [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and tuned using hyperopt [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Notice only this algorithm’s performance
is reported, since we wish to evaluate the framework’s overall behaviour rather than to find the best-tuned model. Thus, we
have chosen a model which has provided good overall performance in offline testing, although we do not exclude the
hypothesis that another model could perform better, depending on the policy employed. Furthermore, each meta-model
is compared only against a majority-voting baseline, which always predicts the most frequent label. This is the only
baseline we use because we could not find another direct baseline for this problem. In the simulations, we compare
against the default settings (α = β = 1).
      </p>
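The majority-voting baseline admits a one-class sketch; the class and method names are ours:

```python
from collections import Counter

class MajorityVotingBaseline:
    """Baseline meta-model: always predicts the most frequent meta-label
    observed during training, regardless of a search's meta-features."""

    def fit(self, meta_labels):
        # Remember the single most common training label.
        self.label_ = Counter(meta_labels).most_common(1)[0][0]
        return self

    def predict(self, n_searches):
        # Constant prediction for every incoming search.
        return [self.label_] * n_searches
```

Any learned meta-model must beat this constant predictor to justify dynamic weight prediction.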
    </sec>
    <sec id="sec-13">
      <title>RESULTS</title>
    </sec>
    <sec id="sec-14">
      <title>5.1 Meta-model predictive performance</title>
      <p>Juggler is evaluated against the baseline on the following performance metrics: precision, recall, F1, False Positive Rate (FPR) and Area Under Curve (AUC).</p>
      <p>The results show Juggler consistently outperforms the baseline in all metrics, showing it is the better option to
correctly predict the meta-labels assigned. We shall inspect the value of such predictions in Section 5.3.</p>
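These metrics can be computed one-vs-rest per meta-label; a dependency-free sketch (AUC is omitted, since it requires predicted probabilities):

```python
def classification_report(y_true, y_pred, label):
    """One-vs-rest precision, recall, F1 and FPR for a single meta-label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    tn = sum(t != label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}
```

Averaging these per-label reports over all meta-labels yields the macro scores reported for each policy.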
    </sec>
    <sec id="sec-15">
      <title>5.2 Meta-feature importance</title>
      <p>Table 4 presents the distribution of meta-features in Juggler for each policy. The procedure counts how frequently
each feature appears in tree nodes, with a higher value meaning it is more important. The results are grouped by
meta-feature class for readability purposes.</p>
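The grouping can be sketched as follows; the feature names, split counts and class-mapping function are invented for illustration:

```python
def importance_by_class(split_counts, class_of):
    """Aggregate per-feature split counts (e.g. from a tree ensemble) into
    a relative frequency per meta-feature class."""
    total = sum(split_counts.values())
    grouped = {}
    for feature, count in split_counts.items():
        cls = class_of(feature)                      # map feature -> class
        grouped[cls] = grouped.get(cls, 0) + count
    return {cls: c / total for cls, c in grouped.items()}
```

A feature that appears in more tree nodes contributes more to its class’s share of the total.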
      <p>The results show the search features are the most frequent features (&gt; 70%) in all Juggler models, an expected
result given the segmentation ability of multiple categorical variables - in fact, the destination identifier accounts for
the majority of such frequency. On the other hand, the ranking comparison meta-features MF_rank are almost negligible. A
justification for this behaviour may come from the fixed number of properties used to create the rankings (i.e. only the
top 30 properties). Future work should address this issue by considering different thresholds. The histogram
meta-features MF_hist account for 18-30% of feature frequency, thus establishing themselves as valuable meta-features.</p>
    </sec>
    <sec id="sec-16">
      <title>5.3 Simulations</title>
      <p>Figure 4 presents the distribution of predicted meta-labels for policies I-VII, represented by their respective sections (low/low, low/high, medium/medium, high/low, high/high) for readability purposes.</p>
      <p>We observe most policies take advantage of all 5 sections, which is an indicator that this is a suitable solution to define
meta-labels. Good examples can be observed in policies I, II, V, VI and VII. Notice also how the medium/medium section
is typically one of the sections with the fewest predictions. This is in itself a justification for using dynamic weight prediction,
the key contribution of our work.</p>
      <p>There are, however, some policies with significant class imbalance, most notably policy III: the vast majority of
predictions fall under the low/high section. Such a problem could potentially be addressed by increasing the
granularity of the grid search when defining sections, although we cannot exclude the hypothesis that the policy is
ill-defined.</p>
    </sec>
    <sec id="sec-17">
      <title>5.4 Multi-Stakeholder impact</title>
      <p>We now inspect the impact of the predictions in terms of the dimensions explored by the policies. Table 5 presents the
percent changes against the default method (α = β = 1) in the top 30 properties with regards to multiple dimensions.</p>
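The percent changes in Table 5 follow the usual definition; a sketch with made-up metric values:

```python
def percent_changes(metrics_model, metrics_default):
    """Percent change of each dimension against the default ranking
    (the alpha = beta = 1 setting)."""
    return {m: 100.0 * (metrics_model[m] - metrics_default[m]) / metrics_default[m]
            for m in metrics_default}
```

A positive value means the policy’s predicted weights improved that dimension relative to the default ranking.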
      <sec id="sec-17-1">
        <title>Per-policy analysis</title>
        <p>Table 5 reports the changes along the dimensions NDCG, price, margin, rating, distance and reviews, together with the Group Fairness dimensions market, production, type, recency and brand.</p>
        <p>
• Policy I - improves in all in-policy metrics, most notably in distance and price. However, the fairness metrics are
negatively affected, particularly in property type and sub-market. Although this policy works as expected, the
impact is too negative on the Providers’ side, meaning it will negatively affect our relationship with them and
potentially lead to withdrawal from the System.
• Policy II - does not work as expected: it shows a decrease in all objective metrics, especially in terms of fairness.</p>
        <p>It shows, however, improvement in price, margin, rating and distance. One possible justification is that the
current set of adjustments has been designed and tuned from the System’s point of view, and leads to sub-optimal
performance when we try to tune it for a different perspective.
• Policy III - improves in margin, but with a negative effect on NDCG. Indirectly, it improves all fairness metrics,
but conversion is severely impaired. This behaviour points to the overall strength of the compensation
adjustment, which seems to overpower at times the utility score - to address this issue, one could retry the
experiment with more constraints on the compensation adjustment weight range.
• Policy IV - improves in all in-policy metrics, plus in margin and reviews. It seems this policy also optimizes the System’s
objectives, which is an interesting finding. However, it introduces the most negative effect on fairness
metrics across policies - this behaviour may, like in Policy I, lead to the risk of Producers leaving the System, which
will contribute to negative long-term value.
• Policy V - fails in all in-policy metrics (and most of the remaining ones), except for margin. However, the resulting
policy is the one which deviates the least from the default ranking - this indicates there may be issues in
optimizing for fairness objectives without any specific fairness adjustment included in the equation. It may also
mean the existing fairness metrics are not ideal to be optimized in this setup, although such an observation requires
more experimental data to be supported.
• Policy VI - improves margin, rating and distance but is worse in all other metrics. The results seem to indicate,
like in Policy III, that the range of weights selected for the compensation adjustment is not ideal, as it allows the
compensation adjustment to overpower the remaining adjustments.
• Policy VII - improves NDCG and price, but degrades substantially in fairness metrics. It also improves rating,
margin, distance and reviews. This policy performs well overall, although it also highlights the issue of
excluding fairness-based adjustments from the equation.</p>
        <p>Overall, the simulation results show decent behaviour against the policy expectations, with only 2 policies failing on
most or all accounts: policies II and V. Such policies have in common the fact that they optimize for all fairness metrics,
which may be an indication that the proposed Group Fairness function in Equation 6 is not the most suitable definition
for this scope. It could also mean that the absence of a fairness adjustment for the meta-model to weight makes
such optimization troublesome. It would be interesting to try different fairness definitions in future work
to understand whether this issue can be addressed in such policies.</p>
        <p>Another interesting observation comes from the fact that not all policies are easily optimizable. For instance, while
policies I and IV optimize all metrics, policies III and VI are able to optimize only a few objectives. This happens
because: 1) the hidden adjustments in Equation 10 have great impact on the overall results (these are most noticeable
when the predicted weights are lower than the default value) and 2) the predicted weights affect all items in a search
equally, which limits how much can be optimized. Both points can be addressed by including the
remaining adjustments under this framework.</p>
        <p>Regarding stakeholders’ impact, it seems policies are better at improving recommendations for Customers and the
System than for Providers: in fact, only in policy III are fairness metrics greatly improved, even though they are not
included in the policy. This means the current recommendation procedure can only increase fairness by increasing
margin. To address this issue, future work could directly weight fairness adjustments through Juggler using global
weights, or weight each property individually across all adjustments - this would allow more granular corrections using
Juggler and potentially a better approximation to all stakeholders’ objectives.</p>
      </sec>
    </sec>
    <sec id="sec-18">
      <title>CONCLUSIONS</title>
      <p>This paper has presented a Meta-Learning approach to address the Multi-Stakeholder recommendation problem. We
take advantage of historical transactions and simulations to understand which are the best weights per search, given a
specific policy. The results show the models have high predictive performance, and the simulations have shown the framework
is able to successfully improve multiple objectives, depending on the chosen policy. There are, however, many avenues for
improvement, namely: replacing manually defined meta-features with embeddings; identifying more and better metrics to
be optimized in policies; using Reinforcement Learning models able to perform exploration of objective vectors per
search (for instance, Multi-Armed Bandits); and improving the policy definition mechanism using constraints and/or
the definition of primary and secondary objectives, in order to clarify the assignment of best labels.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          , Gediminas Adomavicius, Robin Burke, Ido Guy, Dietmar Jannach, Toshihiro Kamishima, Jan Krasnodebski, and
          <string-name>
            <given-names>Luiz</given-names>
            <surname>Pizzato</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Multistakeholder recommendation: Survey and research directions. User Modeling and User-Adapted Interaction 30, 1</article-title>
          (may
          <year>2020</year>
          ),
          <fpage>127</fpage>
          -
          <lpage>158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robin</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Multi-stakeholder Recommendation and its Connection to Multi-sided Fairness</article-title>
          .
          <source>Technical Report</source>
          . arXiv:1907.13158v1
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          , Robin Burke, and
          <string-name>
            <given-names>Bamshad</given-names>
            <surname>Mobasher</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Recommender systems as multistakeholder environments</article-title>
          .
          <source>In UMAP 2017 - Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization</source>
          .
          <volume>347</volume>
          -
          <fpage>348</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steve</given-names>
            <surname>Essinger</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Multiple stakeholders in music recommender systems</article-title>
          .
          <source>arXiv:1708.00120</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Gediminas</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jingjing</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Impact of data characteristics on recommender systems performance</article-title>
          .
          <source>ACM Transactions on Management Information Systems 3</source>
          ,
          <issue>1</issue>
          (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Amos</given-names>
            <surname>Azaria</surname>
          </string-name>
          , Avinatan Hassidim, Sarit Kraus, Adi Eshkol, Ofer Weintraub, and
          <string-name>
            <given-names>Irit</given-names>
            <surname>Netanely</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Movie Recommender System for Profit Maximization</article-title>
          .
          <source>Technical Report.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bergstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yamins</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Cox</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures</article-title>
          .
          <source>In Proceedings of the 30th International Conference on International Conference on Machine Learning</source>
          , Volume
          <volume>28</volume>
          (Atlanta, GA, USA). JMLR.org, I-115-I-123.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Brazdil</surname>
          </string-name>
          , Christophe Giraud-Carrier,
          <string-name>
            <given-names>Carlos</given-names>
            <surname>Soares</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ricardo</given-names>
            <surname>Vilalta</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Metalearning: Applications to Data Mining</article-title>
          (1st ed.). Springer.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Robin</given-names>
            <surname>Burke</surname>
          </string-name>
          , Nasim Sonboli, and
          <string-name>
            <given-names>Aldo</given-names>
            <surname>Ordoñez-Gauger</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Balanced Neighborhoods for Multi-sided Fairness in Recommendation</article-title>
          .
          <source>Technical Report.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Collins</surname>
          </string-name>
          , Joeran Beel, and
          <string-name>
            <given-names>Dominika</given-names>
            <surname>Tkaczyk</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>One-at-a-time: A Meta-Learning Recommender-System for Recommendation</article-title>
          . arXiv:1805.12118
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Collins</surname>
          </string-name>
          , Laura Tierney, and
          <string-name>
            <given-names>Joeran</given-names>
            <surname>Beel</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Per-Instance Algorithm Selection for Recommender Systems via Instance Clustering</article-title>
          .
          <source>Technical Report.</source>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Tiago</given-names>
            <surname>Cunha</surname>
          </string-name>
          , Carlos Soares, and
          <string-name>
            <given-names>André C.P.L.F.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Metalearning for Context-aware Filtering: Selection of Tensor Factorization Algorithms</article-title>
          .
          <source>In Proceedings of the 11th ACM Conference on Recommender Systems. ACM</source>
          ,
          <volume>14</volume>
          -
          <fpage>22</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Soares</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. C.P.L.F.</given-names>
            <surname>de Carvalho</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Metalearning and Recommender Systems: A literature review and empirical study on the algorithm selection problem for Collaborative Filtering</article-title>
          .
          <source>Information Sciences</source>
          <volume>423</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Tiago</given-names>
            <surname>Cunha</surname>
          </string-name>
          , Carlos Soares, and
          <string-name>
            <surname>André C.P.L.F. de Carvalho</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Selecting Collaborative Filtering algorithms using Metalearning</article-title>
          .
          <source>In European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          .
          <volume>393</volume>
          -
          <fpage>409</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Tiago</given-names>
            <surname>Cunha</surname>
          </string-name>
          , Carlos Soares, and
          <string-name>
            <surname>André C P L F de Carvalho</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>CF4CF: Recommending Collaborative Filtering Algorithms Using Collaborative Filtering</article-title>
          .
          <source>In Proceedings of the 12th ACM Conference on Recommender Systems</source>
          .
          <volume>357</volume>
          -
          <fpage>361</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Ekstrand</surname>
          </string-name>
          and
          <string-name>
            <given-names>John</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>When Recommenders Fail: Predicting Recommender Failure for Algorithm Selection and Combination</article-title>
          .
          <source>In Proceedings of the 6th ACM Conference on Recommender Systems. ACM</source>
          ,
          <volume>233</volume>
          -
          <fpage>236</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Michael D.</given-names>
            <surname>Ekstrand</surname>
          </string-name>
          , Robin Burke, and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Diaz</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Fairness and discrimination in recommendation and retrieval</article-title>
          .
          <source>In Proceedings of the 13th ACM Conference on Recommender Systems. ACM, Inc</source>
          ,
          <fpage>576</fpage>
          -
          <lpage>577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Michael T. M.</given-names>
            <surname>Emmerich</surname>
          </string-name>
          and
          <string-name>
            <given-names>André H.</given-names>
            <surname>Deutz</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A tutorial on multiobjective optimization: fundamentals and evolutionary methods</article-title>
          .
          <source>Natural Computing</source>
          <volume>17</volume>
          (
          <year>2018</year>
          ),
          <fpage>585</fpage>
          -
          <lpage>609</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Ruoyuan</given-names>
            <surname>Gao</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chirag</given-names>
            <surname>Shah</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>How fair can we go: Detecting the boundaries of fairness optimization in information retrieval</article-title>
          .
          <source>In ICTIR. ACM</source>
          ,
          <volume>229</volume>
          -
          <fpage>236</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Jonathan L.</given-names>
            <surname>Herlocker</surname>
          </string-name>
          , Joseph A. Konstan, Loren G. Terveen, and John T. Riedl.
          <year>2004</year>
          .
          <article-title>Evaluating collaborative filtering recommender systems</article-title>
          .
          <source>ACM Transactions on Information Systems</source>
          <volume>22</volume>
          ,
          <issue>1</issue>
          (
          <year>2004</year>
          ),
          <fpage>5</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Hisao</given-names>
            <surname>Ishibuchi</surname>
          </string-name>
          , Noritaka Tsukamoto, and
          <string-name>
            <given-names>Yusuke</given-names>
            <surname>Nojima</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Evolutionary Many-Objective Optimization: A Short Review</article-title>
          .
          <source>In Proc. of 2008 IEEE Congress on Evolutionary Computation</source>
          .
          <fpage>2424</fpage>
          -
          <lpage>2431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Dietmar</given-names>
            <surname>Jannach</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gediminas</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Recommendations with a purpose</article-title>
          .
          <source>In Proceedings of the 10th ACM Conference on Recommender Systems</source>
          .
          <fpage>7</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A</given-names>
            <surname>Kalousis</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Algorithm Selection via Meta-Learning</article-title>
          . Ph.D. Dissertation. University of Geneva, Department of Computer Science.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Jorge</given-names>
            <surname>Kanda</surname>
          </string-name>
          , Andre de Carvalho, Eduardo Hruschka, Carlos Soares, and
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Brazdil</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Meta-learning to select the best meta-heuristic for the Traveling Salesman Problem: A comparison of meta-features</article-title>
          .
          <source>Neurocomputing</source>
          <volume>205</volume>
          (
          <year>2016</year>
          ),
          <fpage>393</fpage>
          -
          <lpage>406</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Guolin</given-names>
            <surname>Ke</surname>
          </string-name>
          , Qi Meng, Thomas Finley, Taifeng Wang,
          <string-name>
            <given-names>Wei</given-names>
            <surname>Chen</surname>
          </string-name>
          , Weidong Ma, Qiwei Ye, and
          <string-name>
            <given-names>Tie-Yan</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>LightGBM: A Highly Efficient Gradient Boosting Decision Tree</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          , Vol.
          <volume>30</volume>
          . Curran Associates, Inc.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Bingdong</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jinlong</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ke</given-names>
            <surname>Tang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Xin</given-names>
            <surname>Yao</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Many-objective evolutionary algorithms: A survey</article-title>
          .
          <source>ACM Comput. Surv.</source>
          <volume>48</volume>
          ,
          <issue>13</issue>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Wei</given-names>
            <surname>Lu</surname>
          </string-name>
          , Shanshan Chen,
          <string-name>
            <given-names>Keqian</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Laks V. S.</given-names>
            <surname>Lakshmanan</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Show me the money: Dynamic recommendations for revenue maximization</article-title>
          .
          <source>In Proceedings of the VLDB Endowment</source>
          .
          <fpage>1785</fpage>
          -
          <lpage>1796</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Rishabh</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          and
          <string-name>
            <given-names>Benjamin</given-names>
            <surname>Carterette</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Recommendations in a marketplace</article-title>
          .
          <source>In Proceedings of the 13th ACM Conference on Recommender Systems</source>
          .
          <fpage>580</fpage>
          -
          <lpage>581</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Rishabh</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>James</given-names>
            <surname>McInerney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hugues</given-names>
            <surname>Bouchard</surname>
          </string-name>
          , Mounia Lalmas, and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Diaz</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Towards a Fair Marketplace: Counterfactual Evaluation of the Trade-off between Relevance, Fairness and Satisfaction in Recommendation Systems</article-title>
          .
          <source>In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM</source>
          ,
          <fpage>2243</fpage>
          -
          <lpage>2251</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Mustafa</given-names>
            <surname>Mısır</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michèle</given-names>
            <surname>Sebag</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Alors: An algorithm recommender system</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>244</volume>
          (
          <year>2017</year>
          ),
          <fpage>291</fpage>
          -
          <lpage>314</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Phong</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , John Dines, and
          <string-name>
            <given-names>Jan</given-names>
            <surname>Krasnodebski</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A multi-objective learning to re-rank approach to optimize online marketplaces for multiple stakeholders</article-title>
          .
          <source>arXiv:1708.00651</source>
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Phong</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , Jun Wang,
          <string-name>
            <given-names>Melanie</given-names>
            <surname>Hilario</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alexandros</given-names>
            <surname>Kalousis</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Learning Heterogeneous Similarity Measures for Hybrid Recommendations in Meta-Mining</article-title>
          .
          <source>IEEE International Conference on Data Mining</source>
          (
          <year>2012</year>
          ),
          <fpage>1026</fpage>
          -
          <lpage>1031</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>T</given-names>
            <surname>Pereira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T</given-names>
            <surname>Cunha</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C</given-names>
            <surname>Soares</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>cf2vec: Representation Learning for Personalized Algorithm Selection in Recommender Systems</article-title>
          .
          <source>In 2020 International Conference on Data Mining Workshops. IEEE Computer Society</source>
          ,
          <fpage>181</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>John</given-names>
            <surname>Rice</surname>
          </string-name>
          .
          <year>1976</year>
          .
          <article-title>The Algorithm Selection Problem</article-title>
          .
          <source>Advances in Computers</source>
          <volume>15</volume>
          (
          <year>1976</year>
          ),
          <fpage>65</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Mario</surname>
            <given-names>Rodriguez</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Christian</given-names>
            <surname>Posse</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ethan</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Multiple objective optimization in recommender systems</article-title>
          .
          <source>In Proceedings of the 6th ACM Conference on Recommender Systems</source>
          .
          <fpage>11</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>André Luis Debiaso</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>André Carlos Ponce de Leon Ferreira</given-names>
            <surname>de Carvalho</surname>
          </string-name>
          , Carlos Soares, and Bruno Feres de Souza.
          <year>2014</year>
          .
          <article-title>MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data</article-title>
          .
          <source>Neurocomputing</source>
          <volume>127</volume>
          (
          <year>March 2014</year>
          ),
          <fpage>52</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Kate</given-names>
            <surname>Smith-Miles</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Cross-disciplinary perspectives on meta-learning for algorithm selection</article-title>
          .
          <source>Comput. Surveys</source>
          <volume>41</volume>
          ,
          <issue>1</issue>
          (Dec.
          <year>2008</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Carlos</given-names>
            <surname>Soares</surname>
          </string-name>
          , Pavel B Brazdil, and
          <string-name>
            <given-names>Petr</given-names>
            <surname>Kuba</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>A Meta-Learning Method to Select the Kernel Width in Support Vector Regression</article-title>
          .
          <source>Machine Learning</source>
          <volume>54</volume>
          ,
          <issue>3</issue>
          (
          <year>2004</year>
          ),
          <fpage>195</fpage>
          -
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Joaquin</given-names>
            <surname>Vanschoren</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Understanding machine learning performance with experiment databases</article-title>
          .
          <source>Ph.D. Dissertation</source>
          . Katholieke Universiteit Leuven.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>Joaquin</given-names>
            <surname>Vanschoren</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Meta-Learning: A Survey</article-title>
          .
          <source>CoRR</source>
          abs/1810.03548 (
          <year>2018</year>
          ). arXiv:1810.03548
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Christian</given-names>
            <surname>von Lücken</surname>
          </string-name>
          , Benjamín Barán, and
          <string-name>
            <given-names>Carlos</given-names>
            <surname>Brizuela</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A survey on multi-objective evolutionary algorithms for many-objective problems</article-title>
          .
          <source>Computational Optimization and Applications</source>
          <volume>58</volume>
          (
          <year>2014</year>
          ),
          <fpage>707</fpage>
          -
          <lpage>756</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>Yining</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Liwei</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yuanzhi</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Di</given-names>
            <surname>He</surname>
          </string-name>
          , and
          <string-name>
            <surname>Tie-Yan Liu</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>A Theoretical Analysis of NDCG Type Ranking Measures</article-title>
          .
          <source>In Proceedings of the 26th Annual Conference on Learning Theory (Proceedings of Machine Learning Research</source>
          , Vol.
          <volume>30</volume>
          ). PMLR, Princeton, NJ, USA,
          <fpage>25</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>David</given-names>
            <surname>Wolpert</surname>
          </string-name>
          and
          <string-name>
            <given-names>William</given-names>
            <surname>Macready</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>No free lunch theorems for optimization</article-title>
          .
          <source>IEEE Transactions on Evolutionary Computation</source>
          <volume>1</volume>
          ,
          <issue>1</issue>
          (April
          <year>1997</year>
          ),
          <fpage>67</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Yong</given-names>
            <surname>Zheng</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Multi-Stakeholder Recommendations: Case Studies, Methods and Challenges</article-title>
          .
          <source>In Proceedings of the 13th ACM Conference on Recommender Systems. ACM.</source>
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>Aimin</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bo-Yang</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hui</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shi-Zheng</given-names>
            <surname>Zhao</surname>
          </string-name>
          , Ponnuthurai Nagaratnam Suganthan, and Qingfu Zhang
          .
          <year>2011</year>
          .
          <article-title>Multiobjective evolutionary algorithms: A survey of the state of the art</article-title>
          .
          <source>Swarm and Evolutionary Computation</source>
          <volume>1</volume>
          (
          <year>2011</year>
          ),
          <fpage>32</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>