=Paper=
{{Paper
|id=Vol-3173/paper6
|storemode=property
|title=Improving Zonal Fairness While Maintaining Efficiency in Rideshare Matching
|pdfUrl=https://ceur-ws.org/Vol-3173/6.pdf
|volume=Vol-3173
|authors=Ashwin Kumar,Yevgeniy Vorobeychik,William Yeoh
|dblpUrl=https://dblp.org/rec/conf/ijcai/KumarV022
}}
==Improving Zonal Fairness While Maintaining Efficiency in Rideshare Matching==
Ashwin Kumar, Yevgeniy Vorobeychik, William Yeoh
Washington University in St. Louis, St. Louis, MO, USA 63130

Abstract: Order dispatching algorithms, which match passenger requests with vehicles (agents) in ridesharing systems, are able to achieve high service rates (percentage of requests served) by using deep reinforcement learning techniques to estimate the relative values of the different combinations of passenger-vehicle matches. While the goal of such algorithms is to maximize the service rate, this may lead to unintended fairness issues (e.g., high disparity between the service rates of geographic zones in a city). To remedy this limitation, researchers have recently proposed deep reinforcement learning based techniques that incorporate fairness components into the approximated value function. However, this approach requires retraining whenever one wishes to tune the degree of fairness or optimize for a different fairness function, which can be computationally expensive. To this end, we propose a simpler online approach that uses state-of-the-art deep reinforcement learning techniques and augments their value functions with fairness components during the matching optimization step. As no additional training is needed, this approach can be adapted to use any existing value function approximator and benefits from improved flexibility in evaluating different fairness objectives efficiently. In this paper, we describe several fairness functions that can be used by this approach and evaluate them against existing state-of-the-art deep-RL-based fairness techniques on standard ridesharing benchmarks. Our experiments show that our fairness functions outperform existing fairness techniques (i.e., they find matching solutions that result in higher service rates and lower service rate disparity across zones), demonstrating the practical promise of this approach.
Keywords: Ridesharing, Fairness, Matching, Transportation

ATT'22: Workshop Agents in Traffic and Transportation, July 25, 2022, Vienna, Austria

===Introduction===
On-demand ridesharing has been gaining traction over the past few years as a solution to the growing need for urban mobility. By providing low-cost rides to passengers at the expense of small detours, higher earnings for drivers, and a way to reduce the number of vehicles on the streets, it is a solution that benefits everyone. Consequently, there has been work on finding efficient matches of passengers and vehicles that minimize delays and maximize system efficiency for such ecosystems. Recent approaches have used concepts from dynamic programming and deep reinforcement learning to learn policies for matching pools of passenger requests to available drivers, achieving real-time performance. Some approaches, such as that by Alonso-Mora et al. [1], have also looked at this problem through the lens of managing fleets of high-capacity autonomous vehicles. These advances have led to significant improvements in the performance of these algorithms vis-à-vis the service rate (i.e., the fraction of passenger requests served out of all requests made) as well as the total delay passengers have to experience [2, 3, 4]. Optimizing for such metrics, however, can have unintended effects on the fairness of such systems. The issue of fairness in ridesharing has been discussed in various contexts, ranging from passenger-side fairness in terms of delay and partner choice, to driver-side fairness in terms of equitable earnings, and even sub-group-level fairness [5, 6] for drivers and passengers. Sühr et al.
[7] outline the three stakeholders in the ridesharing ecosystem: (i) the taxis/drivers, (ii) the customers, and (iii) the platform (Uber, Lyft, etc.). The platform takes on the job of mediating between customers and drivers, a problem that is often cast as an Integer Linear Program (ILP) maximizing overall driver preferences over incoming requests. This central role positions ridesharing platforms to enforce fairness by balancing it against efficiency. Most fairness research in ridesharing aims to quantify this tradeoff in one way or another as part of the objective function, intervening in the matching process through a central mechanism. In our approach, we view the problem in the context of a fleet of autonomous vehicles controlled by the platform. The optimization is formulated as a reinforcement learning problem, drawing from state-of-the-art techniques [2, 5] to combine the best performing algorithms with fairness approaches. Using zonal fairness as an example of sub-group fairness for passengers, we propose an online approach to incorporate sub-group fairness into ILP-based matching algorithms without needing to retrain the pre-learned function approximators. Further, we propose designating only a small fraction of the vehicles or requests as "fairness-aware" and postulate that this solution is better at trading off efficiency and fairness compared to applying the same modifications to all vehicles. We perform experiments to demonstrate the efficacy of the approach and perform a grid search over hyperparameters to qualitatively analyze the fairness-efficiency tradeoff. Our experimental results show that our methods outperform existing techniques (i.e., they find matching solutions that result in higher service rates and lower service rate disparity across zones), demonstrating the practical promise of this approach.
Specifically, our contributions are as follows:
∙ We develop an online framework that uses off-the-shelf value function approximators and user-defined fairness objectives to trade off efficiency for fairness using state-of-the-art matching algorithms.
∙ We introduce new metrics for zonal fairness that consider fairness across source-destination zone pairs.
∙ We provide real-time tuneable parameters for changing between different types and degrees of fairness, allowing systems to adapt on the fly, and compare the results to existing benchmarks.

===Related work===
Order dispatching in ridesharing: Rideshare matching has been extensively studied, and researchers have introduced methods that improve the quality of the matches made in terms of increasing the number of requests matched [8, 9], reducing the pickup and detour delays [1, 10], and increasing drivers' earnings [11]. The complexity of ridesharing algorithms increases with vehicle capacity and fleet size. As the runtime of real-time algorithms needs to be relatively small, most existing work has either considered assigning one request at a time (sequentially) to available drivers for high capacities [9, 12] or assigning all active requests together in a batch for small capacities [13, 14]. The sequential solution is faster to compute, but the solution quality is typically poor [15]. Alonso-Mora et al. [1] propose integer optimization approaches for assigning all active requests together for high-capacity ridesharing. Shah et al. [2] and Lowalekar et al. [8] further improve these approaches by including information about anticipated future requests while matching the current batch of requests to available drivers.

Fairness in ridesharing: Researchers have evaluated ridesharing fairness from many viewpoints. For passengers, there has been work on addressing lack of transparency [16], using game-theoretic approaches to fairness [17], and benefit sharing by ensuring non-increasing disutility [18].
Driver-side fairness has also been explored from the economic perspective, by using a max-min approach to balance efficiency and fairness [11], and by looking at fairness over longer periods of time by equalizing driver income proportional to the number of hours spent on the platform [7]. Fairness is not restricted to monetary benefits, however. Motivated by demographic and geographic fairness concerns, recent work formulates a matching problem with parameters to trade profit for fairness in terms of discriminatory cancellations, looking at factors like start/end locations, race, gender, or age of passengers [5] and drivers [6]. The work closest to ours is by Raman et al. [19], which looks at disparate treatment of passengers and income disparity amongst drivers. While they also look at geographic zones to quantify fairness for passengers, their approach requires training a neural network based value function to include the fairness term in the objective, making it costly to change the fairness parameters. Our approach presents an online way to address this problem without retraining existing value functions. Further, our approach offers better tradeoffs between efficiency and fairness than the existing approach, as we show in our empirical evaluation.

===Matching using Reinforcement Learning===
A Ridesharing Matching Problem is a tuple 𝑅𝑀𝑃 = ⟨ℛ, 𝒱, 𝒢, 𝒞⟩ consisting of the set of requests ℛ, vehicles 𝒱, road network 𝒢, and constraints 𝒞. The solution to this problem is an assignment 𝒜 that satisfies the constraints and provides a matching between vehicles and requests. Specifically, a request 𝑟𝑖 = ⟨𝑠𝑖, 𝑔𝑖, 𝑡𝑖⟩ contains the pickup location 𝑠𝑖, dropoff location 𝑔𝑖, and the arrival time of the request 𝑡𝑖. The vehicle state 𝑣𝑖 = ⟨𝑙𝑖, 𝑝𝑖, 𝑐𝑖, 𝑟𝑖⟩ includes its location 𝑙𝑖, current path 𝑝𝑖, capacity 𝑐𝑖, and the requests it is currently serving 𝑟𝑖.
The street network 𝒢 = ⟨ℒ, ℰ, 𝑐(𝑒)⟩ is a graph containing locations ℒ connected by roads ℰ, with a cost function 𝑐 : ℰ → R+ that defines the cost 𝑐(𝑒) of each edge 𝑒 ∈ ℰ in the graph. Intuitively, it corresponds to the time needed by a taxi to traverse that edge of the graph. The matching algorithm then needs to match requests to vehicles given some time and capacity constraints 𝒞, while optimizing some metric (e.g., maximizing the service rate). We now describe how NeurADP [2], a state-of-the-art deep reinforcement learning based method, solves this problem. (Our approach is based on NeurADP.) NeurADP learns a Value Function Approximator (VFA) for a vehicle's state, which approximates the expected future value of being in that particular state, using a deep neural network. For each vehicle, it generates a set of feasible trips [1] and scores them using the VFA. Then, it uses an Integer Linear Program (ILP) to maximize the cumulative score over all vehicles.

Figure 1: The RMP pipeline.

In typical reinforcement learning fashion, there is an agent and an environment. The environment simulates the motion of vehicles and the arrival of requests with the passage of time. At each decision epoch, the agent takes the current state of vehicles and pending requests as input and solves the RMP, matching requests to vehicles. The agent in our setting combines a VFA for each vehicle with the ILP that finds the optimal assignment based on the values. Using the VFA in this fashion to predict expected returns allows it to make non-myopic decisions that eventually improve the performance of the system. The VFA is learned via TD-learning [20] with experience replay, techniques popularly used in DQNs [21]. We observe transitions from the environment, which are stored in a buffer and later sampled in mini-batches to learn from. All vehicles share the same value function, allowing for efficient reuse of experiences.
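As a simplified illustration of this training loop (not the authors' implementation), the sketch below runs tabular TD(0) updates over mini-batches sampled from a shared replay buffer; the real system replaces the dictionary `V` with a shared deep network [2], and all names here are our own:

```python
import random

def td_update(V, batch, lr=0.1, gamma=0.9):
    """One TD(0) mini-batch update on a tabular value function V (a dict)."""
    for state, reward, next_state in batch:
        target = reward + gamma * V.get(next_state, 0.0)  # bootstrapped TD target
        V[state] = V.get(state, 0.0) + lr * (target - V.get(state, 0.0))

# Shared replay buffer: transitions (state, reward, next_state) observed
# from all vehicles, so experiences are reused across the fleet.
buffer = [("s0", 1.0, "s1"), ("s1", 0.0, "s0")]
V = {}
for _ in range(2000):  # sample mini-batches and learn
    td_update(V, random.sample(buffer, k=2))
# V approaches the discounted fixed point V(s0) = 1 + 0.9 * V(s1).
```

In this two-state toy loop, the fixed point is V(s0) = 1/(1 − 0.81) ≈ 5.26 and V(s1) = 0.9·V(s0) ≈ 4.74.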
The training process and network architecture used for the VFA are identical to those used by Shah et al. [2], and we refer the reader to that paper for more details. We define an action as the matching of a set of requests to a vehicle. If vehicle 𝑣𝑖 services request 𝑟𝑗, we denote the reward for the vehicle as 𝑅(𝑣𝑖, 𝑟𝑗). If the set of available (feasible) actions for vehicle 𝑣𝑖 is 𝐴𝑖, then, for each action 𝑎𝑖𝑘 ∈ 𝐴𝑖, the score is the immediate reward obtained for servicing those requests plus the expected future value of the new vehicle state after being assigned those requests:

score(𝑎𝑖𝑘) = 𝑉(𝑠′) + 𝑅(𝑎𝑖𝑘)    (1)

where 𝑉(·) is the VFA and 𝑠′ is the new state of vehicle 𝑣𝑖 after accepting all requests 𝑟𝑗 ∈ 𝑎𝑖𝑘. As a shorthand, overloading some notation, we can write the reward for an action as:

𝑅(𝑎𝑖𝑘) = Σ_{𝑟𝑗∈𝑎𝑖𝑘} 𝑅(𝑣𝑖, 𝑟𝑗)    (2)

Let 𝑜𝑖𝑘 be an indicator variable for whether action 𝑎𝑖𝑘 was selected for vehicle 𝑣𝑖. Then, we can write the objective function of the ILP as follows:

max Σ_{𝑖∈|𝒱|} Σ_{𝑎𝑖𝑘∈𝐴𝑖} 𝑜𝑖𝑘 × score(𝑎𝑖𝑘)    (3)

subject to the constraints:

Σ_{𝑎𝑖𝑘∈𝐴𝑖} 𝑜𝑖𝑘 = 1, ∀𝑖 ∈ |𝒱|    (4)
Σ_{𝑖∈|𝒱|} Σ_{𝑎𝑖𝑘∈𝐴𝑖 s.t. 𝑟𝑗∈𝑎𝑖𝑘} 𝑜𝑖𝑘 ≤ 1, ∀𝑟𝑗 ∈ ℛ    (5)
𝑜𝑖𝑘 ∈ {0, 1}, ∀𝑖, 𝑘    (6)

Intuitively, the constraints ensure that each vehicle is assigned exactly one action (Eq. 4) and no request is assigned to more than one vehicle (Eq. 5). In each vehicle's set of available actions, there is always the null action (i.e., accepting no new requests), so that there is always a solution. The final assignment 𝒜 is a concatenation of all vehicle assignments at a given decision epoch. The objective function in NeurADP [2] is to maximize the number of requests served. As the reward for each request is uniform and the VFA's predictions are a discounted estimate of the number of requests the vehicle will pick up in the future, the efficiency can be measured using the service rate, which is the fraction of requests assigned out of all requests that were made over a given duration.
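As a concrete (toy) instance of this optimization, the brute-force sketch below enumerates joint action choices and enforces the constraints of Eqs. (4)-(6) directly; the data and names are our own illustration, and a production system would hand Eqs. (3)-(6) to an ILP solver rather than enumerate:

```python
from itertools import product

def solve_matching(actions, scores):
    """Exhaustively solve Eqs. (3)-(6) for a tiny instance.

    actions[i] is the list of feasible actions for vehicle i, each a
    frozenset of request ids (the empty set is the null action, so
    Eq. (4) is always satisfiable). scores[i][k] is score(a_ik).
    Picking exactly one index per vehicle enforces Eq. (4); the
    duplicate check enforces Eq. (5).
    """
    best_val, best_choice = float("-inf"), None
    for choice in product(*[range(len(a)) for a in actions]):
        served = [r for i, k in enumerate(choice) for r in actions[i][k]]
        if len(served) != len(set(served)):  # Eq. (5): request served at most once
            continue
        val = sum(scores[i][k] for i, k in enumerate(choice))
        if val > best_val:
            best_val, best_choice = val, choice
    return best_val, best_choice

# Two vehicles, two requests; both prefer request 0, but the joint
# optimum assigns distinct requests.
actions = [[frozenset(), frozenset({0}), frozenset({1})],
           [frozenset(), frozenset({0})]]
scores = [[0.0, 5.0, 3.0],   # score(a_ik) = V(s') + R(a_ik), precomputed
          [0.0, 4.0]]
print(solve_matching(actions, scores))  # -> (7.0, (2, 1))
```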
===Zonal Fairness===
As discussed in earlier sections, there has been interest in preserving fairness across sub-groups of the passenger and driver populations. In our approach, we focus on the passengers, looking at service rate fairness across zone pairs of pickup and dropoff locations. The motivation for selecting this metric is two-fold: (i) it is an intuitive metric and is consistent with other fairness objectives that aim for equity in efficiency across sub-groups; and (ii) it is difficult to get information about race, ethnicity, and other cultural factors from publicly-available ridesharing datasets. Zonal fairness becomes a concern with algorithms like NeurADP [2], which aim to maximize the number of passengers served. In such cases, if most of the demand is concentrated within a certain area, the model will learn to prefer requests to/from that area. Areas with less demand are ignored in favor of the overall maximization objective, resulting in the algorithm sending a large proportion of the vehicle population towards regions of high demand. As we discuss in the following sections, small steps towards fairness can go a long way, because serving even a small number of requests in the less-serviced areas can reduce the inequity, possibly at only a marginal cost to the overall efficiency of the system. We define 𝑚 zones as disjoint subsets of the locations ℒ in the city graph. Our sub-groups of interest are people moving between pairs of zones. Recent work [19] aims to maximize the minimum service rate by including the variance in service rates across different zones in the optimization objective. Specifically, they look at the service rate by source zone, which we denote as 𝑧𝑖 (i.e., the service rate for all requests originating in zone 𝑖).
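As an illustration of how these statistics can be computed, the following sketch derives the source-zone service rates 𝑧𝑖 from a log of requests; the data layout is our own assumption, not the authors' code:

```python
def source_zone_rates(request_zones, served, m):
    """z_i: fraction of requests originating in zone i that were served.

    request_zones: source zone id per request; served: parallel booleans.
    Zones with no demand are reported as 0.0 here for simplicity.
    """
    total, done = [0] * m, [0] * m
    for z, s in zip(request_zones, served):
        total[z] += 1
        done[z] += bool(s)
    return [done[i] / total[i] if total[i] else 0.0 for i in range(m)]

# Three requests: two from zone 0 (one served), one from zone 1 (served).
print(source_zone_rates([0, 0, 1], [True, False, True], 2))  # -> [0.5, 1.0]
```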
However, this metric does not take the destination of the requests into account, and there may be a disparity between fairness based on source zones and fairness based on zone pairs, where some source-destination pairs may be severely under-served. To address this limitation, we define the zone-pair service rate 𝑧𝑖𝑗 as the service rate for requests originating in zone 𝑖 and ending in zone 𝑗. This gives us a grid Z𝑝 of zone-pair service rates of size 𝑚 × 𝑚, in addition to a vector Z𝑠 of source-zone service rates of size 𝑚. Given these two sub-group statistics, we can compute the fairness metrics of interest, namely the minimum service rate (by source zone or zone pair) and the Gini coefficient of the service rates (also by source zone or zone pair). We treat each zone/zone-pair as an individual, and fairness is defined by how high the minimum service rate is [11] and by how low the Gini coefficient is. We would like to note that these metrics and definitions of fairness are just examples, guided by recent literature, and we do not claim that these are the only metrics that matter. Fairness is nuanced and subjective, and we aim to show how one can improve fairness for some particular definitions.

===Optimizing Zonal Fairness===
The key idea of our approach is that the score function in Equation 1 can be modified to encourage particular request-vehicle matches by adding a bonus to particular vehicles servicing requests that would improve fairness. This bonus can be tuned to suit the need for fairness, and the original score function can be used otherwise. This gives us the flexibility of using the best value functions for general operation to maintain efficiency, while using the bonus to guide the system towards fairer decisions. It is important to note that we do not retrain the value function; we only use predictions from an existing one. Specifically, we define two fairness scores, for source-zone (𝑓𝑠) and zone-pair (𝑓𝑝) fairness:
𝑓𝑠(𝑟) = ||Z𝑠||₁/𝑚 − 𝑧𝑗    (7)
𝑓𝑝(𝑟) = ||Z𝑝||₁/𝑚² − 𝑧𝑗𝑘    (8)

Here, 𝑟 is the request of interest, and 𝑗 and 𝑘 are the source and destination zones of the request. Intuitively, each score captures how much worse the request's source zone (zone pair) fares compared to the average; a positive score means it is worse. For ease of representation, we also overload these functions for an action 𝑎:

𝑓𝑠(𝑎) = Σ_{𝑟∈𝑎} 𝑓𝑠(𝑟)    (9)
𝑓𝑝(𝑎) = Σ_{𝑟∈𝑎} 𝑓𝑝(𝑟)    (10)

We propose a few methods that use the fairness score in conjunction with the ILP in different ways to improve fairness. Each approach modifies the score function (Eq. 1) and can be used with either 𝑓𝑝 or 𝑓𝑠, but we find that 𝑓𝑝 is empirically stronger.

Figure 2: Trends for the change in service rate, zone-pair Gini, and minimum source-zone service rate with changing 𝛼 and 𝛽 for 𝛼-Veh. The top left (𝛼 = 0, 𝛽 = 0) indicates the value for unmodified NeurADP.

Fairness-aware Vehicles: In this method, we designate a random fraction 𝛼 of the vehicles as "fairness-aware," and these vehicles receive the fairness bonus. The identity of the fairness-aware vehicles does not change over time. Intuitively, this lets some dedicated vehicles improve fairness while most vehicles continue to optimize the efficiency objective. Let 𝒱𝛼 denote the set of fairness-aware vehicles. We then have the following two variants:

• 𝛼-fair vehicles (𝛼-Veh): The fairness bonus is added to the existing score function for 𝒱𝛼 with weight 𝛽. In other words, Equation 1 is replaced with:

score(𝑎𝑖𝑘) = 𝑉(𝑠′) + 𝑅(𝑎𝑖𝑘) + 𝛽𝑓·(𝑎𝑖𝑘) if 𝑣𝑖 ∈ 𝒱𝛼, and 𝑉(𝑠′) + 𝑅(𝑎𝑖𝑘) otherwise    (11)

• 𝛼-exclusive-fair vehicles (X𝛼-Veh): Vehicles in 𝒱𝛼 use only the reward and the fairness score, but not the estimated future value from the neural network. Thus, the fairness-aware vehicles are not concerned with maximizing the long-term value, just the fairness:

score(𝑎𝑖𝑘) = 𝑅(𝑎𝑖𝑘) + 𝛽𝑓·(𝑎𝑖𝑘) if 𝑣𝑖 ∈ 𝒱𝛼, and 𝑉(𝑠′) + 𝑅(𝑎𝑖𝑘) otherwise    (12)

Request-based Fairness: While having designated vehicles enforce fairness is one solution, it is possible that those vehicles are not in the right place at the right time and are thus unable to help reduce the inequity. To combat this, we can instead look at fairness on a per-request basis, adding the fairness bonus to the rewards of certain requests. Thus, vehicles will be incentivized to pick up requests that carry the fairness bonus, even if the value function gives them a lower score. For this approach, we modify the reward 𝑅(𝑎𝑖𝑘). Let ℛ𝑓 denote the set of requests with the fairness bonus. We then have the following new reward function to replace Equation 2:

𝑅(𝑎𝑖𝑘) = Σ_{𝑟𝑗∈𝑎𝑖𝑘} (𝑅(𝑣𝑖, 𝑟𝑗) + 𝛽𝑓·(𝑟𝑗)) if 𝑟𝑗 ∈ ℛ𝑓, and Σ_{𝑟𝑗∈𝑎𝑖𝑘} 𝑅(𝑣𝑖, 𝑟𝑗) otherwise    (13)

To select which requests get the fairness bonus, we propose two methods:

• 𝛼-subset of requests (𝛼-Req): Using a pair of hyperparameters (𝛼, 𝛽) similar to the two used for fairness-aware vehicles, this approach ranks all requests by decreasing 𝑓·(𝑟) and selects the top 𝛼 fraction of requests.

• Positive fairness score requests (+Req): In this approach, after ranking requests (as in 𝛼-Req), we select all requests with a positive 𝑓·(𝑟). This amounts to assigning a bonus to all requests going between zones whose service rate is worse than the average service rate.

We can vary 𝛼 and 𝛽 to change the degree of fairness we expect to see in the system. Each method may respond differently to the exact values, but in general, we expect larger values of 𝛼 and 𝛽 to improve the fairness objective while causing a decrease in the overall service rate.

===Experimental Setup===
To evaluate the efficacy of the different approaches, we evaluate the performance and fairness metrics after running the matching algorithm over a 24-hour period on the island of Manhattan using demand data from the NY Yellow Taxi dataset [22].
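To make the fairness machinery described above concrete, the sketch below computes the Gini coefficient of a set of service rates together with the scores of Eqs. (7) and (8) and the 𝛼-Veh score modification of Eq. (11); all function names and data layouts are our own illustrative choices, not the authors' code:

```python
def gini(values):
    """Gini coefficient of a list of service rates (0 = perfect equality)."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    return sum(abs(a - b) for a in values for b in values) / (2 * n * n * mean)

def f_s(r, Zs):
    """Eq. (7): average source-zone service rate minus the rate of r's source zone."""
    return sum(Zs) / len(Zs) - Zs[r[0]]

def f_p(r, Zp):
    """Eq. (8): positive when r's zone pair is served worse than average."""
    m = len(Zp)
    return sum(sum(row) for row in Zp) / (m * m) - Zp[r[0]][r[1]]

def score_alpha_veh(V_next, reward, action, is_fairness_aware, beta, Zp):
    """Eq. (11): fairness-aware vehicles get a bonus of beta * f_p(a)."""
    s = V_next + reward
    if is_fairness_aware:
        s += beta * sum(f_p(r, Zp) for r in action)  # Eq. (10) overload
    return s

# Requests are (source_zone, dest_zone) pairs; Zs is the source-zone
# service-rate vector, Zp the m x m grid of zone-pair service rates.
Zs = [0.2, 0.8]
Zp = [[0.0, 0.2], [0.6, 0.8]]
print(round(f_s((0, 1), Zs), 3))                                     # -> 0.3
print(round(score_alpha_veh(1.0, 1.0, [(0, 0)], True, 2.0, Zp), 3))  # -> 2.8
```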
The locations in the road network correspond to street intersections, with edges as roads connecting them. We define zones as all intersections falling within neighborhoods of Manhattan island.¹ We use pre-trained NeurADP models [2] as the base value function. We use the method proposed by Raman et al. [19] as a baseline for fairness, using their passenger-side fairness implementation, which we call FairNN. FairNN uses one hyperparameter (𝜆) to trade off profit and fairness. In the original paper, experiments were performed for 50 and 200 vehicles with a capacity of 4, for 𝜆 ∈ {10⁸, 10⁹, 10¹⁰}. We use the same parameter settings in our experiments for a fair comparison to the baseline. However, even with 200 vehicles, the demand from requests saturates the fleet's capacity. To simulate a more realistic scenario, we also run the experiments with 1000 vehicles, similar to the work by Shah et al. [2]. For this setting, we also run FairNN with a wider range of 𝜆 (10³–10¹¹) to see the trend more clearly. Note that each setting of 𝜆 requires retraining the value function for FairNN, which is costly. For the methods discussed in this paper (𝛼-Veh, X𝛼-Veh, 𝛼-Req, +Req), we perform a logarithmic grid search over the hyperparameters 𝛼 (0–1) and 𝛽 (0–50). We run this search using both 𝑓𝑝 and 𝑓𝑠 as the fairness scores, independently. For each setting, we find the overall service rate, the Gini coefficient of the service rate by source zone (Gini(Z𝑠)) and by zone pair (Gini(Z𝑝)), and the minimum service rate for any source zone (min(Z𝑠)) or zone pair (min(Z𝑝)).

===Experimental Results===
In this section, we go over the key results from our experiments. The full set of experiments had over 900 runs;² we thus present selected results that illustrate the overall trend.
¹ https://github.com/erikgregorywebb/nyc-housing/blob/master/Data/nyc-zip-codes.csv
² The complete results can be found at https://github.com/ashwinkwashu/ATT22ZonalFairness

Table 1: Comparison of different methods, selecting the hyperparameter value that maximizes min(Z𝑝). The values in bold are the best in their respective columns.

Matching Algorithm | Service Rate | Min(Z𝑠) | Gini(Z𝑠) | Min(Z𝑝) | Gini(Z𝑝)
NeurADP | 0.8119 | 0.4595 | 0.0928 | 0.0370 | 0.3102
FairNN (𝜆=10⁷) | 0.6855 | 0.6421 | 0.0690 | 0.5355 | 0.0872
𝛼-Veh (𝛼=0.5, 𝛽=20) | 0.6900 | 0.6580 | 0.0124 | 0.5833 | 0.0259
X𝛼-Veh (𝛼=0.5, 𝛽=0.5) | 0.7782 | 0.7122 | 0.0227 | 0.3290 | 0.1020
𝛼-Req (𝛼=0.2, 𝛽=50) | 0.7712 | 0.7323 | 0.0184 | 0.5555 | 0.0534
+Req (𝛽=15) | 0.7841 | 0.7313 | 0.0209 | 0.5416 | 0.0544
+Req (𝛽=2) | 0.8146 | 0.6536 | 0.0514 | 0.3414 | 0.1520

Figure 3: Pareto frontiers for all methods. We use solid lines for runs optimizing source-zone fairness and dashed lines for zone-pair fairness. For Gini, the optimal point is to the top-left (low Gini, high service rate). For Min, the optimal point is to the top-right (high min, high service rate).

We discuss the results from the analysis of the 1000-vehicle case here, though the trends hold for 200 and 50 vehicles as well.

Pickup-only vs. pickup/dropoff-pair fairness: Figure 3 shows the Pareto frontiers that represent the overall service rate and fairness tradeoff. Solid lines are associated with our approaches in which groups are defined with respect to pickup-only locations (𝑓𝑠), while dashed lines correspond to approaches that define groups based on pickup-dropoff location pairs (𝑓𝑝). Our first observation is that in nearly every instance the dashed lines (our approaches using zone pairs) Pareto dominate the solid lines (our approaches using source zones only), even when we use source zones (Z𝑠) for evaluation.
This observation demonstrates the usefulness of the additional information provided by zone-pair statistics in improving geographic fairness, and it is particularly noteworthy given that the prior FairNN approach defined zones using only pickup locations. Due to this observation, the remaining analysis focuses on the pickup-dropoff variants of our approaches.

Baseline comparisons: Figure 3 compares the proposed approaches (consider only the dashed lines henceforth) with two baselines: NeurADP and FairNN. For FairNN, we construct the Pareto frontier by considering many values of 𝜆 and omitting any that were Pareto dominated (in some cases, there was only a single Pareto dominant result for FairNN, in which case we plot just that point). First, observe that our approaches Pareto dominate FairNN by a considerable margin. Thus, in addition to the computational benefits (we do not have to retrain the value function), the proposed methods can achieve both a higher service rate and greater fairness than FairNN. We also observe that the proposed methods can yield considerable fairness improvements over NeurADP, with relatively minor degradation in service rate. This is further illustrated in Table 1. Consider the comparison among NeurADP, FairNN (𝜆 = 10⁷, the best parameter value we identified), and +Req (𝛽 = 15) in this table. +Req yields nearly the same overall service rate as NeurADP, but with a dramatic increase in the minimum service rate and a dramatic reduction in the Gini coefficient. It also outperforms FairNN on all of these metrics, in several cases by a large margin.

Fairness-efficiency tradeoff: Figure 2 shows how the various metrics change with changing hyperparameters for 𝛼-Veh, using the zone-pair fairness score 𝑓𝑝. 𝛽 controls the weight of the fairness score, and 𝛼 is the fraction of fairness-aware vehicles. We see that as 𝛼 and 𝛽 increase (stronger fairness), the overall service rate decreases and the fairness objective used in the fairness score (here, Gini(Z𝑝)) improves.
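The Pareto-frontier construction used for Figure 3 amounts to discarding dominated runs; a generic sketch of that filtering step (our own, with both coordinates oriented so that higher is better, e.g. pairing service rate with the negated Gini coefficient) is:

```python
def pareto_frontier(points):
    """Keep (service_rate, fairness) points not dominated by any other.

    A point is dominated if another point is at least as good in both
    coordinates and is a different point.
    """
    frontier = []
    for p in points:
        dominated = any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points)
        if not dominated:
            frontier.append(p)
    return frontier

# Hypothetical (service rate, -Gini) pairs from a hyperparameter sweep.
runs = [(0.81, -0.31), (0.78, -0.05), (0.69, -0.03), (0.70, -0.10)]
print(sorted(pareto_frontier(runs)))  # -> [(0.69, -0.03), (0.78, -0.05), (0.81, -0.31)]
```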
Other fairness metrics like min(Z𝑠) also improve, but the change is not monotonic, as can be seen in the third heatmap. This trend is seen across all the different methods and both fairness scores. This suggests that there is a direct tradeoff between efficiency and the selected fairness metric, and that it is possible to use the other metrics to select a suitable value for the hyperparameters. Further, this shows that there is merit in applying the fairness score to only a subset of the vehicle population, as the change in service rate along the 𝛽 axis is slower while still yielding fairness gains. This is also shown by the fact that min(Z𝑠) is maximized when half the vehicles are fairness-aware; specifically, the maximum does not lie at 𝛼 = 1. We also observe diminishing returns in fairness utility as 𝛼 increases: as shown in Figure 2, doubling the number of fair vehicles does not double the fairness gain.

Relative performance of our methods: We can observe from Figure 3 that request-based methods (𝛼-Req and +Req) tend to outperform vehicle-based methods (𝛼-Veh and X𝛼-Veh) across the board, with the only exception coming at very low Gini coefficients with respect to pickup-dropoff-based zones (leftmost plot). This is likely because vehicle-based methods require the fairness-aware vehicles to be in the right place in order to have an impact on fairness (e.g., in order for the requests coming from underserved zones to be feasible for these vehicles). The two request-based methods have comparable performance, but the additional parameter in 𝛼-Req helps it achieve lower Gini coefficients than +Req.

Driver fairness: While fairness in the distribution of driver profits is not our primary focus, it is an important additional consideration in ridesharing.
We thus consider how the proposed methods compare to NeurADP and FairNN in terms of the distribution of the number of requests served (equivalent to income in our setting, as we assume identical costs per trip for all trips) by the drivers for the scenarios in Table 1.

Figure 4: Histograms showing the distribution of trips per driver for each case in Table 1, with the y-axis showing the number of drivers in each bin. The red lines represent the average driver trips.

Figure 4 shows the distribution of the trips each driver served in one day. Interestingly, NeurADP exhibits a relatively low variance in the number of requests served, particularly compared to FairNN, which uses fairness of requests in the objective, perhaps suggesting a significant tradeoff between achieving these two types of equity. What is particularly noteworthy is that our vehicle-based methods yield considerable unfairness in requests served among drivers: not only is the variance high, but the distribution itself becomes bimodal. In contrast, our request-based methods exhibit variance comparable to NeurADP, suggesting that it may be possible to achieve a high overall service rate together with high levels of both rider and driver fairness.

===Conclusions===
As the demand for cutting-edge algorithms for urban mobility increases, their effect on the underlying fairness of these systems needs to be studied, and measures taken to ensure that our algorithms do not inherit the implicit biases that result from pure optimization. In this work, we discussed the issue of zonal fairness in ridesharing systems, introducing new fairness metrics for zone-pair fairness. Through four simple methods that build on existing solutions for order dispatching, we showed how simple techniques can be used to trade off efficiency for fairness.
Our methods outperform existing approaches to zonal fairness via an online modification to state-of-the-art techniques in ridesharing, with the added advantage of not needing any extra training and the ability to be used with any VFA. The tradeoffs proposed in this work can be dynamically adjusted to increase fairness at minimal cost to efficiency. Our experiments showed that, for the best results, it is better to apply fairness incentives to a subset of the driver or request population, as opposed to a blanket approach to fairness. Our work leaves some interesting avenues open for future work:
∙ The generalizability of our approach needs to be evaluated. While it can theoretically be applied with any VFA, experiments are needed to verify that. Similarly, this approach needs to be tested with sub-group fairness measures other than zonal fairness.
∙ Our approach is limited by the need to find the right hyperparameter setting for a desired fairness level, though this problem is shared across other solutions for fairness. A potential solution may involve learning to adapt the hyperparameters to the current state of the world. Such an approach would automatically tune the fairness based on changes in demand and supply.
∙ It is of interest to prove theoretical bounds on the expected gains such an approach can provide, given a particular method to improve fairness. We provide general intuition on why we expect our approaches to work and to explain the results we see. Future work may find mathematical guarantees that confirm our empirical results.

===References===
[1] J. Alonso-Mora, S. Samaranayake, A. Wallar, E. Frazzoli, D. Rus, On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment, Proceedings of the National Academy of Sciences 114 (2017) 462–467. doi:10.1073/pnas.1611675114.
[2] S. Shah, M. Lowalekar, P.
Varakantham, Neural approximate dynamic programming for on-demand ride-pooling, Proceedings of the AAAI Conference on Artificial Intelligence 34 (2020) 507–515. URL: http://dx.doi.org/10.1609/aaai.v34i01.5388. doi:10.1609/aaai. v34i01.5388. [3] M. Lowalekar, P. Varakantham, P. Jaillet, Zac: A zone path construction approach for effective real-time ridesharing, in: Proceedings of the International Conference on Automated Planning and Scheduling, volume 29, 2019, pp. 528–538. URL: https: //aaai.org/ojs/index.php/ICAPS/article/view/3519. [4] M. Li, Z. Qin, Y. Jiao, Y. Yang, J. Wang, C. Wang, G. Wu, J. Ye, Efficient ridesharing order dispatching with mean field multi-agent reinforcement learning, in: The World Wide Web Conference, WWW ’19, Association for Computing Machinery, New York, NY, USA, 2019, p. 983–994. [5] V. Nanda, P. Xu, K. A. Sankararaman, J. Dickerson, A. Srinivasan, Balancing the tradeoff between profit and fairness in rideshare platforms during high-demand hours, Proceedings of the AAAI Conference on Artificial Intelligence 34 (2020) 2210–2217. URL: http://dx.doi. org/10.1609/aaai.v34i02.5597. doi:10.1609/aaai.v34i02.5597. [6] Y. Xu, P. Xu, Trade the system efficiency for the income equality of drivers in rideshare, in: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence Organization, 2020. URL: http: //dx.doi.org/10.24963/ijcai.2020/580. doi:10.24963/ijcai.2020/580. [7] T. Sühr, A. J. Biega, M. Zehlike, K. P. Gummadi, A. Chakraborty, Two-sided fairness for repeated matchings in two-sided markets, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM, 2019. URL: http://dx.doi.org/10.1145/3292500.3330793. doi:10.1145/3292500.3330793. [8] M. Lowalekar, P. Varakantham, P. Jaillet, Zone path construction (zac) based approaches for effective real-time ridesharing, arXiv:2009.06051 (2020). 
URL: https://arxiv.org/abs/ 2009.06051. [9] S. Ma, Y. Zheng, O. Wolfson, Real-time city-scale taxi ridesharing, IEEE Transactions on Knowledge and Data Engineering 27 (2015) 1782–1795. URL: http://dx.doi.org/10.1109/ TKDE.2014.2334313. doi:10.1109/tkde.2014.2334313. [10] Y. Huang, F. Bastani, R. Jin, X. S. Wang, Large scale real-time ridesharing with service guarantee on road networks, Proceedings of the VLDB Endowment 7 (2014) 2017–2028. URL: http://dx.doi.org/10.14778/2733085.2733106. doi:10.14778/2733085.2733106. [11] N. S. Lesmana, X. Zhang, X. Bei, Balancing efficiency and fairness in on- demand ridesourcing, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché- Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems 32, Curran Associates, Inc., 2019, pp. 5309–5319. URL: http://papers.nips.cc/paper/ 8772-balancing-efficiency-and-fairness-in-on-demand-ridesourcing.pdf. [12] Y. Tong, Y. Zeng, Z. Zhou, L. Chen, J. Ye, K. Xu, A unified approach to route planning for shared mobility, Proceedings of the VLDB Endowment 11 (2018) 1633–1646. [13] X. Yu, S. Shen, An integrated decomposition and approximate dynamic programming approach for on-demand ride pooling, IEEE Transactions on Intelligent Transportation Systems (2019). [14] L. Zheng, L. Chen, J. Ye, Order dispatch in price-aware ridesharing, Proceedings of the VLDB Endowment 11 (2018) 853–865. [15] Uber, Uber matching solution, https://marketplace.uber.com/matching, 2018. [16] O. Wolfson, J. Lin, Fairness versus optimality in ridesharing, in: 2017 18th IEEE In- ternational Conference on Mobile Data Management (MDM), IEEE, 2017. URL: http: //dx.doi.org/10.1109/MDM.2017.25. doi:10.1109/mdm.2017.25. [17] L. Foti, J. Lin, O. Wolfson, Optimum versus nash-equilibrium in taxi ridesharing, GeoInformatica (2019). URL: http://dx.doi.org/10.1007/s10707-019-00379-6. doi:10.1007/ s10707-019-00379-6. [18] R. Gopalakrishnan, K. Mukherjee, T. 
Tulabandhula, The costs and benefits of ridesharing: Sequential individual rationality and sequential fairness, CoRR abs/1607.07306 (2016). URL: http://arxiv.org/abs/1607.07306. arXiv:1607.07306. [19] N. Raman, S. Shah, J. Dickerson, Data-driven methods for balancing fairness and efficiency in ride-pooling, in: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, 2021. [20] R. S. Sutton, Learning to predict by the methods of temporal differences, Machine learning 3 (1988) 9–44. [21] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, Playing atari with deep reinforcement learning, arXiv preprint arXiv:1312.5602 (2013). [22] NYC Taxi & Limousine Commission, Tlc triprecord data - tlc, 2020. URL: https://www1. nyc.gov/site/tlc/about/tlc-trip-record-data.page.