Published in CEUR Workshop Proceedings Vol-2129: https://ceur-ws.org/Vol-2129/paper11.pdf
     Incentivizing Rider Time-Shift in a Multi-Leg Public Transportation System

                                        Megan Shearer and Michael P. Wellman
                                               University of Michigan
                                           {shearerj,wellman}@umich.edu



                          Abstract
     We develop an incentive scheme for a hub-to-shuttle
     campus transit system that encourages riders to shift
     their travel times to improve system efficiency. Riders
     are modeled as agents with loss functions, which allows
     us to evaluate the scheme's effect on social welfare.
     Incentives are calculated from a predictive wait-time
     model trained via Gaussian process regression. Experiments
     show that the system's total loss decreases, but the
     average wait time increases, indicating room for
     improvement in the predictive wait-time model.
1   Introduction

In most public transportation systems, particularly bus systems, uneven ridership leads to an inefficient use of resources. Different geographic areas and times within a traffic system require different resources, yet in a bus system the same vehicles travel the same routes throughout the day. The goal of the RITMO project [Maheo et al., 2015, To appear] is to make public bus systems more efficient. This new system uses shuttles for segments of bus routes with minimal riders, allowing buses to focus on high-volume routes.

Figure 1: This heatmap of ridership on the University of Michigan campus displays the uneven distribution of riders in the system. The green areas represent spots where more riders want to begin a bus trip, while the blue area contains fewer riders.

The RITMO project aims to improve the bus system at the University of Michigan by using this hub-to-shuttle system to address uneven ridership. Figure 1 shows the uneven demand across the campus bus system map. The idea of the hub-to-shuttle system is to connect the populous green areas with buses, and to use shuttles to drive riders in the sparse blue areas. It would be inefficient to have a single shuttle service all blue areas on the map, so the serviceable area is broken into zones, shown in Figure 2. A shuttle does not leave its assigned zone; instead, it picks up and delivers riders at hubs for those who wish to travel across multiple zones.

Figure 2: The University of Michigan bus system is broken up into zones, each containing a hub for buses to run between. Shuttles are assigned zones to travel less populous routes.

Along with some physical routes receiving significantly more attention, some times of day carry greater traffic volume, which also causes inefficiency. While uneven ridership throughout the day is inevitable in a transit system, this paper addresses the problems created by deploying the same amount of resources throughout the day, and develops a solution that spreads traffic out over time. Wait times are assumed to be important to riders, and are thus a useful tool to convince them to move and thereby spread traffic. In this paper we define a wait time as the difference between when a rider requests to be picked up and when she actually boards her first vehicle. However, in a stochastic transit system like RITMO, wait times cannot be computed directly at the time the system
would offer an incentive. In this paper, an incentive is defined as an arbitrary point system that can easily be converted to a real quantity; for example, on the Michigan campus, meal points can be allocated to students who alter their requested times. We specifically use incentives, rather than tolls, because RITMO serves a fare-free campus bus system. Our goal is to spread riders out over time to reduce the cost of the system and lower wait times, increasing efficiency and social welfare. We measure social welfare through its negative: the aggregate loss of riders in the system plus the cost of providing incentives to riders.

The main contribution of this paper is an agent-based incentive scheme that utilizes a learned predictive wait-time model. The incentive scheme aims to induce riders entering the system to change their departure times by small intervals. This reduces the number of riders requesting trips at any one time without reducing rider satisfaction, because riders still reach their destinations on time. The incentive scheme models the riders as agents, then uses loss functions for the system and agents, together with the expected wait time, to calculate an incentive for each agent to change her trip starting time, or board time. The incentive model then decides whether an agent should be offered an incentive based on whether the system deems it socially optimal for the agent to move her travel time. The incentive model is judged by its effects on the system's cost and social welfare.

The pairing of a learned predictive wait-time model with an agent-based incentive model has not previously been used in transportation pricing. When we first created the incentive model without a good wait-time predictor, agents changed their board times to time intervals that potentially increased their wait times, which decreased rider satisfaction and increased costs to the system. As the predicted wait time became more accurate, the satisfaction and cost of the system improved. The design of the predictive wait-time model therefore became integral to the success of the incentive model. Developing a predictive wait-time model is not straightforward in the RITMO problem space because the shuttles' paths are volatile. Our predictive wait-time model uses Gaussian process regression [Pedregosa et al., 2011, Rasmussen, 2004] to find expected wait times for agents before their trips are scheduled, based on current traffic information. We evaluate the wait-time model on the accuracy of its predictions and on whether it helps reduce the average wait times in the system.

2   Related Work

This work is part of the RITMO project, a hub-and-shuttle public transportation project initiated by Pascal Van Hentenryck [Maheo et al., 2015, To appear]. The hub-and-shuttle system is the basis of this paper, but previous work from Van Hentenryck focuses on vehicle routing optimization, while our work focuses on improving social welfare at high ridership times by incentivizing riders to change their travel plans.

Literature on dynamic transportation pricing mostly involves finding optimal prices for lanes or roads to reach an equilibrium for drivers [Sharon et al., 2017, Yin and Lou, 2009, Zhang et al., 2014]. These approaches attempt to change routes rather than departure times. Studies more similar to ours examine tolls for individuals and road traffic that achieve a socially optimal solution [de Palma et al., 2005, Joksimovic et al., 2014, Li, 2018, Bui et al., 2012].

Literature on pricing ride-sharing systems strongly considers surge pricing, i.e., pricing during peak travel times [Banerjee et al., 2015]. That work studies a two-sided system in which drivers are also part of the pricing scheme, in contrast to the RITMO system, where drivers are not compensated for individual trips. There is a study on pricing multi-leg ride sharing [Teubner and Flath, 2015], but its pricing scheme relies solely on trip distance and does not study traffic or incentivize riders to change requests.

Studies using machine learning to predict wait times for public transportation riders mainly focus on the arrival of buses on predetermined routes [Chien et al., 2002, Yu et al., 2011, 2007]. Idé and Kato [2009] use Gaussian process regression to predict the travel time of vehicles in a traffic network.

3   Incentive Model

To reduce the number of riders entering the system at high-traffic times, the incentive model aims to induce some riders to change their requested board times by small time intervals. A rider enters the RITMO system by providing a request to the scheduler through an app interface. Without an incentive model, the scheduler immediately processes a rider's request by finding a trip for the rider, as illustrated in Figure 3. Scheduling the requested trip as-is is not necessarily beneficial to other riders in the system, so we introduce an incentive model that attempts to increase the social welfare of the system by lowering the aggregate loss of the riders. Figure 4 illustrates how a scheduler uses the incentive model to handle a rider's request. The request contains the rider's desired trip information, which informs the scheduler how much the rider values different aspects of the trip, for instance any aversion to arriving early at her destination. After receiving a request, the scheduler decides whether or not to offer an incentive to convince the rider to change her board time. This decision is based on expected wait times for the rider and similar riders. If the scheduler decides not to offer an incentive, the original request is processed and the trip is scheduled. If the scheduler offers an incentive, the rider decides whether to take the incentive and change her board time, or reject the incentive and keep her original board time. Once the rider makes her decision, the scheduler respects it and continues processing the request.

Figure 3: Without an incentive model, riders enter the system and send their desired trips to the scheduler. The scheduler immediately processes requests and schedules the routes and vehicles.

Given the information in a request, we model the riders as agents with loss functions. An agent's loss increases as
wait time increases, and decreases with the addition of an incentive, though every time an agent takes an incentive, the system bears the cost of that incentive. The entire system is better off when the agents' losses in aggregate are lower. Therefore, the total loss of the system is calculated as the sum of agent losses plus the incentive costs. This total loss can be thought of as the negative of social welfare, because the agents on average are better off with a lower system loss. An individual agent's assumed goal is to minimize her own loss. The goal of the scheduler is to maximize social welfare by minimizing the total loss of the system.

Figure 4: Riders enter the system and send their desired trips to the scheduler. The scheduler considers a request and uses the predictive wait time model to decide whether offering an incentive is socially optimal. If it will not benefit the system for a rider's board time to change, the scheduler schedules her ride. Otherwise, it offers her an incentive, and she uses the predictive wait time model to decide whether her personal loss is lowered. She then returns her decision, and her trip is scheduled.

3.1   Communication Between Scheduler and Agent

An agent wishes to minimize her own loss and to reach her destination. The scheduler wants to minimize the net loss of the system and get all agents to their desired destinations. The scheduler breaks a day into t-minute time intervals as a way to measure congestion at different points. Both the agents and the scheduler are aware of traffic in the market.

An agent, xi, initiates an interaction by sending a request with public and private parameters. We assume the scheduler has full information from agent xi's request; that is, the request the scheduler receives reflects all private information. Given the request, the scheduler decides whether the agent should be offered an incentive ψi to move her board time by t. We calculate the incentive with agent xi's private loss function, using values in xi's request. Once the scheduler finds ψi, it considers congestion and a threshold for incentives. Congestion is determined by the effects of an agent on the average wait time at different time intervals.

If the scheduler determines that ψi should be given to xi, it returns a proposal consisting of ψi and a new board time. If the scheduler does not think an incentive should be offered to xi, it continues with scheduling xi's ride. If the agent is offered an incentive, she compares her potential loss under the proposed board time with that under her original requested board time, and chooses the board time that minimizes her loss. Since the scheduler knows the private parameters of the agent, it can choose an incentive that it knows will be sufficient to convince the agent to agree to a time-shift. The scheduler updates agent xi's request and continues with scheduling her ride.

3.2   Details of Interaction

Request

Agents' requests consist of a set of public and private parameters that the scheduler uses to assign routes and incentives. The public parameters within an agent's request are a trip ID (i.e., an identifier for agent xi ∈ X, where X is the set of all agents), the requested board time lower bound ri, the pickup stop, and the delivery stop. The private parameters are her willingness to leave early αi, willingness to leave late βi, and malleability to incentives γi, which determines how much xi appraises an incentive relative to how the system values it. Her wait time is the difference between her requested board time lower bound and her actual pickup time.

Agents' Loss Function

Agents have a loss function to evaluate a ride. The elements of this loss function are the travel time and the effects of changing the request time, if the agent chooses to change her board time lower bound. The travel time is broken down into the wait time and the time the agent spends on the actual trip. We assume agents want lower wait times; the wait time is therefore useful in calculating an incentive. We assume that loss increases if the agent does not leave at her intended time, but that an incentive counteracts this by decreasing loss.

Agent xi's loss function is formally defined as:

    La(xi) = τ(pi) + αi(ri − pi)Ie + βi(pi − ri)Iℓ − γiψi.    (1)

The parameters are revealed to the scheduler in xi's request. pi is a board time lower bound proposed to agent xi by the scheduler, where pi ∈ {ri − t, ri, ri + t}. τ(pi) is the trip time at pi, calculated as the time from pi to when the agent completes her trip. Ie and Iℓ are indicator functions, where Ie = 1 when ri > pi and Ie = 0 otherwise (i.e., Ie = 1 when the proposed time is earlier than the requested time), and Iℓ = 1 when ri < pi and Iℓ = 0 otherwise.

The scheduler and agents cannot know exact travel times until all riders are assigned trips, so the estimated agent loss is:

    L̂a(xi) = E[τ(pi)] + αi(ri − pi)Ie + βi(pi − ri)Iℓ − γiψi.    (2)

The expected trip time is E[τ(pi)] = E[w(pi)] + E[vi], where E[w(pi)] is the predictive wait time model's estimate at pi and E[vi] is an optimization of the minimum trip distance. The calculation of the estimated wait time is discussed in Section 4.

System's Loss Function

The system also incurs a loss from each agent and the incentive it pays her. Therefore, the system's loss is defined as the aggregate agent loss plus the cost of the incentives. Since the system appraises the incentive at its true value, the system's loss from a rider is defined as:

    Ls(xi) = La(xi) + ψi.    (3)

The scheduler uses an estimate of (3) to assess the effect an agent will have on the system, because the agent's trip is not yet scheduled when it uses the loss to calculate an incentive. Thus, it must use the predictive wait time model to calculate the expected travel time. The estimate the scheduler uses is:

    L̂s(xi) = L̂a(xi) + ψi.    (4)

After the termination of the simulation, we evaluate the system by summing the system loss (3) over agents:

    L = Σ_{xi ∈ X} Ls(xi).    (5)
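As a concrete illustration, the loss functions above can be written out in a few lines of Python. This is a minimal sketch under our own naming (`Agent`, `agent_loss`, and `system_loss` are illustrative, not part of the RITMO codebase), with trip time and incentive passed in as plain numbers:

```python
# Sketch of the agent and system loss functions, Eqs. (1) and (3).
# Names and numbers are illustrative assumptions, not the RITMO implementation.
from dataclasses import dataclass


@dataclass
class Agent:
    r: float      # requested board time lower bound r_i
    alpha: float  # willingness to leave early, alpha_i
    beta: float   # willingness to leave late, beta_i
    gamma: float  # malleability to incentives, gamma_i


def agent_loss(agent: Agent, p: float, trip_time: float, psi: float) -> float:
    """Eq. (1): loss for proposed board time p with incentive psi.

    The early/late terms mirror the indicators I_e (p < r) and I_l (p > r).
    """
    early = agent.alpha * (agent.r - p) if p < agent.r else 0.0
    late = agent.beta * (p - agent.r) if p > agent.r else 0.0
    return trip_time + early + late - agent.gamma * psi


def system_loss(agent: Agent, p: float, trip_time: float, psi: float) -> float:
    """Eq. (3): the agent's loss plus the incentive the system pays out."""
    return agent_loss(agent, p, trip_time, psi) + psi
```

For example, an agent with ri = 10 and αi = 0.05 who is moved to pi = 8 with no incentive incurs an extra 0.05 · 2 = 0.1 of loss, which the scheduler can offset by choosing ψi as in (7).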

Incentive

Assuming the scheduler knows each agent's private parameters, it can find L̂a(xi) for all i and use this to minimize the system's loss. We find the minimum incentive for xi by solving:

    E[τ(ri)] − L̂a(xi) = 0.    (6)

E[τ(ri)] is the agent's estimated loss at her original request time ri (at ri, the early, late, and incentive terms vanish). Solving (6) for ψi gives:

    ψi = (1/γi) [ E[τ(pi)] + αi(ri − pi)Ie + βi(pi − ri)Iℓ − E[τ(ri)] ].    (7)

If the scheduler offers the ψi found in (7), agent xi is guaranteed to do at least as well as with her original request time. However, it is most likely suboptimal to move all agents, so the scheduler decides which agents to incentivize based on the impact on other agents with similar requests. If the scheduler determines that moving xi has a positive effect on other agents, then ψi is offered to xi in exchange for xi changing her request time to pi.

Scheduler Decision

Once the scheduler calculates an agent's required incentive, it must decide whether to offer that incentive to the agent. The scheduler's goal is to maximize social welfare, so it offers xi an incentive if moving xi positively affects the other agents in aggregate. We determine this using the wait-time estimator to find the wait time with and without the agent at the requested time and the time intervals around it. We assume that an agent affects only those requesting trips at the same stop and time interval. Therefore, we calculate the average estimated wait time of the original time interval and the intervals around it. To find the socially optimal option, we find the average estimated wait when xi is placed in each time interval. The interval that yields the lowest average wait time is the one to which the scheduler incentivizes agent xi to move her request time. This decision is illustrated in Figure 5. We assume a lower average wait time leads to a socially better outcome, because more agents with requests similar to xi's will be better off, since a lower average wait time yields a lower average loss.

Figure 5: The scheduler decides to offer an agent an incentive if it will positively affect social welfare. To make this decision, it finds the average wait time of the desired time interval and those preceding and following it; this average wait time is defined as E[W]. Here it offers an incentive to change the board time to ri − t, because min E[W] = 1.1.

Agent Decision

An agent's goal is to minimize her own personal loss. Therefore, she takes an incentive if her overall loss is lower when she moves her request time to the proposed time. How she makes this decision is illustrated in Figure 6. An agent does not particularly care whether she enters at rush hour; she cares about having a low wait time and reaching her destination on time. Choosing the board time lower bound with the lowest loss assures that she achieves these goals.

Figure 6: An agent determines whether or not to take an incentive by assessing her personal loss. She uses the predictive wait time model to calculate her expected loss, L̂. Here she chooses to take the incentive because min L̂ = 9.7.

3.3   Agent Manipulation

We also explore the impact on the incentive model when agents try to manipulate the system. Agent manipulation means an agent attempting to lower her own loss by giving the scheduler an untruthful board time lower bound. Before an agent submits her request, she is assumed to have full knowledge of the system and can determine whether she would be incentivized to change an untruthful request time to her true desired time. This decision happens before a request is sent; how it fits into the incentive model is shown in Figure 7, and the details of her decision are illustrated in Figure 8. A manipulative agent is untruthful only about her requested board time; the other parameters in her request are truthful. The scheduler is unaware that an agent is untruthful, and assumes that the distribution of traffic throughout the system remains the same as if all agents were truthful.

Figure 7: This depicts the same incentive model as Figure 4, except that agents now try to manipulate the system. Agents use the predictive wait time model to determine whether they should submit an untruthful requested board time to the scheduler. The incentive scheme proceeds in the same manner as Figure 4.

Figure 8: To decide whether she should lie about her desired requested board time, an agent wants to know whether she will be incentivized to change her board time to her desired time. She uses the predictive wait time model to determine her effects on social welfare, calculated the same way as in Figure 5, with average wait time E[W]. If she finds that the system would be worse off if she had a different board time, she changes her request. Here she changes her request to ri + t because max E[W] = 1.4.

3.4   Evaluation

The incentive model is evaluated on its impact on social welfare. Our goal is to minimize the system loss (5), which is the sum of the agents' losses and the incentives provided to them. Therefore, in minimizing this quantity, we also minimize the agents' loss while not providing unaffordable incentives. Minimizing the agents' loss should result in maximizing the social welfare. The baseline for total system cost is the value of (5) when no incentive scheme is used with the model.

4   Predictive Wait Time Model

The goal of the incentive scheme is to convince an agent to alter her request before the actual trip takes place. To determine the incentive to offer the agent, the scheduler uses (7). Thus, at the creation time of the request, the scheduler needs to know the wait time at the requested board time.

Finding the wait time at a future time step is nontrivial because the state of the system can change. In the interim, other agents may enter and alter a shuttle's path, potentially affecting the predicted route of an agent at her creation time. Shuttles may also leave and enter the system in a stochastic manner.

4.1   Gaussian Process Regression

We estimate wait times using a model trained with Gaussian process regression (GPR). GPR generates a nonlinear and nonparametric model to predict output values. It works by measuring the distance between points using a kernel function and constructing predictions based on a distance-based weighting of those points. The training is governed by a set of hyperparameters [Rasmussen, 2004].

We chose GPR for its superior performance given the features available in the stochastic RITMO system. GPR training requires time cubic in the number of samples, which limited the size of the training sets we could employ in experiments. We use a radial basis kernel function, k(x, x′) = c exp(−(1/2ℓ)(x − x′)²), with white noise. We assume that small changes in features cause only small changes in wait time, which makes the smoothing quality of the radial basis kernel ideal. All of the features are standardized during training, and the data is split between testing and training sets.

4.2   Feature Selection

All of the selected features are independent of an agent's private variables and pertain to the current state of the riders, vehicles, and stop network: the number of agents expected to request board times in the same zone and time interval; the distance of the predicted closest vehicle; whether the requested board stop is a hub; whether the agent's pickup and drop-off stops are in different zones; and whether the agent is at a hub and changing zones (i.e., waiting for a bus rather than a shuttle).

4.3   Training Loop

The training loop begins by running the incentive model once to generate features and wait times used to train the predictive wait time model. The training loop and final model selection are illustrated in Figure 9. Once data is collected from this initial simulation run, it is split into testing and training sets. The training data is used to train the Gaussian process regression model. The trained model's parameters are plugged into the simulation and used to calculate the expected wait time on future runs. Next, the simulation runs again with a smaller proportion of agents offered incentives. This proportion is determined by offering incentives only to agents that the scheduler is sure will benefit the system; the scheduler determines which agents will benefit other agents by altering their requests, as in Section 3.2. The same data extraction and training process is then repeated until all desired proportions have been tested. After this training loop completes, each wait time model is used in the incentive model, either when the corresponding proportion of agents is offered incentives or when no agents are offered incentives. The chosen final model is the combination of the predictive wait time and incentive models that yields the lowest total system loss, L, found by (5).

4.4   Evaluation

The predictive wait time model is evaluated on the accuracy of its expected wait times and on its effects on the actual wait time. The model is therefore evaluated by its success in lowering aggregate wait times. The baseline for this comparison is the system with no incentive scheme.
Figure 9: The model begins by generating data from running the incentive scheme once. The wait time model then trains on this data, and the trained model is saved. Next, the model is trained with a smaller proportion of agents offered incentives, and these wait time models are used in the incentive model. The final model is chosen by the minimum system loss, L.

5   Results

5.1   Data

We use a dataset provided by the RITMO project in our experiments. This dataset represents bus traffic on the University of Michigan campus. University bus drivers manually recorded the number of riders who boarded a bus at each stop over three different months, and these counts were turned into a representative distribution of agent requests. There are also sets of shuttles, buses, and stops to create the University of Michigan RITMO system.

To model loss, we randomly added personal preference data to each agent. These preferences are not meant to be a realistic model, but rather reasonable values for evaluation. Using uniform distributions, we generated preferences αi, βi ∼ U[0, 0.1] and γi ∼ U[0.5, 1.5], which specify, respectively, willingness to travel early and late, and how the rider values an incentive in proportion to the system. We also perturbed the baseline request set to test the incentive and predictive wait time models on different traf-

10 different random seeds to produce 95% confidence intervals on the following graphs. Lastly, the baseline is defined as the total loss or wait time of the system when no incentives are offered.

5.3   Incentive Model Results

The incentive model is evaluated by the total system loss for non-manipulative and manipulative agents. The goal is to lower these from their initial baseline values. The results of the total system loss with the predictive wait time model on the RITMO and perturbed data are presented in Figures 10 and 11, respectively.

Figure 10: The total loss of the system increases if we offer more agents incentives. The bottom figure is a closer look at the total loss when a smaller proportion is offered incentives. The total loss lowers, but not by a significant amount. The loss slightly rises when agents are manipulative.
                                                                          Figure 10 shows for non-manipulative and manipulative
fic spreads on the Michigan campus. Specifically, we added
                                                                       agents the total system loss initially decreases by an insignif-
noise ∼ U [−20, 20] minutes to each rider’s request-creation
                                                                       icant amount from the baseline value, depicted by the black
time and requested board time.
                                                                       line, then dramatically increases. The difference in propor-
5.2    Environment Settings                                            tion of agents offered incentives is determined by a threshold
                                                                       placed on the certainty of the scheduler that moving an agent
We consider the RITMO and perturbed datasets, which con-               will benefit the system. Therefore, when more agents are of-
tain 37,724 rider requests in the University of Michigan bus           fered incentives, there is a lower threshold on the certainty
system. This dataset takes place over a 22 hour period. The            of the scheduler, and the maximum proportion in Figure 10
system is split into three zones, with between 17 and 31 shut-         occurs when any agent may potentially benefit the system by
tles in each zone. There are four bus lines with a total of 17         moving. The result that the loss increases as more agents
buses.                                                                 are offered incentives demonstrates an inaccuracy in the sys-
   For both the incentive and predictive wait time models, we          tem, because ideally the scheduler should not be offering that
vary the proportion of agents offered incentives to determine          many agents, if any at all, incentives that would cause the to-
if this affects the total loss and average wait time. We do            tal loss to increase. This is mainly an issue because the goal
this by only offering agents whom the scheduler is more con-           of the system is to lower the loss of the agents in aggregate, a
vinced will benefit the system by moving. This is determined           goal which is clearly not being achieved. The loss slightly in-
by setting a limit on the change in predicted average wait             creases when agents are manipulative, but not by a significant
time, and the scheduler will only offer incentives to agents           amount compared to non-manipulative agents.
that it is more convinced will benefit the other agents around            Figure 11 shows the total loss of the perturbed data set.
them. We then evaluate the system by the total loss and av-            The perturbed data yields a similar result to RITMO data, but
erage wait time at the different proportion of agents offered          when a lower proportion of agents are offered incentives the
incentives.                                                            total loss lowers by a significant amount. One more thing
   In the incentive model we test all non-manipulative agents,         to note is that the confidence intervals are tighter on the per-
and all manipulative agents. We train the predictive wait time         turbed data compared to the RITMO data, and there is less
model on a set of 1,000 requests. Each experiment is run with          variation between the cases with and without manipulation.
This is most likely caused by the method of perturbation
making the traffic spread more stable and have slightly lower
peaks at high traffic times.
Figure 11: The total loss lowers by a significant amount when incentives are offered to a smaller proportion of agents, and the loss is higher than the baseline when more agents are offered incentives. The loss remains the same when agents are manipulative.

5.4   Predictive Wait Time Model Results
The predictive wait time model is evaluated by the average wait time at the end of the simulation. The results of the average wait time for the RITMO and perturbed data are shown in Figures 12 and 13, respectively.

Figure 12: On the RITMO dataset, the average wait time shows the same patterns as the total loss in Figure 10. The average wait time is lower when a smaller proportion of agents are offered incentives, and the average wait time increases by a small amount when agents are manipulative.

   Figure 12 shows the average wait for the RITMO data and training sets of non-manipulative and manipulative agents. It shows exactly the same pattern as Figure 10, demonstrating that the loss and average wait times are directly related. This allows us to conclude that the inaccuracies in the scheduler's decision that an agent will benefit the system stem from problems with the predictive wait time model. If the predictive wait time model accurately calculated expected wait times for agents, then the average wait time would drop when agents took incentives, and as a result the total loss would drop as well. It also shows that the incentive model behaves as expected, because there is a direct relationship between the average wait time and the total loss. We therefore conclude that to improve the entire model, we need to improve the predictive wait time model.

Figure 13: The average wait time is higher for all tested proportions of agents offered incentives, even though the total loss lowers in Figure 11. The average wait time remains the same when agents are manipulative.

   The average wait time for the perturbed data is shown in Figure 13. As with the RITMO data, the average wait time shows a pattern similar to the total loss. While the increase in wait time as the proportion of agents taking an incentive grows is similar, the average wait time is always lower when agents take incentives. This shows that the predictive wait time model works better on the perturbed data, most likely as a result of the stabilization of the perturbed data discussed in the previous section. There is therefore potential in the incentive scheme and predictive wait time model, but we should continue working to improve the predictive wait time model for all datasets.

6     Conclusion
We developed an incentive scheme for the RITMO transit system with a predictive wait-time model. The incentive model aims to minimize system loss and improve social welfare, which is defined as the aggregate loss of agents in the system. In experiments with non-manipulative agents, the total system loss is successfully lowered compared to the loss in an unincentivized model. However, the average wait time is consistently higher for all incentive schemes. Therefore, the predictive wait time model can be improved.
   The main contributions of this work are the creation of the agent-based incentive model and the learned predictive wait model. Using learned wait times to calculate incentives for agents in a stochastic system is an innovative idea because riders in a transportation system value their time, and thus want lower wait times. This idea could be expanded into a useful tool for dynamic pricing in any stochastic road network.
   One potential way to improve the predictive wait time model is to look at traffic at an individual stop, rather than over an entire zone. We could also switch from Gaussian process regression to a deep neural network, which could better estimate the unknown, underlying structure of the RITMO system. Given the relationship observed in experiments between mean squared error and average wait times, improving the predictive wait time model will greatly improve the success of the incentive model. Other extensions are to not assume the scheduler knows private information, and to assume an estimated distribution of riders rather than the exact number of riders. These extensions would make the model more deployable.
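As a concrete point of reference for the improvements discussed above, a minimal wait-time regressor of the kind Section 4.1 describes can be fit with scikit-learn's GaussianProcessRegressor. The features here (requested board time, zone load) and the synthetic data are illustrative assumptions, not the paper's actual inputs.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical training data: requested board time (minutes into the
# 22-hour service day) and current zone load, with observed wait times.
X = rng.uniform([0.0, 0.0], [1320.0, 50.0], size=(200, 2))
y = 5.0 + 0.1 * X[:, 1] + rng.normal(0.0, 0.5, size=200)  # synthetic waits

# An anisotropic RBF kernel lets wait times vary smoothly over time and
# load at different scales; the white-noise term absorbs observation noise.
gpr = GaussianProcessRegressor(
    kernel=RBF(length_scale=[60.0, 5.0]) + WhiteKernel(),
    normalize_y=True,
)
gpr.fit(X, y)

# Prediction also yields a standard deviation, which a scheduler could
# use as its certainty when deciding whether to offer an incentive.
mean, std = gpr.predict(np.array([[480.0, 20.0]]), return_std=True)
```

Swapping the regressor for a neural network (e.g. an MLP) would follow the same fit/predict pattern, though it would lose the built-in uncertainty estimate that GPR provides.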
                                                                    B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer,