     Generalized Mirror Descents with Non-Convex
    Potential Functions in Atomic Congestion Games

                                            Po-An Chen

                              Institute of Information Management
                             National Chiao Tung University, Taiwan
                                     poanchen@nctu.edu.tw



       Abstract. When playing specific classes of no-regret algorithms (especially,
       multiplicative updates) in atomic congestion games, some previous convergence
       analyses were done with a standard Rosenthal potential function in terms of
       mixed strategy profiles (probability distributions on atomic flows), which may
       not be convex. In several other works, the convergence analysis was done with
       a convex potential function in terms of nonatomic flows as an approximation
       of the Rosenthal one in terms of distributions. It can be seen that, though with
       different techniques, the properties following from convexity help there, especially
       for bounding the convergence time.
       However, it is still a valid question to ask whether convergence can be
       guaranteed directly with the Rosenthal potential function when mirror descents
       are played individually in atomic congestion games. We answer this affirmatively
       by showing convergence when players individually play discrete mirror descents,
       with the help of the smoothness property similarly adopted in many previous works
       on congestion games and Fisher (and some more general) markets, and when they
       individually play continuous mirror descents, using the separability of the
       regularization functions.


1    Introduction

Playing learning algorithms in repeated games has been extensively studied within this
decade, especially with generic no-regret algorithms [3, 7] and various specific no-regret
algorithms [8, 9, 4, 5, 10]. Multiplicative updates are played in atomic congestion games
to reach pure Nash equilibria with high probability with full information in [8], and in
load-balancing games to converge to certain mixed Nash equilibria with bulletin-board
posting in [9]. The family of mirror descents [1], which includes multiplicative updates,
gradient descents, and many more classes of algorithms, is generalized and played with
bulletin-board posting, and even with only bandit feedback, in (respectively, nonatomic
and atomic) congestion games to guarantee convergence to approximate equilibria [4,
5].
    For specific classes of no-regret algorithms (especially, multiplicative updates), the
analysis in [8, 10] was done with a standard Rosenthal potential function in terms

Copyright © by the paper’s authors. Copying permitted for private and academic purposes.
In: A. Kononov et al. (eds.): DOOR 2016, Vladivostok, Russia, published at http://ceur-ws.org

of mixed strategy profiles (probability distributions on atomic flows), which may not
be convex. In [9, 4, 5] (even with mirror descents, more general than multiplicative
updates), the convergence analysis was done with a convex potential function in terms
of nonatomic flows as an approximation of the Rosenthal one in terms of distributions.1
It can be seen that, though with different techniques, the properties following from
convexity and others help there, especially for bounding the convergence time.

1 There is a tradeoff of an error from the nonlinearity of cost functions if the implication of
  reaching approximate equilibria is needed besides the convergence guarantee ([9, 5]).
    However, it is still a valid question to ask whether convergence can be guaranteed
directly with the Rosenthal potential function when mirror descents are played indi-
vidually in atomic congestion games. In this paper, we answer this affirmatively by
showing the convergence using discrete generalized mirror descents, with the help of the
smoothness property similarly adopted in [2, 6, 4, 5], and using continuous generalized
mirror descents, with the separability of the regularization functions as in [6].


2     Preliminaries
We need to formally define the game and potential function before we proceed. We
consider the following atomic congestion game, described by (N, E, (S_i)_{i∈N}, (c_e)_{e∈E}),
where N is the set of players, E is the set of m edges (resources), S_i ⊆ 2^E is the collection
of allowed paths (subsets of resources) for player i, and c_e is the cost function of edge
e, which is a nondecreasing function of the amount of flow on it. Let us assume that
there are n players, each player has at most d allowed paths, each path has length at
most m, each allowed path of a player intersects at most k allowed paths (including
that path itself) of that player, and each player has a flow of amount 1/n to route.
     The mixed strategy of each player i is to send her entire flow on a single path,
chosen randomly according to some distribution over her allowed paths, which can
be represented by a |S_i|-dimensional vector p_i = (p_{iγ})_{γ∈S_i}, where p_{iγ} ∈ [0, 1] is the
probability of choosing path γ. It turns out to be more convenient for us to represent
each player’s strategy p_i by an equivalent form x_i = (1/n)p_i, where 1/n is the amount
of flow each player has. That is, for every i ∈ N and γ ∈ S_i, we have that x_{iγ} =
(1/n)p_{iγ} ∈ [0, 1/n] and ∑_{γ∈S_i} x_{iγ} = 1/n. Let K_i denote the feasible set of all such
x_i ∈ [0, 1/n]^{|S_i|} for player i, and let K = K_1 × ... × K_n, which is the feasible set of all
such joint strategy profiles (x_1, ..., x_n) of the n players.
Γ_i is a random subset of resources (the random strategy of player i), and Γ_{-i} is the
vector (Γ_j)_{j∈N} with the entry Γ_i removed. We consider the following Rosenthal
potential function in terms of a mixed strategy profile p ([8, 10]).

\[
\Psi(p) = E_{\Gamma_{-i}}[\Phi_{-i}(\Gamma_{-i})] + E_{(\Gamma_{-i},\Gamma_i)}[c_i(\Gamma_i)] \tag{1}
\]
\[
\Psi(p) \equiv E_{\Gamma_{-i}}\Big[\sum_{e \in E} \sum_{j=1}^{K_i(e)} c_e(j)\Big] + E_{(\Gamma_{-i},\Gamma_i)}[c_i(\Gamma_i)], \tag{2}
\]
where K_i(e) is a random variable defined as K_e(Γ_{-i}) ≡ |{j : j ≠ i, e ∈ Γ_j}|. Let Γ_i
have value γ, which is a strategy for i, with probability p_{iγ}, and E_{(Γ_{-i},Γ_i)}[c_i(Γ_i)] is
defined as follows.
\[
E_{(\Gamma_{-i},\Gamma_i)}[c_i(\Gamma_i)] = \sum_{\gamma \in S_i} p_{i\gamma}\, c_{i\gamma}, \tag{3}
\]
where the expected individual cost for player i choosing γ, over the randomness from
the other players, is c_{iγ} = E_{Γ_{-i}}[c_i(γ, Γ_{-i})] = ∑_{e∈γ} E_{Γ_{-i}}[c_e(1 + K_i(e))].
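
To make the definitions above concrete, the following is a minimal, self-contained Python sketch (not part of the paper). The two-player instance, the linear edge costs c_e(j) = j, and all identifiers are illustrative assumptions, and the quantities in (1)-(3) are evaluated by brute-force enumeration, which is only feasible for tiny games.

```python
# A minimal numerical sketch (not from the paper): it instantiates a tiny two-player
# atomic congestion game in the notation of Section 2 and evaluates the expected
# individual cost c_{i,gamma} and the Rosenthal potential Psi(p) of (1)-(3) by exact
# enumeration over the other players' random path choices. The concrete instance,
# the linear costs c_e(j) = j, and all identifiers are illustrative assumptions.
from itertools import product

edges = ["e1", "e2", "e3"]
cost = {e: (lambda j: j) for e in edges}            # c_e(j): nondecreasing edge costs
paths = {                                           # S_i: allowed paths per player
    0: [("e1",), ("e2", "e3")],
    1: [("e1", "e2"), ("e3",)],
}
p = {0: [0.5, 0.5], 1: [0.25, 0.75]}                # mixed strategies p_i = (p_{i,gamma})

def other_profiles(i):
    """Enumerate (probability, load) over the random choices Gamma_{-i} of the others."""
    others = [j for j in paths if j != i]
    for choices in product(*(range(len(paths[j])) for j in others)):
        prob, load = 1.0, {}
        for j, k in zip(others, choices):
            prob *= p[j][k]
            for e in paths[j][k]:
                load[e] = load.get(e, 0) + 1        # K_i(e): number of *other* players on e
        yield prob, load

def expected_cost(i, gamma):
    """c_{i,gamma} = sum_{e in gamma} E[c_e(1 + K_i(e))]."""
    return sum(prob * sum(cost[e](1 + load.get(e, 0)) for e in gamma)
               for prob, load in other_profiles(i))

def rosenthal_potential(i):
    """Psi(p) as in (1)-(2): E[Phi_{-i}(Gamma_{-i})] + E[c_i(Gamma_i)]."""
    phi = sum(prob * sum(cost[e](j) for e in load for j in range(1, load[e] + 1))
              for prob, load in other_profiles(i))
    exp_ci = sum(p[i][k] * expected_cost(i, paths[i][k]) for k in range(len(paths[i])))
    return phi + exp_ci

# Both decompositions (with respect to i = 0 or i = 1) evaluate the same potential.
print(rosenthal_potential(0), rosenthal_potential(1))
```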
   We have the following properties:
\[
\frac{\partial \Psi(p)}{\partial p_{i\gamma}} = c_{i\gamma},
\]
\[
\frac{\partial^2 \Psi(p)}{\partial p_{i\gamma}\, \partial p_{j\gamma'}} = \sum_{e \in \gamma \cap \gamma'} \frac{\partial c_{i\gamma}}{\partial p_{j\gamma'}}.
\]
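The first of these can be seen directly from the decomposition in (1) and (3) (a short added check, not spelled out in the text above): the term E_{Γ_{-i}}[Φ_{-i}(Γ_{-i})] and the coefficients c_{iγ'} depend only on the other players' distributions p_{-i}, not on p_i, so
\[
\frac{\partial \Psi(p)}{\partial p_{i\gamma}}
 = \frac{\partial}{\partial p_{i\gamma}} \sum_{\gamma' \in S_i} p_{i\gamma'}\, c_{i\gamma'}
 = c_{i\gamma}.
\]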


3     Dynamics and Convergence
Discrete Generalized Mirror Descents
We consider the following discrete update rule for player i:
\[
p_i^{t+1} = \arg\min_{z_i \in K_i} \big\{ \eta_i \langle (c^t_{i\gamma})_{\gamma},\, z_i \rangle + B_{R_i}(z_i, p_i^t) \big\} \tag{4}
\]
\[
p_i^{t+1} = \arg\min_{z_i \in K_i} B_{R_i}\big(z_i,\; p_i^t - \eta_i \nabla_i \Psi(p^t)\big). \tag{5}
\]
The equality in (5) holds because (c^t_{iγ})_γ = ∇_i Ψ(p^t). Here, η_i > 0 is some learning rate,
R_i : K_i → R is some regularization function, and B_{R_i}(·, ·) is the Bregman divergence
with respect to R_i, defined as
\[
B_{R_i}(u_i, v_i) = R_i(u_i) - R_i(v_i) - \langle \nabla R_i(v_i),\, u_i - v_i \rangle
\]
for u_i, v_i ∈ K_i.
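
As a concrete illustration of update (4) (a sketch under an assumed special case, not the paper's implementation): when R_i is the negative entropy on player i's probability simplex, the Bregman divergence becomes the KL divergence and the minimizer of (4) has the familiar multiplicative-updates closed form p^{t+1}_{iγ} ∝ p^t_{iγ} exp(-η_i c^t_{iγ}). The cost vector is taken as fixed here; in the game it would be recomputed from the current profile p^t at every step.

```python
import numpy as np

def entropic_mirror_descent_step(p_i, c_i, eta_i):
    """One step of update (4) for player i on the probability simplex, in the special
    case R_i(z) = sum_gamma z_gamma log z_gamma (negative entropy), so that B_{R_i} is
    the KL divergence and the argmin is a multiplicative update. Inputs are assumptions:
    p_i is the current mixed strategy, c_i the cost vector (c^t_{i,gamma}), eta_i > 0."""
    p_i = np.asarray(p_i, dtype=float)
    w = p_i * np.exp(-eta_i * np.asarray(c_i, dtype=float))
    return w / w.sum()

# Toy usage: a player with three allowed paths and (fixed) expected path costs.
p = np.full(3, 1.0 / 3.0)
for _ in range(50):
    p = entropic_mirror_descent_step(p, [1.0, 0.5, 2.0], eta_i=0.1)
print(p)  # mass concentrates on the cheapest path when the costs are held fixed
```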
    We can ask about the actual convergence of the value of Ψ to some minimum, and
thus to an approximate equilibrium, and/or aim for convergence of the mixed strategy
profile p^t. The difficulty in dealing with such a Ψ is that it is not convex. There may be
no way to bound the convergence time, and there can be multiple minima to converge
to. Nevertheless, with the “smoothness” as defined in [2, 6] and [4, 5], restated in the
following, it is still possible to show convergence of Ψ and approximate equilibria in
atomic congestion games.
Definition 1 (Smoothness). We say that Ψ is λ-smooth with respect to (R_1, ..., R_n)
if for any two inputs p = (p_1, ..., p_n), p′ = (p′_1, ..., p′_n) ∈ K,
\[
\Psi(p') \le \Psi(p) + \langle \nabla \Psi(p),\, p' - p \rangle + \lambda \sum_{i=1}^{n} B_{R_i}(p'_i, p_i). \tag{6}
\]

Note that λ has to be properly set in different games.
The convergence follows from the assumption that, for each i, η_i ≤ 1/λ and thus
λ ≤ 1/η_i, and from the implication of the λ-smoothness condition (6).
Theorem 1. For any t ≥ 0, Ψ(p^{t+1}) ≤ Ψ(p^t).
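
For intuition, here is a brief sketch (an added reading aid under the stated assumptions, not the paper's proof) of how Theorem 1 can be obtained from (4), (6), and η_i ≤ 1/λ. Since p_i^t is feasible in (4) and B_{R_i}(p_i^t, p_i^t) = 0, the optimality of p_i^{t+1} gives
\[
\eta_i \langle \nabla_i \Psi(p^t),\, p_i^{t+1} - p_i^t \rangle \le -\, B_{R_i}(p_i^{t+1}, p_i^t),
\]
so ⟨∇_i Ψ(p^t), p_i^{t+1} - p_i^t⟩ ≤ -(1/η_i) B_{R_i}(p_i^{t+1}, p_i^t) ≤ -λ B_{R_i}(p_i^{t+1}, p_i^t). Summing over i and plugging into the λ-smoothness condition (6) with p′ = p^{t+1} and p = p^t,
\[
\Psi(p^{t+1}) \le \Psi(p^t) - \lambda \sum_{i=1}^{n} B_{R_i}(p_i^{t+1}, p_i^t) + \lambda \sum_{i=1}^{n} B_{R_i}(p_i^{t+1}, p_i^t) = \Psi(p^t).
\]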

Continuous Generalized Mirror Descents
                                                                              ∑
We consider the case where Ri is a separable function, i.e., it is of the form γ∈Si Ri (uiγ ).
The continuous update rule with respect to B Ri (·, ·) is defined as follows. Let
                                                            1
                     pi (ϵ) = arg min {ηi ⟨(ctiγ )γ , zi ⟩ + B Ri (zi , pti )}.               (7)
                                  zi ∈Ki                    ϵ
                      dpi t
                                   pi (ϵ) − pi
                                             t
                            = lim              .                                              (8)
                       dt     ϵ→0        ϵ
The following claim and the convexity of Ri give the convergence result.
                                                  dptiγ −∇iγ Ψ (pt )            t
                                                                                      ∑           t
Claim ( lemma 4.2 of [6]). For any i and γ,        dt = Ri′′ (ptiγ ) where Ri (pi ) =   γ∈Si Ri (piγ ).

Theorem 2. For any t ≥ 0, dΨ(p^t)/dt ≤ 0.
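
A short added calculation (not reproduced from [6]) of how Theorem 2 follows from the claim when each R_i is convex with R_i'' > 0: by the chain rule and the claim,
\[
\frac{d\Psi(p^t)}{dt}
 = \sum_{i} \sum_{\gamma \in S_i} \nabla_{i\gamma} \Psi(p^t)\, \frac{dp^t_{i\gamma}}{dt}
 = -\sum_{i} \sum_{\gamma \in S_i} \frac{\big(\nabla_{i\gamma} \Psi(p^t)\big)^2}{R_i''(p^t_{i\gamma})}
 \le 0.
\]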


4    Discussion and Future Work
More challengingly, we can also consider partial-information models with such a trickier
potential function Ψ. Does the bandit algorithm in [5] extend here to guarantee conver-
gence? Can the gradient still be estimated accurately (close to the real one with high
probability)? Are there other suitable gradient-estimation methods that work in
partial-information models less or more stringent than the bandit one?
    Another direction is applying our convergence results to the analysis of the average-case
price of anarchy, in the spirit of the application of “pointwise” convergence to the
average-case price of anarchy in [10].


References
 1. Amir Beck and Marc Teboulle. Mirror descent and nonlinear projected subgradient meth-
    ods for convex optimization. Operations Research Letters, 31(3):167–175, 2003.
 2. B. Birnbaum, N. Devanur, and L. Xiao. Distributed algorithms via gradient descent for
    Fisher markets. In Proc. 12th ACM Conference on Electronic Commerce, 2011.
 3. A. Blum, E. Even-Dar, and K. Ligett. Routing without regret: On convergence to Nash
    equilibria of regret-minimizing algorithms in routing games. In Proc. 25th ACM Sympo-
    sium on Principles of Distributed Computing, 2006.
 4. P.-A. Chen and C.-J. Lu. Generalized mirror descents in congestion games with splittable
    flows. In Proc. 13th International Conference on Autonomous Agents and Multiagent
    Systems, 2014.
 5. P.-A. Chen and C.-J. Lu. Playing congestion games with bandit feedbacks (extended
    abstract). In Proc. 14th International Conference on Autonomous Agents and Multiagent
    Systems, 2015.
 6. Y. K. Cheung, R. Cole, and N. Devanur. Tatonnement beyond gross substitutes?: gradient
    descent to the rescue. In Proc. 45th ACM Symposium on Theory of Computing, pages 191–
    200, 2013.
 7. E. Even-Dar, Y. Mansour, and U. Nadav. On the convergence of regret minimization dy-
    namics in concave games. In Proc. 41st Annual ACM Symposium on Theory of Computing,
    2009.

 8. R. Kleinberg, G. Piliouras, and E. Tardos. Multiplicative updates outperform generic
    no-regret learning in congestion games. In Proc. 41st ACM Symposium on Theory of
    Computing, 2009.
 9. R. Kleinberg, G. Piliouras, and E. Tardos. Load balancing without regret in the bulletin
    board model. Distributed Computing, 24, 2011.
10. I. Panageas and G. Piliouras. From pointwise convergence of evolutionary dynamics to av-
    erage case analysis of decentralized algorithms. 2015. https://arxiv.org/abs/1403.3885v3.