Numerical Simulation of a Game-Theoretic
 Model of Environmental Pollution Problem ? ??

    Marianna Troeva[0000−0001−7046−620X] and Vassili Lukin[0000−0001−8608−9402]

                  North-Eastern Federal University, Yakutsk, Russia
                       troeva@mail.ru, lukinvs14@mail.ru


        Abstract. In this paper, we present a numerical simulation of the game-
        theoretic model of an environmental pollution problem. This model is
        formalized by a noncooperative two-person differential game in Banach
        space with separated dynamics of the agents and continuous payoff func-
        tions depending on a game trajectory. The numerical simulation is based
        on the dynamic programming approach and the finite difference method.
        Some numerical results are provided for two-dimensional dynamic con-
        flict model of an environmental pollution problem.

        Keywords: Noncooperative differential games · Numerical simulation ·
        Environmental pollution.


1     Introduction

Application of the game-theoretic approach plays a special role in mathematical
modeling of environmental problems. The dynamic games that deal with similar
or related problems were investigated in many works, see e.g. [3–6, 9–12, 15] and
references therein.
    In [14], the existence of ε-Nash equilibrium in a dynamic conflict model of
an environmental pollution problem which is formalized by the noncooperative
n-person differential game in Banach space was proved.
    In this paper, we present a numerical simulation for the two-dimensional dy-
namic conflict model of an environmental pollution problem. Enterprises (agents)
contaminate a water reservoir by dumping a pollutant (harmful substance) of
the same type during the production process.
    This model is formalized by a noncooperative two-person differential game
in Banach space with separated dynamics of the agents and continuous payoff
functions depending on a game trajectory. For simplicity, we consider the case of
separated dynamics of agents, in which the dynamics of each agent is described
by the initial boundary value problem for the parabolic equation involving Dirac
?
   The work is supported by the Ministry of Science and Higher Education of the
   Russian Federation, supplementary agreement No. 075-02-2020-1543/1, April 29,
   2020.
??
   Copyright c 2020 for this paper by its authors. Use permitted under Creative Com-
   mons License Attribution 4.0 International (CC BY 4.0).
        Numerical simulation of a game-theoretic model of pollution problem        227

measure. The existence of ε-Nash equilibrium in our model follows from the
results of [14].
   The paper gives a formal mathematical description of the model and related
numerical method and presents simulation results.


2    The Model
A closed water reservoir (e.g. lake) is considered. Two enterprises dump a pol-
lutant (harmful substance) of the same type into this water reservoir during the
process of production. Both enterprises dump the pollutant at some prescribed
points of the reservoir.
    Furthermore, it is assumed that the reservoir has a water intake. The level
of pollution at the water intake point (the total concentration of the pollutant
released by all enterprises) must not exceed the maximum permissible value.
It is assumed that if this value is exceeded, all the enterprises will pay a fine
(penalty) as a percentage of their income. It is also assumed that the company
has certain expenses associated with the cleaning of the pollutant.
    The spread of the harmful substances in the reservoir occurs by diffusion.
Besides, a pollutant is decomposed with the rate r > 0.
    The total income of an enterprise depends on its volume of production, which
is tightly linked with its total volume of dumped pollutant. Besides, the total
income depends on the overall cleaning expenses and possible pollution fines.
The aim of each enterprise is to maximize the total income for finite period of
time.


3    Differential Game
We will study the differential two-person game Γ (c0 , T ) with a prescribed dura-
tion T < ∞ and an initial position of the game c0 . Let I = {i} = {1, 2} be a set
of the agents (enterprises).
    We will consider the closed water reservoir as a rectangle Ω = [0, d1 ] × [0, d2 ].
    Let z i (x1 , x2 , t) be the pollutant concentration of the agent i at the point
(x1 , x2 ) ∈ Ω at the moment t ∈ [0, T ]. Denote by (x̄i1 , x̄i2 ) ∈ Ω the prescribed
point of dumping the pollutant of the agent i ∈ I and by (xw , yw ) the coordinates
of the water intake location inside the domain Ω.
    Let us denote by ui (t) the intensity of dumping the pollutant of the agent
i ∈ I at the moment t. We assume that the intensity of dumping the pollutant
satisfies the following conditions:

                             0 ≤ ui (t) ≤ Gi (t),   i ∈ I,                         (1)

at any moment t ∈ [0, T ]. Here Gi (t) > 0 is a given square integrable function
which describes the maximal intensity of dumping the pollutant of the agent i at
the moment t. Let us assume that the production expenditures per unit product
of the enterprise i are constant and equal to Mi > 0, i ∈ I.
228      M. Troeva, V. Lukin

   The dynamics of the agent i = 1, 2 in the game Γ (c0 , T ) is described by
the initial boundary value problem for the following differential equation on the
domain Ω = (0, d1 ) × (0, d2 ) :

            ∂z i    ∂2zi  ∂2zi
                 = D 2 + D 2 − rz i + ui ψi (x1 , x2 ),                  x ∈ Ω, t > 0.               (2)
            ∂t      ∂x1   ∂x2

Here D > 0 is the diffusion coefficient; r > 0 is the coefficient characterizing the
pollution decomposition; ui = ui (t) is a control function of the agent i, satisfying
the condition (1). The function ψi (x1 , x2 ) = δ(x1 − x̄i1 , x2 − x̄i2 ) gives the location
of the agent i inside the domain Ω.
    Let the function z i (x1 , x2 , t) satisfies the following boundary conditions of
impenetrability:

                          ∂z i
                               = 0,       x1 = 0, x2 ∈ [0, d2 ], t > 0,                              (3)
                          ∂x1
                          ∂z i
                               = 0,       x1 = d1 , x2 ∈ [0, d2 ], t > 0,                            (4)
                          ∂x1
                          ∂z i
                               = 0,       x1 ∈ [0, d1 ], x2 = 0, t > 0,                              (5)
                          ∂x2
                          ∂z i
                               = 0,       x1 ∈ [0, d1 ], x2 = d2 , t > 0,                            (6)
                          ∂x2
and the following initial condition:

             z i (x1 , x2 , 0) = ci0 (x1 , x2 ),   x1 ∈ [0, d1 ], x2 ∈ [0, d2 ], t = 0,              (7)

where ci0 (x1 , x2 ) is some given function describing the initial distribution of the
pollutant concentration of the agent i in the water reservoir at the initial moment
t = 0.

Definition 1. A measurable function ui = ui (t), satisfying the condition (1)
for all t ∈ [0, T ] is called the admissible control of the agent i ∈ I. Let us denote
by U i ⊂ Lp (0, T ), i ∈ I, the set of admissible controls (measurable functions)
ui (t), t ∈ [0, T ].

    Let the function fi (t) ≥ 0 for all t ∈ [0, T ] determine the amount of the
penalty of the agent i for exceeding the maximum permissible value of pollution
at the water intake point (xw , yw ) as follows:
                                                                        2
       
                                                                              z j (xw , yw , t) ≤ Cw ,
                                                                        P
       
       
        0,                                                        if
                                                                        j=1
       
       
       
       
                               2
                                   z j (xw , yw , t) − Cw
                               P
fi (t)=
           z i (xw , yw , t)   j=1                                      2
                                                                              z j (xw , yw , t) > Cw ,
                                                                        P
                             ·
       
       
         2
                                                          ,        if
       
        P      j
                                          Cw                            j=1
        z (xw , yw , t)
       
       
         j=1
        Numerical simulation of a game-theoretic model of pollution problem            229

where Cw is the maximum permissible value of pollution at the water intake
point.
    Denote by qi = qi (t) the volume of production of the agent i at the moment
t. Let P̄i > 0 be the price of the product of the agent i.
    Assume that the intensity of dumping the pollutant ui (t) linearly depends
on the production volume of the agent i:

                                   ui (t) = αqi (t),   α > 0.

   Let p̄ > 0 be the payment for the discharge of a unit of pollutant. Then the
payoff of the agent i at time T is defined by the following functional:

                              ZT                                  ZT
              Hi (z, ui ) =        Pi ui (τ )(1 − pfi (τ ))dτ −        Mi ui (τ )dτ,    (8)
                              0                                   0

where z = (z 1 , z 2 ), Pi = P̄i /α, p = p̄/Pi , Mi > 0 is the production expenditures
per unit product of the enterprise i. The goal of the agent i is to maximize Hi (·).
    Game Γ (c0 , T ) is a particular case of a noncooperative n-person differential
game in Banach space that was studied in [14]. Below, for the sake of complete-
ness, let us recall the main results of [14].
    The dynamics of the agent i ∈ I = {i} = {1, . . . , n} in the n-person differ-
ential game Γ (c0 , T ) is described by the initial boundary value problem for the
following differential equation:

               ∂z i                       ∂z i                    ∂z i
                                                                    
                          ∂                        ∂
                      =        D(x, y, t)        +     D(x, y, t)        −
                ∂t       ∂x               ∂x       ∂y             ∂y
                      −rz i + ui ψi (x, y),       (x, y) ∈ Ω ∈ R2 , t > 0.              (9)

Here D(x, y, t) > 0 is the diffusion coefficient; r > 0 is the coefficient char-
acterizing the pollution decomposition; ui ∈ Ui is a control parameter of the
agent i, Ui ⊂ Rmi is a compact set in Euclidean space. The function ψi (x, y) =
δ(x − xi , y − yi ) gives the location of the agent i inside the domain Ω.
    Let the function z i (x, y, t) satisfies the following boundary and initial condi-
tions:
                               ∂z i
                    D(t, x, y)      = 0,      (x, y) ∈ S, t ∈ [0, T ],            (10)
                               ∂m
                    z i (x, y, 0) = ci0 (x, y),    (x, y) ∈ Ω,         t = 0,          (11)
where m is an outward normal to the boundary surface S ×[0, T ], ci0 (x, y) is some
given function describing the initial distribution of the pollutant concentration
of the agent i in the water reservoir at the initial moment t = 0.
    Let us represent the problem (9)–(11) as the initial-value problem for the
following operator-differential equation

                        dci (t)
                                − A(t)ci (t) = ν i (t), t ∈ [0, T ],                   (12)
                          dt
230      M. Troeva, V. Lukin


                                         ci (0) = z0i = ci0 ,                                   (13)
where ci (t) = z i (x, y, t), ν i (t) = ui (t) · δ(x − xi , y − yi ). The operator Ac =
∂x (D(t, x, y)cx ) + ∂y (D(t, x, y)cy ) − rc allows for the boundary condition (10).
    The equation (12) involves the Dirac measure. The existence of a unique
solution of abstract parabolic evolution equations involving Banach space-valued
Radon measures is proved in [1].
    We assume that the coeffitients D and r in (9)–(11) satisfy the following con-
ditions D(t, x, y) ∈ C([0, T ]; C 1 (Ω)), r ∈ L∞ (0, T ; Lq (Ω)). According to results
of [1], the unique solution ci ∈ Lp (0, T ; Wp1 (Ω)), cit ∈ Lp (0, T ; (Wq1 (Ω))0 ), i ∈ I
of the problem (12)–(13) exists for all ν i ∈ Lp (0, T ; (Wq1 (Ω))0 ), for all admissible
                                                                                  2/p−1
control ui ∈ U i ⊂ Lp (0, T ), and for all initial condition ci0 ∈ (Wq             (Ω))0 and
q > 2 (1/p + 1/q = 1).
     Let us denote by Fi (ci0 , t0 , t) the set of the points ci (·) ∈ Wp1 (Ω) for which
there exists an admissible control ui (t) such that the game goes from the state
ci (t0 ) = ci0 to the state ci (t + t0 ) for the time interval [t0 , t]. The set Fi (ci0 , t0 , t)
is a bounded set of the space Wp1 (Ω). It is known [7] that if the boundary of the
domain Ω is smooth then a bounded set of the space Wp1 (Ω) is a compact set in
                                                                                       2/p−1
Lp (Ω). This implies that Fi (ci0 , t0 , t) is a compact set for all ci0 ∈ (Wq         (Ω))0 ,
                                                                    2/p−1
t0 , t ∈ [0, T ] as well. Fi (ci0 , t0 , t0 ) = ci0 for all ci0 ∈ (Wq     (Ω))0 , t0 ∈ [0, T ].
                 i
The set Fi (c0 , t0 , t) has semigroup property.
     The set Fi (ci0 , t0 , t) is called the attainability set of the player i, i = 1, n
from the initial state ci0 on the time interval [t0 , t].
     Let us denote by Fbi (ci0 , t0 , t), i ∈ I the set of trajectories ĉi (ci0 , t − t0 ) of
(12)–(13) which start at ci0 at the moment t0 and which are defined on the time
interval [t0 , t]. The set of trajectories Fbi (ci0 , t0 , t) is compact e.g. in the space
Lp (0, T ; Wp1−s (Ω)) for any s > 0, and the function Fbi (ci0 , t0 , t) is continuous in
the corresponding Hausdorff metric.
     At every moment t ∈ [0, T ] of the game Γ (c0 , T ) the agents know the realized
trajectory of the game, the dynamics and the duration T of the game.
     Let ĉi (·) ∈ Fbi (ci0 , 0, T ) be the trajectory of (12)–(13) arising from a control
ui and Πδi (ĉi ) be the trajectory arising from the same control ui delayed by δT .
The following lemma describes the relation between these trajectories.

Lemma 1. For each δ ∈ (0, 1] there exists a map Πδi : Fbi (ci0 , 0, T ) → Fbi (·) such
that, if ĉi (τ ) = ĉ0i (τ ) for τ ∈ [0, t], then Πδi (ĉi )(τ ) = Πδi (ĉ0i )(τ ) for τ ∈ [0, t+δT ]
if (t + δT ) ≤ T and Πδi (ĉi )(τ ) = Πδi (ĉ0i )(τ ) for τ ∈ [0, T ] if (t + δT ) > T .
Moreover,
                               εi (δ) = sup kĉi − Πδi (ĉi )k −→ 0.
                                                                   δ→0
                                        ĉi ∈F
                                             bi (·)

   Let us fix the permutation p = (i1 , . . . , ik , . . . , in ) and consider n-person
multistep game Γpδ (c0 , T ) at every step which the agents i1 , . . . , in choose in
sequence controls ui1 , . . . , uik , . . . , uin .
         Numerical simulation of a game-theoretic model of pollution problem               231

Definition 2. The strategy
                                                    Y
                           δ
                               ϕpik : Fbi∗k (·) =           Fbj (·) → Fbik (·),
                                                    j6=ik


of the agent ik in the game Γpδ (c0 , T ) is a mapping such that if ĉj (τ ) = ĉ0j (τ )
for j < ik , τ ∈ [0, lδT ] and if ĉj (τ ) = ĉ0j (τ ) for j > ik , τ ∈ [0, (l − 1)δT ], then
                               0
δ p
 ϕik (ĉ∗ik (τ )) = δ ϕpik (ĉ∗ ik (τ )), τ ∈ [0, lδT ]. Here δ = 1/2N , l = 1, 2, . . . , 2N .

    Let us denote by δ Φpik the set of the strategies of the agent ik in the game
Γp (c0 , T ). In the game Γpδ (c0 , T ) the players i1 , ..., in choose in sequence the
  δ

strategies δ ϕpi1 , . . . , δ ϕpin . The trajectory χ(δ ϕp ) is uniquely defined for every n-
tuple δ ϕp = (δ ϕpi1 , . . . , δ ϕpin ) stepwise on successive intervals [0, δT ], . . ., [T −
δT, T ]. The payoff function of the agent i ∈ I in the game Γpδ (c0 , T ) is defined
as follows:
                                  Hiδ (c0 , δ ϕp ) = Hi (χδ (δ ϕp )),                     (14)
here Hi (·) is the functional of the same kind as (8).
    So, the n-person differential game Γpδ (c0 , T ) with the prescribed duration T
is defined in a normal form:

                            Γpδ (c0 , T ) = hI, {δ Φpi }n1 , {Hiδ }n1 i.

   Using the Zermelo-Neumann theorem, the existence of ε-equilibrium for any
ε > 0 in the multistep game Γpδ (c0 , T ) can be proved.
   The previous Lemma 1 implies the following lemma.
                                                                              pi
Lemma 2. If ik > i1 , δ ϕpik ∈ δ Φpik , then Πδik · δ ϕpik ∈ δ Φikk , where pik = (ik , pe),
pe is a permutation of the set I \ ik ; moreover, for b   c∗ik ∈ Fbi∗k (·)

                    kδ ϕpik (b
                             c∗ik ) − (Πδik · δ ϕpik )(b
                                                       c∗ik )k ≤ ε(δ) −→ 0.
                                                                              δ→0

    The following lemma from [8] is valid.

Lemma 3. Let the game ΓH 0 = hI, {Xi0 }n1 , {Hi0 }n1 i be obtained from the game
ΓH = hI, {Xi }n1 , {Hi }n1 i by the epimorphic mapping αi : Xi → Xi0 , i = 1, . . . , n,
with
           kH(x) − H 0 (αx)k ≤ ε,           αx = (α1 (x1 ), . . . , αn (xn )).
Then, if x is an ε-equilibrium of the game ΓH , then αx is the 3ε-equilibrium of
the game ΓH 0 .

    Let us define the main game Γ (c0 , T ).

Definition 3. The pair (δi , {δ ϕpi i }δ=1/2N ) is called the strategy of the agent i.
Here N ∈ Z, δi is a range of dyadic partition of the time interval [0, T ] and
δ pi
 ϕi is the strategy of the agent i in the game Γpδi (c0 , T ) for the permutation
pi = (i, pe) and pe is the permutation of the set I \ i.
232     M. Troeva, V. Lukin

    For n-tuple ϕ = (ϕ1 , . . . , ϕn ) the game Γ (c0 , T ) is played as follows. The
smallest δi = δ is chosen and the trajectory χ(·) is constructed for n-tuple
δ
  ϕ = (δ ϕp11 , . . . , δ ϕpnn ). This trajectory is unique.
    The game Γ (c0 , T ) is obtained from the game Γpδ (c0 , T ) by the epimorphic
mapping which is defined in Lemma 2. Since in the game Γpδ (c0 , T ) there exists
ε-equilibrium, then the existence of the 3ε-equilibrium in the game Γ (c0 , T )
follows from Lemma 2 and Lemma 3.
    Thus, the following theorem is valid.
Theorem 1. There exists ε-equilibrium in the noncooperative n-person differ-
ential game Γ (c0 , T ) for all ε > 0.


4     Numerical Method
The numerical method based on the dynamic programming method [2] and the
finite difference scheme [13] is proposed for the numerical solving of the auxuliary
multistep game Γpδ (c0 , T ).
    To construct a difference scheme for our problem, we use an approach that
was developed in [13] for constructing a homogeneous difference scheme for the
nonstationary heat conduction problem with a one-point heat source that is
defined by a Dirac delta function expression.
    On the rectangle Ω = [0, d1 ] × [0, d2 ], we construct the uniform grid with the
step h1 on x1 and the step h2 on x2

               ω h = {x1l = lh1 , l = 0, . . . , N1 ; x10 = 0, x1N1 = d1 ;
                                                                                        (15)
                      x2k = kh2 ; k = 0, . . . , N2 ; x20 = 0, x2N2 = d2 } ,

For simplicity, the points of the agents’ dumps are assumed to be grid nodes,
namely (x1l , x2k ) = (x̄i1 , x̄i2 ) is a location of the agent i, i = 1, 2.
   On every interval [ts , ts+1 ], s = 0, Nσ − 1, we construct the uniform net with
step τ
                ω τ,s = {t̄j = jτ, j = 0, N3 ; t̄0 = ts , t̄N3 = ts+1 }.
Here ts ∈ σ, where σ is the time interval partition

                         σ = {t0 = 0 < t1 < . . . < tNσ = T }.

    For numerical simulations, we will consider the admissible control parameters
            1     2            1            2
set Ui = [U i , U i ], where U i = const, U i = const, i = 1, 2, and construct the
following partition on the set Ui :
                                 1                             2
               ∆i = {ui,0 = U i < ui,1 < . . . < ui,N4 = U i }, i = 1, 2.

    On the time interval [ts , ts+1 ] we will consider grid functions i y s (x1l , x2k , t̄j )
instead of functions z i (x1 , x2 , t) of continuous arguments (x1 , x2 , t) ∈ Ω×[ts , ts+1 ],
where i ∈ I – the agent’s number, s = 0, Nσ − 1. Here the argument of the grid
function (x1l , x2k , t̄j ) is a node of the grid ω hτ = ω h × ω τ,s .
        Numerical simulation of a game-theoretic model of pollution problem                             233

                        j,s
    Let us denote by i yl,k = i y s (x1l , x2k , t̄j ) the grid function defined on the net
ω hτ = ω h × ω τ,s .
    We construct for the problem (2)–(7) the following purely implicit difference
schemes [13] on the net ω hτ = ω h × ω τ,s for any pair of admissible controls
(u1,ξ1 , u2,ξ2 ) ∈ ∆1 × ∆2 , ξi ∈ 0, N4 , i = 1, 2:
            i j+1/2,s      j,s          i j+1/2,s           j+1/2,s
             y0,k     − i y0,k           y1,k       − i y0,k              j,s
                                      = 2D                          − i y0,k  r,                        (16)
                    τ                               h21
             i j+1/2,s      j,s       i j+1/2,s             j+1/2,s      j+1/2,s
              yl,k     − i yl,k        yl+1,k − 2i yl,k             + i yl−1,k
                                =D                                               −
                    τ                                        h21
                                     j,s              i
                                − i yl,k r + usi,ξi δ l,k ,          l = 1, N1 − 1,                     (17)
           i j+1/2,s    j,s                       i j+1/2,s    j+1/2,s
            yN1 ,k − i yN 1 ,k
                                                   yN1 ,k − i yN1 −1,k      j,s
                                      = −2D                            − i yN 1 ,k
                                                                                   r,                   (18)
                      τ                                     h2
                                                                                        k = 0, N2 ,


        i j+1,s    j+1/2,s         i j+1,s      j+1,s
         yl,0 − i yl,0              yl,1 − i yl,0               j+1/2,s
                                   = 2D                    − i yl,0     r,                              (19)
                τ                          h22
        i j+1,s    j+1/2,s       i j+1,s        j+1,s          j+1,s
         yl,k − i yl,k            yl,k+1 − 2i yl,k       + i yl,k−1
                           =D                                        −
                τ                              h22
                                j+1/2,s              i
                           − i yl,k     r + usi,ξi δ l,k ,        k = 1, N2 − 1,                        (20)
        i j+1,s     j+1/2,s                     i j+1,s     j+1,s
         yl,N2 − i yl,N2                         yl,N2 − i yl,N2 −1      j+1/2,s
                                   = −2D                            − i yl,N2 r,                        (21)
                  τ                                     h22
                                                                                       l = 0, N1 ,
                                                                                       j = 0, N3 − 1,

                           i 0,s      N3 ,s−1
                            yl,k = i yl,k     ,       l = 0, N1 , k = 0, N2 ,                           (22)
                                                ξi = 0, N4 ,
                                              s = 1, Nσ − 1,

                          i 0,0
                           yl,k = ci0 (x1l , x2k ),     l = 0, N1 , k = 0, N2 ,                         (23)
Here                                  
                              i
                                       0,        if (x1l , x2k ) 6= (x̄i1 , x̄i2 ),
                            δ l,k =        1
                                                , if (x1l , x2k ) = (x̄i1 , x̄i2 ),
                                          h1 h2
                                      

   The constructed absolutely stable difference scheme (16)–(23) is solved by
the elimination method [13].
234        M. Troeva, V. Lukin

   The payoff function of the agent i, i = 1, 2 in the game Γpδ (c0 , T ) is approx-
imated as follows:
                                                                    σ −1 N
                                                                   NX    X2 −1
                                σ −1                    σ −1
          H i (u01 , . . . , uN
                              1      , u02 , . . . , uN
                                                      2      )=τ                 Pi usi (1 − pfij ) −
                                                                   s=0     j=0
                                                                    σ −1 N
                                                                   NX    X2 −1

                                                           −τ                    Mi usi ,                 (24)
                                                                   s=0     j=0


   Let V δi (·) be the value of the payoff function of the agent i, i = 1, 2 at the
equilibrium point. The following recurrence equations are valid:


                                                                         σ −1   Nσ −1
       V δi (1 y Nσ −1 , 2 y Nσ −1 , tNσ −1 ) =        max      {H i (uN
                                                                       i,ξi , ū{I\i},ξ{I\i} )},          (25)
                                                  uN σ −1 ∈∆
                                                   i,ξi     i


    V δi (1 y s , 2 y s , ts ) = smax {H i (usi,ξi , ūs{I\i},ξ{I\i} ) + V δi (1 yξs+1
                                                                                    1
                                                                                       , 2 yξs+1
                                                                                              2
                                                                                                 , ts+1 )},
                            ui,ξ ∈∆i
                                 i

                                                                       s = Nσ − 2, 0,                     (26)


                                        V δi (1 y 0 , 2 y 0 , tNσ ) = 0,                                  (27)

Here
                                               2 −1
                                              NX                                   2 −1
                                                                                  NX
        H i (usi,ξi , ūs{I\i},ξ{I\i} ) = τ           Pi usi,ξi (1 − pfij ) − τ             Mi usi,ξi .   (28)
                                              j=0                                  j=0


5      Numerical Results
The numerical experiments were carried out for the following input data: D =
4.4, d1 = d2 = 30, h = 0.5, T = 320, τ = 1, Nσ = 8, r = 0.005, p = 3.1,
M1 = 4.5, M2 = 5.5, P1(2) = 15, U1i = {1, 10, 20, 30}, U2j = {1, 10, 20, 30},
(x̄11 , x̄12 ) = (5, 5), (x̄21 , x̄22 ) = (25, 25), (x̄1w , x̄2w ) = (15, 5), Cw = 3, c0 (x) = 0.
      The results of numerical experiments are presented in Figures 1–5.
      Figure 1 presents the computed distribution of the pollutant concentration
of the first and second agents. The realizations of their optimal strategies are
shown in Fig. 2. The agents reduce the intensity of dumping the pollutant (stop
production) to minimize fines. Then the intensity of production is resumed to
the maximum values for obtaining the maximum income by the time T . The
time development of the payoff functions for both agents is given in Fig. 3. The
dynamics of changing the pollutant concentration at the intake point is presented
in the Fig. 4. The agent pays the penalty (Fig. 5) in the case of exceeding the
maximum permissible value Cw .
        Numerical simulation of a game-theoretic model of pollution problem          235


Fig. 1. Distribution of the pollutant concentration of agents at the moment T : 1 – light
color, 2 – dark.


 Fig. 2. Optimal strategies of agents: 1 – blue dashed line, 2 – red dot-dashed line.
236     M. Troeva, V. Lukin


   Fig. 3. Payoff functions of agents: 1 – blue dashed line, 2 – red dot-dashed line.


Fig. 4. The pollutant concentrations at the point of intake: total – green dotted line,
1 – blue dashed line, 2 – red dot-dashed line.
         Numerical simulation of a game-theoretic model of pollution problem          237


    Fig. 5. The penalty functions f (t): 1 – blue dashed line, 2 – red dot-dashed line.


6     Conclusion
We investigated the noncooperative two-person differential game in Banach space
which models a conflict-controlled process of the contaminating a closed water
reservoir. The dynamics of the agents is described by the initial boundary value
problem for the two-dimensional diffusion equation with a point source.
    The proposed numerical algorithm for solving the considered differential
game is based on the dynamic programming method and the finite difference
scheme. It has been applied to compute the auxiliary multistep game.


Acknowledgement
The authors are grateful to the anonymous reviewers for their valuable comments
and suggestions which helped to improve the quality of the paper.


References
 1. Amann, H.: Nonautonomous parabolic equations involving measures. J. Math. Sci.
    130(4), 4780–4802 (2005)
 2. Bellman, R.: Dynamic programming. Inostrannaya Literatura, Moscow (1960)
 3. Botkin, N.D., Hoffmann, K.-H., Turova V.L.: Optimal control of ice formation
    in living cells during freezing. Applied Mathematical Modelling 35(8), 4044–4057
    (2011)
 4. Breton, M., Zaccour, G., Zahaf, M.: A Differential game of joint implementation
    of environmental projects. Automatica 41(10), 1737–1749 (2005)
 5. Jorgensen, S., Zaccour, G.: Incentive equilibrium strategies and welfare allocation
    in a dynamic game of pollution control. Automatica 3(1), 29–36 (2001)
238     M. Troeva, V. Lukin

 6. Krasovskii, N.A., Tarasyev, A.M.: Decomposition algorithm of searching equilibria
    in the dynamical game. Matematicheskaya Teoriya Igr i Ee Prilozheniya 3(4), 49–
    88 (2011)
 7. Ladyzhenskaya, O.A.: The boundary-value problems of mathematical physics.
    Springer-Verlag New York Inc., New York (1985)
 8. Malafeyev, O.A.: Dynamical control systems of conflict. St. Petersburg University
    Press, St. Petersburg (2000) (in Russian)
 9. Martin-Herran, G., Cartigny, P., Motte, E., Tidball M.: Deforestation and foreign
    transfers: A Stackelberg differential game approach. Computers and Operations
    Research 33(2), 386–400 (2006)
10. Masoudi, N., Santugini, M., Zaccour, G.: A dynamic game of emissions pollution
    with uncertainty and learning. Environmental Resource Economics 64(3), 349–372
    (2016)
11. Mazalov, V.V., Rettieva, A.N.: Fish wars and cooperation maintenance. Ecological
    Modelling 221, 1545–1553 (2010)
12. Petrosjan, L.A., Zakharov, V.V.: Mathematical models in environment policy anal-
    ysis. Nova Science Publishers Inc., New York (1997)
13. Samarskii, A.A.: The theory of difference schemes. Marcel Dekker Inc., New York-
    Basel (2001)
14. Troeva, M., Lukin, V.: On a game-theoretic model of environmental pollution prob-
    lem. In: Advances in Dynamic Games, vol. 13. pp. 223–236. Birkhäuser, Cham
    (2013)
15. Zenkevich, N., Kozlovskaya, N.: Stable Shapley value in cooperative game of ter-
    ritorial enviromental production. Matematicheskaya Teoriya Igr i Ee Prilozheniya
    2, 67–92 (2010)