Numerical Simulation of a Game-Theoretic Model of Environmental Pollution Problem ? ?? Marianna Troeva[0000−0001−7046−620X] and Vassili Lukin[0000−0001−8608−9402] North-Eastern Federal University, Yakutsk, Russia troeva@mail.ru, lukinvs14@mail.ru Abstract. In this paper, we present a numerical simulation of the game- theoretic model of an environmental pollution problem. This model is formalized by a noncooperative two-person differential game in Banach space with separated dynamics of the agents and continuous payoff func- tions depending on a game trajectory. The numerical simulation is based on the dynamic programming approach and the finite difference method. Some numerical results are provided for two-dimensional dynamic con- flict model of an environmental pollution problem. Keywords: Noncooperative differential games · Numerical simulation · Environmental pollution. 1 Introduction Application of the game-theoretic approach plays a special role in mathematical modeling of environmental problems. The dynamic games that deal with similar or related problems were investigated in many works, see e.g. [3–6, 9–12, 15] and references therein. In [14], the existence of ε-Nash equilibrium in a dynamic conflict model of an environmental pollution problem which is formalized by the noncooperative n-person differential game in Banach space was proved. In this paper, we present a numerical simulation for the two-dimensional dy- namic conflict model of an environmental pollution problem. Enterprises (agents) contaminate a water reservoir by dumping a pollutant (harmful substance) of the same type during the production process. This model is formalized by a noncooperative two-person differential game in Banach space with separated dynamics of the agents and continuous payoff functions depending on a game trajectory. For simplicity, we consider the case of separated dynamics of agents, in which the dynamics of each agent is described by the initial boundary value problem for the parabolic equation involving Dirac ? The work is supported by the Ministry of Science and Higher Education of the Russian Federation, supplementary agreement No. 075-02-2020-1543/1, April 29, 2020. ?? Copyright c 2020 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). Numerical simulation of a game-theoretic model of pollution problem 227 measure. The existence of ε-Nash equilibrium in our model follows from the results of [14]. The paper gives a formal mathematical description of the model and related numerical method and presents simulation results. 2 The Model A closed water reservoir (e.g. lake) is considered. Two enterprises dump a pol- lutant (harmful substance) of the same type into this water reservoir during the process of production. Both enterprises dump the pollutant at some prescribed points of the reservoir. Furthermore, it is assumed that the reservoir has a water intake. The level of pollution at the water intake point (the total concentration of the pollutant released by all enterprises) must not exceed the maximum permissible value. It is assumed that if this value is exceeded, all the enterprises will pay a fine (penalty) as a percentage of their income. It is also assumed that the company has certain expenses associated with the cleaning of the pollutant. The spread of the harmful substances in the reservoir occurs by diffusion. Besides, a pollutant is decomposed with the rate r > 0. The total income of an enterprise depends on its volume of production, which is tightly linked with its total volume of dumped pollutant. Besides, the total income depends on the overall cleaning expenses and possible pollution fines. The aim of each enterprise is to maximize the total income for finite period of time. 3 Differential Game We will study the differential two-person game Γ (c0 , T ) with a prescribed dura- tion T < ∞ and an initial position of the game c0 . Let I = {i} = {1, 2} be a set of the agents (enterprises). We will consider the closed water reservoir as a rectangle Ω = [0, d1 ] × [0, d2 ]. Let z i (x1 , x2 , t) be the pollutant concentration of the agent i at the point (x1 , x2 ) ∈ Ω at the moment t ∈ [0, T ]. Denote by (x̄i1 , x̄i2 ) ∈ Ω the prescribed point of dumping the pollutant of the agent i ∈ I and by (xw , yw ) the coordinates of the water intake location inside the domain Ω. Let us denote by ui (t) the intensity of dumping the pollutant of the agent i ∈ I at the moment t. We assume that the intensity of dumping the pollutant satisfies the following conditions: 0 ≤ ui (t) ≤ Gi (t), i ∈ I, (1) at any moment t ∈ [0, T ]. Here Gi (t) > 0 is a given square integrable function which describes the maximal intensity of dumping the pollutant of the agent i at the moment t. Let us assume that the production expenditures per unit product of the enterprise i are constant and equal to Mi > 0, i ∈ I. 228 M. Troeva, V. Lukin The dynamics of the agent i = 1, 2 in the game Γ (c0 , T ) is described by the initial boundary value problem for the following differential equation on the domain Ω = (0, d1 ) × (0, d2 ) : ∂z i ∂2zi ∂2zi = D 2 + D 2 − rz i + ui ψi (x1 , x2 ), x ∈ Ω, t > 0. (2) ∂t ∂x1 ∂x2 Here D > 0 is the diffusion coefficient; r > 0 is the coefficient characterizing the pollution decomposition; ui = ui (t) is a control function of the agent i, satisfying the condition (1). The function ψi (x1 , x2 ) = δ(x1 − x̄i1 , x2 − x̄i2 ) gives the location of the agent i inside the domain Ω. Let the function z i (x1 , x2 , t) satisfies the following boundary conditions of impenetrability: ∂z i = 0, x1 = 0, x2 ∈ [0, d2 ], t > 0, (3) ∂x1 ∂z i = 0, x1 = d1 , x2 ∈ [0, d2 ], t > 0, (4) ∂x1 ∂z i = 0, x1 ∈ [0, d1 ], x2 = 0, t > 0, (5) ∂x2 ∂z i = 0, x1 ∈ [0, d1 ], x2 = d2 , t > 0, (6) ∂x2 and the following initial condition: z i (x1 , x2 , 0) = ci0 (x1 , x2 ), x1 ∈ [0, d1 ], x2 ∈ [0, d2 ], t = 0, (7) where ci0 (x1 , x2 ) is some given function describing the initial distribution of the pollutant concentration of the agent i in the water reservoir at the initial moment t = 0. Definition 1. A measurable function ui = ui (t), satisfying the condition (1) for all t ∈ [0, T ] is called the admissible control of the agent i ∈ I. Let us denote by U i ⊂ Lp (0, T ), i ∈ I, the set of admissible controls (measurable functions) ui (t), t ∈ [0, T ]. Let the function fi (t) ≥ 0 for all t ∈ [0, T ] determine the amount of the penalty of the agent i for exceeding the maximum permissible value of pollution at the water intake point (xw , yw ) as follows: 2  z j (xw , yw , t) ≤ Cw , P    0, if j=1      2 z j (xw , yw , t) − Cw P fi (t)= z i (xw , yw , t) j=1 2 z j (xw , yw , t) > Cw , P ·    2 , if   P j Cw j=1  z (xw , yw , t)   j=1 Numerical simulation of a game-theoretic model of pollution problem 229 where Cw is the maximum permissible value of pollution at the water intake point. Denote by qi = qi (t) the volume of production of the agent i at the moment t. Let P̄i > 0 be the price of the product of the agent i. Assume that the intensity of dumping the pollutant ui (t) linearly depends on the production volume of the agent i: ui (t) = αqi (t), α > 0. Let p̄ > 0 be the payment for the discharge of a unit of pollutant. Then the payoff of the agent i at time T is defined by the following functional: ZT ZT Hi (z, ui ) = Pi ui (τ )(1 − pfi (τ ))dτ − Mi ui (τ )dτ, (8) 0 0 where z = (z 1 , z 2 ), Pi = P̄i /α, p = p̄/Pi , Mi > 0 is the production expenditures per unit product of the enterprise i. The goal of the agent i is to maximize Hi (·). Game Γ (c0 , T ) is a particular case of a noncooperative n-person differential game in Banach space that was studied in [14]. Below, for the sake of complete- ness, let us recall the main results of [14]. The dynamics of the agent i ∈ I = {i} = {1, . . . , n} in the n-person differ- ential game Γ (c0 , T ) is described by the initial boundary value problem for the following differential equation: ∂z i ∂z i ∂z i     ∂ ∂ = D(x, y, t) + D(x, y, t) − ∂t ∂x ∂x ∂y ∂y −rz i + ui ψi (x, y), (x, y) ∈ Ω ∈ R2 , t > 0. (9) Here D(x, y, t) > 0 is the diffusion coefficient; r > 0 is the coefficient char- acterizing the pollution decomposition; ui ∈ Ui is a control parameter of the agent i, Ui ⊂ Rmi is a compact set in Euclidean space. The function ψi (x, y) = δ(x − xi , y − yi ) gives the location of the agent i inside the domain Ω. Let the function z i (x, y, t) satisfies the following boundary and initial condi- tions: ∂z i D(t, x, y) = 0, (x, y) ∈ S, t ∈ [0, T ], (10) ∂m z i (x, y, 0) = ci0 (x, y), (x, y) ∈ Ω, t = 0, (11) where m is an outward normal to the boundary surface S ×[0, T ], ci0 (x, y) is some given function describing the initial distribution of the pollutant concentration of the agent i in the water reservoir at the initial moment t = 0. Let us represent the problem (9)–(11) as the initial-value problem for the following operator-differential equation dci (t) − A(t)ci (t) = ν i (t), t ∈ [0, T ], (12) dt 230 M. Troeva, V. Lukin ci (0) = z0i = ci0 , (13) where ci (t) = z i (x, y, t), ν i (t) = ui (t) · δ(x − xi , y − yi ). The operator Ac = ∂x (D(t, x, y)cx ) + ∂y (D(t, x, y)cy ) − rc allows for the boundary condition (10). The equation (12) involves the Dirac measure. The existence of a unique solution of abstract parabolic evolution equations involving Banach space-valued Radon measures is proved in [1]. We assume that the coeffitients D and r in (9)–(11) satisfy the following con- ditions D(t, x, y) ∈ C([0, T ]; C 1 (Ω)), r ∈ L∞ (0, T ; Lq (Ω)). According to results of [1], the unique solution ci ∈ Lp (0, T ; Wp1 (Ω)), cit ∈ Lp (0, T ; (Wq1 (Ω))0 ), i ∈ I of the problem (12)–(13) exists for all ν i ∈ Lp (0, T ; (Wq1 (Ω))0 ), for all admissible 2/p−1 control ui ∈ U i ⊂ Lp (0, T ), and for all initial condition ci0 ∈ (Wq (Ω))0 and q > 2 (1/p + 1/q = 1). Let us denote by Fi (ci0 , t0 , t) the set of the points ci (·) ∈ Wp1 (Ω) for which there exists an admissible control ui (t) such that the game goes from the state ci (t0 ) = ci0 to the state ci (t + t0 ) for the time interval [t0 , t]. The set Fi (ci0 , t0 , t) is a bounded set of the space Wp1 (Ω). It is known [7] that if the boundary of the domain Ω is smooth then a bounded set of the space Wp1 (Ω) is a compact set in 2/p−1 Lp (Ω). This implies that Fi (ci0 , t0 , t) is a compact set for all ci0 ∈ (Wq (Ω))0 , 2/p−1 t0 , t ∈ [0, T ] as well. Fi (ci0 , t0 , t0 ) = ci0 for all ci0 ∈ (Wq (Ω))0 , t0 ∈ [0, T ]. i The set Fi (c0 , t0 , t) has semigroup property. The set Fi (ci0 , t0 , t) is called the attainability set of the player i, i = 1, n from the initial state ci0 on the time interval [t0 , t]. Let us denote by Fbi (ci0 , t0 , t), i ∈ I the set of trajectories ĉi (ci0 , t − t0 ) of (12)–(13) which start at ci0 at the moment t0 and which are defined on the time interval [t0 , t]. The set of trajectories Fbi (ci0 , t0 , t) is compact e.g. in the space Lp (0, T ; Wp1−s (Ω)) for any s > 0, and the function Fbi (ci0 , t0 , t) is continuous in the corresponding Hausdorff metric. At every moment t ∈ [0, T ] of the game Γ (c0 , T ) the agents know the realized trajectory of the game, the dynamics and the duration T of the game. Let ĉi (·) ∈ Fbi (ci0 , 0, T ) be the trajectory of (12)–(13) arising from a control ui and Πδi (ĉi ) be the trajectory arising from the same control ui delayed by δT . The following lemma describes the relation between these trajectories. Lemma 1. For each δ ∈ (0, 1] there exists a map Πδi : Fbi (ci0 , 0, T ) → Fbi (·) such that, if ĉi (τ ) = ĉ0i (τ ) for τ ∈ [0, t], then Πδi (ĉi )(τ ) = Πδi (ĉ0i )(τ ) for τ ∈ [0, t+δT ] if (t + δT ) ≤ T and Πδi (ĉi )(τ ) = Πδi (ĉ0i )(τ ) for τ ∈ [0, T ] if (t + δT ) > T . Moreover, εi (δ) = sup kĉi − Πδi (ĉi )k −→ 0. δ→0 ĉi ∈F bi (·) Let us fix the permutation p = (i1 , . . . , ik , . . . , in ) and consider n-person multistep game Γpδ (c0 , T ) at every step which the agents i1 , . . . , in choose in sequence controls ui1 , . . . , uik , . . . , uin . Numerical simulation of a game-theoretic model of pollution problem 231 Definition 2. The strategy Y δ ϕpik : Fbi∗k (·) = Fbj (·) → Fbik (·), j6=ik of the agent ik in the game Γpδ (c0 , T ) is a mapping such that if ĉj (τ ) = ĉ0j (τ ) for j < ik , τ ∈ [0, lδT ] and if ĉj (τ ) = ĉ0j (τ ) for j > ik , τ ∈ [0, (l − 1)δT ], then 0 δ p ϕik (ĉ∗ik (τ )) = δ ϕpik (ĉ∗ ik (τ )), τ ∈ [0, lδT ]. Here δ = 1/2N , l = 1, 2, . . . , 2N . Let us denote by δ Φpik the set of the strategies of the agent ik in the game Γp (c0 , T ). In the game Γpδ (c0 , T ) the players i1 , ..., in choose in sequence the δ strategies δ ϕpi1 , . . . , δ ϕpin . The trajectory χ(δ ϕp ) is uniquely defined for every n- tuple δ ϕp = (δ ϕpi1 , . . . , δ ϕpin ) stepwise on successive intervals [0, δT ], . . ., [T − δT, T ]. The payoff function of the agent i ∈ I in the game Γpδ (c0 , T ) is defined as follows: Hiδ (c0 , δ ϕp ) = Hi (χδ (δ ϕp )), (14) here Hi (·) is the functional of the same kind as (8). So, the n-person differential game Γpδ (c0 , T ) with the prescribed duration T is defined in a normal form: Γpδ (c0 , T ) = hI, {δ Φpi }n1 , {Hiδ }n1 i. Using the Zermelo-Neumann theorem, the existence of ε-equilibrium for any ε > 0 in the multistep game Γpδ (c0 , T ) can be proved. The previous Lemma 1 implies the following lemma. pi Lemma 2. If ik > i1 , δ ϕpik ∈ δ Φpik , then Πδik · δ ϕpik ∈ δ Φikk , where pik = (ik , pe), pe is a permutation of the set I \ ik ; moreover, for b c∗ik ∈ Fbi∗k (·) kδ ϕpik (b c∗ik ) − (Πδik · δ ϕpik )(b c∗ik )k ≤ ε(δ) −→ 0. δ→0 The following lemma from [8] is valid. Lemma 3. Let the game ΓH 0 = hI, {Xi0 }n1 , {Hi0 }n1 i be obtained from the game ΓH = hI, {Xi }n1 , {Hi }n1 i by the epimorphic mapping αi : Xi → Xi0 , i = 1, . . . , n, with kH(x) − H 0 (αx)k ≤ ε, αx = (α1 (x1 ), . . . , αn (xn )). Then, if x is an ε-equilibrium of the game ΓH , then αx is the 3ε-equilibrium of the game ΓH 0 . Let us define the main game Γ (c0 , T ). Definition 3. The pair (δi , {δ ϕpi i }δ=1/2N ) is called the strategy of the agent i. Here N ∈ Z, δi is a range of dyadic partition of the time interval [0, T ] and δ pi ϕi is the strategy of the agent i in the game Γpδi (c0 , T ) for the permutation pi = (i, pe) and pe is the permutation of the set I \ i. 232 M. Troeva, V. Lukin For n-tuple ϕ = (ϕ1 , . . . , ϕn ) the game Γ (c0 , T ) is played as follows. The smallest δi = δ is chosen and the trajectory χ(·) is constructed for n-tuple δ ϕ = (δ ϕp11 , . . . , δ ϕpnn ). This trajectory is unique. The game Γ (c0 , T ) is obtained from the game Γpδ (c0 , T ) by the epimorphic mapping which is defined in Lemma 2. Since in the game Γpδ (c0 , T ) there exists ε-equilibrium, then the existence of the 3ε-equilibrium in the game Γ (c0 , T ) follows from Lemma 2 and Lemma 3. Thus, the following theorem is valid. Theorem 1. There exists ε-equilibrium in the noncooperative n-person differ- ential game Γ (c0 , T ) for all ε > 0. 4 Numerical Method The numerical method based on the dynamic programming method [2] and the finite difference scheme [13] is proposed for the numerical solving of the auxuliary multistep game Γpδ (c0 , T ). To construct a difference scheme for our problem, we use an approach that was developed in [13] for constructing a homogeneous difference scheme for the nonstationary heat conduction problem with a one-point heat source that is defined by a Dirac delta function expression. On the rectangle Ω = [0, d1 ] × [0, d2 ], we construct the uniform grid with the step h1 on x1 and the step h2 on x2 ω h = {x1l = lh1 , l = 0, . . . , N1 ; x10 = 0, x1N1 = d1 ; (15) x2k = kh2 ; k = 0, . . . , N2 ; x20 = 0, x2N2 = d2 } , For simplicity, the points of the agents’ dumps are assumed to be grid nodes, namely (x1l , x2k ) = (x̄i1 , x̄i2 ) is a location of the agent i, i = 1, 2. On every interval [ts , ts+1 ], s = 0, Nσ − 1, we construct the uniform net with step τ ω τ,s = {t̄j = jτ, j = 0, N3 ; t̄0 = ts , t̄N3 = ts+1 }. Here ts ∈ σ, where σ is the time interval partition σ = {t0 = 0 < t1 < . . . < tNσ = T }. For numerical simulations, we will consider the admissible control parameters 1 2 1 2 set Ui = [U i , U i ], where U i = const, U i = const, i = 1, 2, and construct the following partition on the set Ui : 1 2 ∆i = {ui,0 = U i < ui,1 < . . . < ui,N4 = U i }, i = 1, 2. On the time interval [ts , ts+1 ] we will consider grid functions i y s (x1l , x2k , t̄j ) instead of functions z i (x1 , x2 , t) of continuous arguments (x1 , x2 , t) ∈ Ω×[ts , ts+1 ], where i ∈ I – the agent’s number, s = 0, Nσ − 1. Here the argument of the grid function (x1l , x2k , t̄j ) is a node of the grid ω hτ = ω h × ω τ,s . Numerical simulation of a game-theoretic model of pollution problem 233 j,s Let us denote by i yl,k = i y s (x1l , x2k , t̄j ) the grid function defined on the net ω hτ = ω h × ω τ,s . We construct for the problem (2)–(7) the following purely implicit difference schemes [13] on the net ω hτ = ω h × ω τ,s for any pair of admissible controls (u1,ξ1 , u2,ξ2 ) ∈ ∆1 × ∆2 , ξi ∈ 0, N4 , i = 1, 2: i j+1/2,s j,s i j+1/2,s j+1/2,s y0,k − i y0,k y1,k − i y0,k j,s = 2D − i y0,k r, (16) τ h21 i j+1/2,s j,s i j+1/2,s j+1/2,s j+1/2,s yl,k − i yl,k yl+1,k − 2i yl,k + i yl−1,k =D − τ h21 j,s i − i yl,k r + usi,ξi δ l,k , l = 1, N1 − 1, (17) i j+1/2,s j,s i j+1/2,s j+1/2,s yN1 ,k − i yN 1 ,k yN1 ,k − i yN1 −1,k j,s = −2D − i yN 1 ,k r, (18) τ h2 k = 0, N2 , i j+1,s j+1/2,s i j+1,s j+1,s yl,0 − i yl,0 yl,1 − i yl,0 j+1/2,s = 2D − i yl,0 r, (19) τ h22 i j+1,s j+1/2,s i j+1,s j+1,s j+1,s yl,k − i yl,k yl,k+1 − 2i yl,k + i yl,k−1 =D − τ h22 j+1/2,s i − i yl,k r + usi,ξi δ l,k , k = 1, N2 − 1, (20) i j+1,s j+1/2,s i j+1,s j+1,s yl,N2 − i yl,N2 yl,N2 − i yl,N2 −1 j+1/2,s = −2D − i yl,N2 r, (21) τ h22 l = 0, N1 , j = 0, N3 − 1, i 0,s N3 ,s−1 yl,k = i yl,k , l = 0, N1 , k = 0, N2 , (22) ξi = 0, N4 , s = 1, Nσ − 1, i 0,0 yl,k = ci0 (x1l , x2k ), l = 0, N1 , k = 0, N2 , (23) Here  i  0, if (x1l , x2k ) 6= (x̄i1 , x̄i2 ), δ l,k = 1 , if (x1l , x2k ) = (x̄i1 , x̄i2 ), h1 h2  The constructed absolutely stable difference scheme (16)–(23) is solved by the elimination method [13]. 234 M. Troeva, V. Lukin The payoff function of the agent i, i = 1, 2 in the game Γpδ (c0 , T ) is approx- imated as follows: σ −1 N NX X2 −1 σ −1 σ −1 H i (u01 , . . . , uN 1 , u02 , . . . , uN 2 )=τ Pi usi (1 − pfij ) − s=0 j=0 σ −1 N NX X2 −1 −τ Mi usi , (24) s=0 j=0 Let V δi (·) be the value of the payoff function of the agent i, i = 1, 2 at the equilibrium point. The following recurrence equations are valid: σ −1 Nσ −1 V δi (1 y Nσ −1 , 2 y Nσ −1 , tNσ −1 ) = max {H i (uN i,ξi , ū{I\i},ξ{I\i} )}, (25) uN σ −1 ∈∆ i,ξi i V δi (1 y s , 2 y s , ts ) = smax {H i (usi,ξi , ūs{I\i},ξ{I\i} ) + V δi (1 yξs+1 1 , 2 yξs+1 2 , ts+1 )}, ui,ξ ∈∆i i s = Nσ − 2, 0, (26) V δi (1 y 0 , 2 y 0 , tNσ ) = 0, (27) Here 2 −1 NX 2 −1 NX H i (usi,ξi , ūs{I\i},ξ{I\i} ) = τ Pi usi,ξi (1 − pfij ) − τ Mi usi,ξi . (28) j=0 j=0 5 Numerical Results The numerical experiments were carried out for the following input data: D = 4.4, d1 = d2 = 30, h = 0.5, T = 320, τ = 1, Nσ = 8, r = 0.005, p = 3.1, M1 = 4.5, M2 = 5.5, P1(2) = 15, U1i = {1, 10, 20, 30}, U2j = {1, 10, 20, 30}, (x̄11 , x̄12 ) = (5, 5), (x̄21 , x̄22 ) = (25, 25), (x̄1w , x̄2w ) = (15, 5), Cw = 3, c0 (x) = 0. The results of numerical experiments are presented in Figures 1–5. Figure 1 presents the computed distribution of the pollutant concentration of the first and second agents. The realizations of their optimal strategies are shown in Fig. 2. The agents reduce the intensity of dumping the pollutant (stop production) to minimize fines. Then the intensity of production is resumed to the maximum values for obtaining the maximum income by the time T . The time development of the payoff functions for both agents is given in Fig. 3. The dynamics of changing the pollutant concentration at the intake point is presented in the Fig. 4. The agent pays the penalty (Fig. 5) in the case of exceeding the maximum permissible value Cw . Numerical simulation of a game-theoretic model of pollution problem 235 Fig. 1. Distribution of the pollutant concentration of agents at the moment T : 1 – light color, 2 – dark. Fig. 2. Optimal strategies of agents: 1 – blue dashed line, 2 – red dot-dashed line. 236 M. Troeva, V. Lukin Fig. 3. Payoff functions of agents: 1 – blue dashed line, 2 – red dot-dashed line. Fig. 4. The pollutant concentrations at the point of intake: total – green dotted line, 1 – blue dashed line, 2 – red dot-dashed line. Numerical simulation of a game-theoretic model of pollution problem 237 Fig. 5. The penalty functions f (t): 1 – blue dashed line, 2 – red dot-dashed line. 6 Conclusion We investigated the noncooperative two-person differential game in Banach space which models a conflict-controlled process of the contaminating a closed water reservoir. The dynamics of the agents is described by the initial boundary value problem for the two-dimensional diffusion equation with a point source. The proposed numerical algorithm for solving the considered differential game is based on the dynamic programming method and the finite difference scheme. It has been applied to compute the auxiliary multistep game. Acknowledgement The authors are grateful to the anonymous reviewers for their valuable comments and suggestions which helped to improve the quality of the paper. References 1. Amann, H.: Nonautonomous parabolic equations involving measures. J. Math. Sci. 130(4), 4780–4802 (2005) 2. Bellman, R.: Dynamic programming. Inostrannaya Literatura, Moscow (1960) 3. Botkin, N.D., Hoffmann, K.-H., Turova V.L.: Optimal control of ice formation in living cells during freezing. Applied Mathematical Modelling 35(8), 4044–4057 (2011) 4. Breton, M., Zaccour, G., Zahaf, M.: A Differential game of joint implementation of environmental projects. Automatica 41(10), 1737–1749 (2005) 5. Jorgensen, S., Zaccour, G.: Incentive equilibrium strategies and welfare allocation in a dynamic game of pollution control. Automatica 3(1), 29–36 (2001) 238 M. Troeva, V. Lukin 6. Krasovskii, N.A., Tarasyev, A.M.: Decomposition algorithm of searching equilibria in the dynamical game. Matematicheskaya Teoriya Igr i Ee Prilozheniya 3(4), 49– 88 (2011) 7. Ladyzhenskaya, O.A.: The boundary-value problems of mathematical physics. Springer-Verlag New York Inc., New York (1985) 8. Malafeyev, O.A.: Dynamical control systems of conflict. St. Petersburg University Press, St. Petersburg (2000) (in Russian) 9. Martin-Herran, G., Cartigny, P., Motte, E., Tidball M.: Deforestation and foreign transfers: A Stackelberg differential game approach. Computers and Operations Research 33(2), 386–400 (2006) 10. Masoudi, N., Santugini, M., Zaccour, G.: A dynamic game of emissions pollution with uncertainty and learning. Environmental Resource Economics 64(3), 349–372 (2016) 11. Mazalov, V.V., Rettieva, A.N.: Fish wars and cooperation maintenance. Ecological Modelling 221, 1545–1553 (2010) 12. Petrosjan, L.A., Zakharov, V.V.: Mathematical models in environment policy anal- ysis. Nova Science Publishers Inc., New York (1997) 13. Samarskii, A.A.: The theory of difference schemes. Marcel Dekker Inc., New York- Basel (2001) 14. Troeva, M., Lukin, V.: On a game-theoretic model of environmental pollution prob- lem. In: Advances in Dynamic Games, vol. 13. pp. 223–236. Birkhäuser, Cham (2013) 15. Zenkevich, N., Kozlovskaya, N.: Stable Shapley value in cooperative game of ter- ritorial enviromental production. Matematicheskaya Teoriya Igr i Ee Prilozheniya 2, 67–92 (2010)