=Paper= {{Paper |id=Vol-3003/short7 |storemode=property |title=Intelligent Decision-Making in Conditions of Uncertainty in Games with Nature |pdfUrl=https://ceur-ws.org/Vol-3003/short7.pdf |volume=Vol-3003 |authors=Pavlo Pyrohov,Ievgen Meniailov |dblpUrl=https://dblp.org/rec/conf/profitai/PyrohovM21 }} ==Intelligent Decision-Making in Conditions of Uncertainty in Games with Nature== https://ceur-ws.org/Vol-3003/short7.pdf
Intelligent Decision-Making in Conditions of Uncertainty in
Games with Nature
Pavlo Pyrohov, Ievgen Meniailov
  National Aerospace University “Kharkiv Aviation Institute”, Chkalow str., 17, Kharkiv, Ukraine

                 Abstract
                 This paper considers games with nature. The theory of games with nature and its concepts
                 are described. Artificial intelligence methods are applied to develop methods of decision-making
                 in conditions of uncertainty. The Wald criterion, the optimism criterion, the pessimism
                 criterion and the Savage criterion are described. The developed methods are compared.
                 It is shown that the behavior of the game functions with respect to the winnings a
                 corresponds to the first operator (max or min) in the name of the criterion.

                 Keywords
                 Game with nature, artificial intelligence, decision-making, conditions of uncertainty.

1. Introduction
    Games with nature are mathematical models in which the outcome of a decision depends on
objective reality, for example, customer demand, the state of nature, etc. [1-2].
    “Nature” is a generalized concept of an adversary who does not pursue its own goals in a given
conflict [3].
    To apply this theory, it is necessary to be able to represent conflicts in the form of games [4]. A
characteristic feature of any conflict is that none of the parties involved knows in advance, exactly and
completely, all of its own possible solutions, those of the other parties, or their future behavior;
therefore, each party is forced to act in conditions of uncertainty [5]. The uncertainty of the outcome can be
due both to the conscious actions of active opponents and to unconscious, passive manifestations, for
example, of the elemental forces of nature: rain, sun, wind, avalanche, etc. In such cases, an
accurate prediction of the outcome is impossible [6]. In some conflicts, the opposing
side is a consciously and purposefully acting adversary who is interested in our defeat,
deliberately prevents success, and seeks victory by any means [7-8]. In other conflicts, there is no
such conscious enemy, and only the so-called "blind forces of nature" operate [9]: weather
conditions [10], the state of trade equipment at the enterprise [11], illness of employees [12], the
instability of the economic situation [13], market conditions [14], the dynamics of exchange rates
[15], the level of inflation [16], tax policy [17], changing consumer demand, etc. During the global
COVID-19 pandemic [18], games with nature can be used for decision-making on preventive and
control measures to contain the epidemic [19]. In such cases, nature is not malicious and
acts passively, sometimes to the detriment of man and sometimes to his benefit, but its state and
manifestation can significantly affect the result of the activity [20].
    In such games, a person tries to act prudently, for example, by using a strategy that yields
the least loss. The second player (nature) acts unintentionally and completely at random; its possible
strategies (nature's strategies) are known. Such situations are investigated using the theory of
statistical decisions [21]. There may, however, be situations in which nature really acts as a
player, for example, circumstances associated with weather conditions or with natural elemental
forces. Man's play with nature also reflects a conflict situation that arises when interests clash in


International Workshop of IT-professionals on Artificial Intelligence (ProfIT AI 2021), September 20-21, 2021, Kharkiv, Ukraine
EMAIL: pavel.vogorip@gmail.com (P. Pyrohov); evgenii.menyailov@gmail.com (I. Meniailov).
ORCID: 0000-0002-6100-4406 (P. Pyrohov); 0000-0002-9440-8378 (I. Meniailov).
            ©️ 2021 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)
choosing a solution. But “the elemental forces of nature” cannot be credited with reasonable actions
directed against a person, let alone with any “malicious intent” [22]. Thus, it is more correct to
talk about a conflict situation caused by a clash of human interests and the uncertainty of nature's
actions, but without an obvious antagonistic coloring [23]. Situations in which the risk is associated
not with the conscious opposition of the other side (the environment), but with the decision-maker's
insufficient awareness of its behavior or state, are investigated using the theory of statistical
decisions.
    The aim of the research is to investigate methods of decision-making in conditions of uncertainty in
games with nature.

2. Materials and methods
   The matrix of a game with nature is

                                                  $A = \|a_{ij}\|$,                                      (1)

where aij is the payoff of player 1 when he uses his pure strategy i and player 2 (nature) uses
pure strategy j (i = 1, ..., m; j = 1, …, n).
   All possible states P1, P2, ..., Pn of nature P are considered; nature assumes them randomly,
regardless of the actions of player A and without malicious opposition to his strategies. Nature can be in
only one of these states, and which one is unknown, although in some cases the
probabilities of these states may be known:

                                        $q_j = p(P_j), \quad j = 1, \dots, n,$                           (2)

                                        $\sum_{j=1}^{n} q_j = 1.$                                        (3)

    The possible strategies A1, A2, ..., Am of player A and his payoffs aij ≥ 0 for each of the strategies Ai and each
of the states of nature Pj are also known. These winnings can be shown in the form of a payoff matrix
(Table 1).

Table 1
Matrix of winnings
                                Pj
                                            P1                    P2             ...               Pn
                  Ai
                         А1                а11                    а12            ...               а1n
    (aij) =              А2                а21                    a22            ...               a2n
                         ...                ...                   ...            ...               ...
                         Аm                аm1                    am2            ...               amn
                         qj                 q1                    q2             ...               qn

    The bottom row of the matrix shows the probabilities qj of the states of nature Pj, j = 1, ..., n.
    Suppose that player A, not knowing the state of nature, chose strategy Ai. If nature has assumed the
state Pj, then the payoff of player A will be aij. But if player A knew in advance that nature would take
the state Pj, then he would choose the strategy Ai0 that achieves the greatest payoff in this state:

                                        $\beta_j = a_{i_0 j} = \max_{1 \le i \le m} a_{ij}.$             (4)
   The difference
                                        $r_{ij} = \beta_j - a_{ij}$                                      (5)

between the payoff βj of player A under the known state of nature Pj and the payoff aij obtained when player A
does not know the state of nature is called the risk under the strategy Ai and the state of nature Pj.
Thus, the risk rij is that part of the greatest payoff βj in the state of nature Pj which player A did not
win by applying strategy Ai through ignorance of the state of nature.
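
The construction of the risks from the payoffs can be illustrated with a minimal Python sketch. The 3×3 payoff matrix is made up for illustration, and the names `risk_matrix` and `beta` (the column maxima) are assumptions of this sketch, not notation taken from the paper.

```python
def risk_matrix(payoffs):
    """Return (beta, risks) for a payoff matrix given as a list of rows.

    beta[j]     -- greatest payoff attainable in state P_j (column maximum)
    risks[i][j] -- beta[j] - payoffs[i][j], the payoff lost by playing A_i
    """
    beta = [max(column) for column in zip(*payoffs)]
    risks = [[b - a for a, b in zip(row, beta)] for row in payoffs]
    return beta, risks


# Hypothetical payoffs: rows are strategies A1..A3, columns are states P1..P3.
payoffs = [[4, 3, 6],
           [2, 7, 5],
           [8, 1, 3]]

beta, risks = risk_matrix(payoffs)
print(beta)   # [8, 7, 6]
print(risks)  # [[4, 4, 0], [6, 0, 1], [0, 6, 3]]
```

Every risk is non-negative, and every column of the risk matrix contains at least one zero, namely in the row of the strategy that is best for that state of nature.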

Table 2
Risk matrix
                                Pj
                                           P1                 P2                 ...                Pn
                 Аi
                        A1                 r11                r12                ...                r1n
    (rij) =             A2                 r21                r22                ...                r2n
                        ...                ...                ...                ...                ...
                        Am                 rm1                rm2                ...                rmn
                         qj                q1                 q2                 ...                qn

    The last row shows the probabilities qj of the states of nature, j = 1, …, n. Since 0 ≤ aij ≤ βj,
where βj is the greatest payoff in the state of nature Pj (the right inequality follows from (4)),
we obtain from (5) that 0 ≤ rij ≤ βj.
    The probability qj of the state of nature Pj is also the probability of the payoff aij and of the risk rij for
each strategy Ai, i = 1, …, m. Therefore, each strategy Ai can be interpreted as a discrete random
variable that takes values equal to the winnings ai1, …, ain or the risks ri1, …, rin with the
corresponding probabilities q1, …, qn.
    Player A's task is to choose the optimal strategy from the possible strategies A1, ..., Am. The
optimality of a strategy can be understood in various senses and is determined by various criteria.
    The result of the game generally depends on three numerical parameters: the payoffs a of player A,
the risks r that appear when player A chooses a particular strategy, and the probabilities q of states of
nature. The desire to “fold” these three parameters into one indicator leads to a numerical function
of these three parameters. We denote it by G (a, r, q) and call it the game function. The nature
of the dependence of the game function G on a, r and q is motivated by the logic of the applied
criterion. The values

                          $G_{ij} = G(a_{ij}, r_{ij}, q_j), \quad i = 1, \dots, m; \; j = 1, \dots, n,$  (6)

of the game function will be called the indicators of the game. These indicators form the
matrix of the game (Table 3).

Table 3
Game matrix
                                Pj
                                            P1                 P2                 ...               Pn
                  Ai
                         A1                G11                 G12                ...              G1n
    (Gij) =              A2                G21                 G22                ...              G2n
                         ...                ...                ...                ...               ...
                         Am                Gm1                Gm2                 ...              Gmn
   The vector argument criterion  assumes the assignment of some numerical function

                                                                          ,                              (7)

whose value

                                                                                                         (8)

will be called the indicator of the strategy Ai.
   Then, among the indicators Gi of strategies Ai, an extreme one is selected. For some criteria, this is
the maximum value: Ext = max, and for others, the minimum: Ext = min. If Ext = max, then the
indicator Gi is called the optimality indicator of the strategy Ai; if Ext = min, then Gi is called
the non-optimality indicator of the strategy Ai. The optimal strategy Ai0 is the one for which

                                        $G_{i_0} = \operatorname{Ext}_{1 \le i \le m} G_i.$              (9)

   Applying the described scheme, we will form some classes of criteria.
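
The two-step scheme (aggregate each row of game indicators into a strategy indicator, then take an extremum over strategies) can be sketched as a small generic Python function. This is only an illustration under stated assumptions: the function name `choose_strategy`, its parameters, and the example data are hypothetical, and ties are resolved by taking the first extremal strategy, which the paper does not specify.

```python
def choose_strategy(payoffs, probs, game_fn, inner, outer):
    """Generic two-step selection.

    game_fn(a, r, q) -- game function G
    inner            -- aggregates one row of G_ij into the indicator G_i
    outer            -- the extremum Ext (max or min) over the indicators
    Returns (index of the selected strategy, list of indicators G_i).
    """
    beta = [max(column) for column in zip(*payoffs)]  # best payoff per state
    indicators = []
    for row in payoffs:
        # The risk is r_ij = beta_j - a_ij.
        g_row = [game_fn(a, b - a, q) for a, b, q in zip(row, beta, probs)]
        indicators.append(inner(g_row))
    # list.index returns the first extremal strategy if there are ties.
    return indicators.index(outer(indicators)), indicators


payoffs = [[4, 3, 6], [2, 7, 5], [8, 1, 3]]   # hypothetical payoff matrix
probs = [0.5, 0.25, 0.25]                      # hypothetical probabilities

# Payoff-only game function, worst-case inner step, Ext = max.
best, indicators = choose_strategy(payoffs, probs, lambda a, r, q: a, min, max)
print(best, indicators)  # 0 [3, 2, 1]
```

Passing the aggregation and the extremum as ordinary functions keeps the four classes of criteria (max min, min max, max max, min min) as four argument combinations of one routine.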

3. Results
   For maximin criteria (extreme pessimism),

                                                  $\operatorname{Ext} = \max,$                          (10)

and indicators of strategies Ai are determined as follows:

                                        $G_i = \min_{1 \le j \le n} G_{ij}, \quad i = 1, \dots, m,$     (11)

and, by (10), they are indicators of the optimality of strategies.
    Thus, Gi is the worst indicator of the game under the strategy Ai. Hence it follows that the game
function G (a, r, q) should be non-decreasing in the payoff a and non-increasing in the risk r.
    The indicators of the game are also influenced by the probabilities q of states of nature. For
example, if the smallest (worst) payoff aij for strategy Ai has a sufficiently small probability qj, then it is
no longer advisable to treat it as the smallest one. For this payoff to remain the smallest in practice,
it should have a sufficiently high probability. With risks, the situation is mirrored: for the greatest (worst)
risk rij under strategy Ai to remain the greatest in practice, its probability should also be large enough.
This suggests that the game function should be non-increasing in the probability q.
    So, the logic of the maximin criterion determines the behavior of the game function depending on
the payoff a, risk r and probability q (↑ denotes non-decreasing, ↓ denotes non-increasing):

                                           G (a, r, q) ↑ by a; ↓ by r; ↓ by q                           (12)

   For convenience, in what follows, for the maximin criterion, we denote the game function G by W,
the indicators of the game Gij by Wij, and the optimality indicators Gi of strategies Ai by Wi.
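   Thus, for the maximin criterion, the game function satisfies

                                           W (a, r, q) ↑ by a; ↓ by r; ↓ by q,                          (13)
   the indicators of the game are

                                        $W_{ij} = W(a_{ij}, r_{ij}, q_j),$                              (14)

   and the optimality indicators of the strategies are

                                        $W_i = \min_{1 \le j \le n} W_{ij}.$                            (15)

   The strategy Ai0 is considered optimal according to the maximin criterion if

                    $W_{i_0} = \max_{1 \le i \le m} W_i = \max_{1 \le i \le m} \min_{1 \le j \le n} W_{ij}.$   (16)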
    The maximin criterion is a criterion for the extreme pessimism of a person who chooses a strategy,
since it orients him to the worst manifestation of the state of nature for him and, as a consequence, to
very careful behavior when making a decision.
    The specific function of the game W (a, r, q) can be chosen in different ways, but with the
indispensable requirement of possessing properties (13).
    Examples of maximin criteria with specific functions of the game W (a, r, q) are the following
criteria:

                                                    W(a,r,q) = a;                                       (17)
                                                    W(a,r,q) = (1-q)a;                                  (18)
                                                    W(a,r,q) = a-r;                                     (19)
                                                    W(a,r,q) = (1-q)a-qr.                               (20)

    Each of these functions possesses properties (13), which can be checked by the sign of the partial
derivatives.
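
As a worked instance of this check, the partial derivatives of the game function of criterion (20) can be written out explicitly. The signs rely only on facts stated earlier in the paper: the payoffs and risks are non-negative, and q is a probability.

```latex
% Criterion (20): W(a, r, q) = (1 - q) a - q r,  with a >= 0, r >= 0, 0 <= q <= 1
\begin{align*}
\frac{\partial W}{\partial a} &= 1 - q \ge 0, \\
\frac{\partial W}{\partial r} &= -q \le 0, \\
\frac{\partial W}{\partial q} &= -a - r = -(a + r) \le 0.
\end{align*}
```

So W is non-decreasing in a and non-increasing in r and q, as required by (13); criteria (17)-(19) are checked in the same way.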
    In criterion (17), the indicators of the game are the winnings: Wij = aij, and therefore it takes
into account neither the risks nor the probabilities of the states of nature. Criterion (17) is Wald's
criterion, which makes it possible to justify the choice of a solution in conditions of complete
uncertainty, that is, when the probabilities of states of nature are unknown [24]. Criterion (18) takes into account
the gains and probabilities of states of nature, but does not take into account the risks. Criterion (19)
takes into account the gains and risks without considering the probabilities of states of nature.
Criterion (20) takes into account the gains, risks, and probabilities of states of nature.
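
The four maximin criteria (17)-(20) can be compared on a small example. The following Python sketch is illustrative only: the payoff matrix, the probabilities, and the helper name `maximin` are assumptions made for the example, and ties are broken in favor of the first strategy.

```python
def maximin(payoffs, probs, w):
    """Return (index of the maximin strategy, indicators W_i) for game function w."""
    beta = [max(column) for column in zip(*payoffs)]
    # W_i = min_j W(a_ij, r_ij, q_j) with r_ij = beta_j - a_ij
    indicators = [
        min(w(a, b - a, q) for a, b, q in zip(row, beta, probs))
        for row in payoffs
    ]
    best = max(range(len(indicators)), key=indicators.__getitem__)
    return best, indicators


criteria = {
    "(17) a": lambda a, r, q: a,                                  # Wald
    "(18) (1-q)a": lambda a, r, q: (1 - q) * a,
    "(19) a-r": lambda a, r, q: a - r,
    "(20) (1-q)a-qr": lambda a, r, q: (1 - q) * a - q * r,
}

payoffs = [[4, 3, 6], [2, 7, 5], [8, 1, 3]]   # hypothetical payoff matrix
probs = [0.5, 0.25, 0.25]                      # hypothetical probabilities

results = {name: maximin(payoffs, probs, w) for name, w in criteria.items()}
for name, (best, indicators) in results.items():
    print(name, "-> A%d" % (best + 1), indicators)
```

On this particular made-up matrix all four criteria select strategy A1; in general, the criteria need not agree, since they weigh payoffs, risks, and probabilities differently.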

    For the minimax criterion (extreme pessimism), we denote the game function by S (a, r, q). It
should be non-increasing in the payoff a and non-decreasing in the risk r and the probability q of the
states of nature:

                                           S (a, r, q) ↓ by a; ↑ by r; ↑ by q                           (21)

   Then Sij = S (aij, rij, qj) are the indicators of the game. Strategy indicators are defined as follows:

                                        $S_i = \max_{1 \le j \le n} S_{ij},$                            (22)

   and the optimal strategy Ai0 is the one for which

                    $S_{i_0} = \min_{1 \le i \le m} S_i = \min_{1 \le i \le m} \max_{1 \le j \le n} S_{ij}.$   (23)



   By virtue of (23), the indicators Si are indicators of the non-optimality of the strategies Ai. The
game function S (a, r, q) should have properties (21) in view of (22) and (23).
   Let us present some minimax criteria with specific functions of the game S (a, r, q) satisfying
conditions (21):

                                                   S(a,r,q) = r;                                     (24)
                                                   S(a,r,q) = qr;                                    (25)
                                                   S(a,r,q) = r-a;                                   (26)
                                                   S(a,r,q) = qr-(1-q)a.                             (27)

   Criterion (24), in which the indicators of the game are risks, does not take into account either the
gains or the probabilities of the states of nature. This is the Savage criterion.
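
A minimal Python sketch of the Savage criterion (24) and of its probability-weighted variant (25) is given below. The function names and the example data are assumptions introduced for illustration, and the first minimal strategy is returned when there are ties.

```python
def savage(payoffs):
    """Criterion (24): S_ij = r_ij, S_i = max_j S_ij, choose the minimal S_i."""
    beta = [max(column) for column in zip(*payoffs)]
    worst_risk = [max(b - a for a, b in zip(row, beta)) for row in payoffs]
    best = min(range(len(worst_risk)), key=worst_risk.__getitem__)
    return best, worst_risk


def weighted_savage(payoffs, probs):
    """Criterion (25): the same scheme with S_ij = q_j * r_ij."""
    beta = [max(column) for column in zip(*payoffs)]
    worst = [max(q * (b - a) for a, b, q in zip(row, beta, probs))
             for row in payoffs]
    best = min(range(len(worst)), key=worst.__getitem__)
    return best, worst


payoffs = [[4, 3, 6], [2, 7, 5], [8, 1, 3]]   # hypothetical payoff matrix
probs = [0.5, 0.25, 0.25]                      # hypothetical probabilities

print(savage(payoffs))                  # (0, [4, 6, 6])
print(weighted_savage(payoffs, probs))  # (2, [2.0, 3.0, 1.5])
```

In this example the two criteria disagree: the unweighted criterion selects A1, while weighting the risks by the probabilities shifts the choice to A3, whose largest risk occurs in a less probable state.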
   Comparing the maximin and minimax criteria, we can say the following.
   Statement 1. The maximin criteria (19) and (20) are equivalent to the minimax criteria (26) and
(27), respectively.
   The first of these equivalences means that strategy Ai is optimal according to criterion (19) if and
only if it is optimal according to criterion (26). A similar explanation applies to the second equivalence.
   Proof. Let us first prove the equivalence (19) ⇔ (26). Since the game functions W and S of criteria
(19) and (26), respectively, satisfy the equality S = –W, the game indicators also satisfy
the analogous equality Sij = –Wij. Then

                    $S_i = \max_{1 \le j \le n} S_{ij} = \max_{1 \le j \le n} (-W_{ij}) = -\min_{1 \le j \le n} W_{ij} = -W_i,$   (28)

from where

                    $\min_{1 \le i \le m} S_i = \min_{1 \le i \le m} (-W_i) = -\max_{1 \le i \le m} W_i.$   (29)

   Thus, Si is minimal for the number i for which Wi is maximal, and the equivalence (19) ⇔ (26) is
proved. The equivalence (20) ⇔ (27) is proved in the same way, since the game functions of criteria
(20) and (27) also satisfy S = –W.
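
Statement 1 can also be checked numerically. The sketch below is a sanity check on randomly generated games, not a proof; the helper names, the random generator settings, and the ranges of the test data are all assumptions of this example.

```python
import random


def indicators(payoffs, probs, game_fn, inner):
    """Strategy indicators: inner (min or max) over j of game_fn(a_ij, r_ij, q_j)."""
    beta = [max(column) for column in zip(*payoffs)]
    return [inner(game_fn(a, b - a, q) for a, b, q in zip(row, beta, probs))
            for row in payoffs]


def optimal_set(values, ext):
    """Indices of all strategies whose indicator equals the extremum."""
    target = ext(values)
    return {i for i, v in enumerate(values) if v == target}


pairs = [
    # criteria (19) and (26)
    (lambda a, r, q: a - r, lambda a, r, q: r - a),
    # criteria (20) and (27)
    (lambda a, r, q: (1 - q) * a - q * r, lambda a, r, q: q * r - (1 - q) * a),
]

rng = random.Random(0)
mismatches = 0
for _ in range(1000):
    m, n = rng.randint(2, 5), rng.randint(2, 5)
    payoffs = [[rng.randint(0, 20) for _ in range(n)] for _ in range(m)]
    weights = [rng.random() + 0.01 for _ in range(n)]
    probs = [w / sum(weights) for w in weights]
    for w_fn, s_fn in pairs:
        w_ind = indicators(payoffs, probs, w_fn, min)   # maximin indicators W_i
        s_ind = indicators(payoffs, probs, s_fn, max)   # minimax indicators S_i
        if optimal_set(w_ind, max) != optimal_set(s_ind, min):
            mismatches += 1

print(mismatches)  # 0
```

Comparing the full sets of optimal strategies, rather than a single index, avoids spurious differences caused by tie-breaking.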

    In case of maximax criteria (extreme optimism), the game function, which we denote by M (a, r,
q), should not decrease with respect to the payoff a and the probability q of states of nature and not
increase with respect to the risk r:

                                      M (a, r, q) ↑ by a; ↓ by r; ↑ by q.                            (30)

   The indicators of the game are Mij = M (aij, rij, qj). The optimality indicators of the strategies are

                                        $M_i = \max_{1 \le j \le n} M_{ij}.$                         (31)

   An optimal strategy is a strategy Ai0 for which

                    $M_{i_0} = \max_{1 \le i \le m} M_i = \max_{1 \le i \le m} \max_{1 \le j \le n} M_{ij}.$   (32)
   The maximax criteria are criteria of extreme optimism, since they assume that nature will be in the
most favorable state for player A. Therefore, the optimal strategy is the one whose
maximum indicator of the game (the optimality indicator) is the largest among the maximum
indicators of all strategies.
   As maximax criteria with specific functions of the game M (a, r, q) possessing properties (30), we
can take, for example, the following:

                                                   M(a, r, q) = а;                                    (33)
                                                   M(a, r, q) = qa;                                   (34)
                                                   M(a, r, q) = a-r;                                  (35)
                                                   M(a, r, q) =qa-(1-q)r.                             (36)

   In criterion (33), the indicators of the game are winnings Mij = aij.

    The game function in the case of minimin criteria (extreme optimism), which we denote by E
(a, r, q), is chosen to be non-increasing in the payoff a and in the probability q of states of
nature, and non-decreasing in the risk r:

                                              E (a, r, q) ↓ by a; ↑ by r; ↓ by q.                     (37)

   As non-optimality indicators of the strategies Ai, we take

                                        $E_i = \min_{1 \le j \le n} E_{ij},$                          (38)

where Eij = E (aij, rij, qj) are the indicators of the game.
  The optimal strategy is the strategy Ai0 that minimizes the non-optimality indicator Ei:

                    $E_{i_0} = \min_{1 \le i \le m} E_i = \min_{1 \le i \le m} \min_{1 \le j \le n} E_{ij}.$   (39)

    Minimin criteria are also criteria of extreme optimism, since an optimal strategy is understood as
a strategy whose non-optimality indicator is the smallest among the non-optimality indicators of all
strategies.
    Examples of minimin criteria with functions of the game E (a, r, q) with properties (37) can be:

                                                   E(a, r, q) = r;                                    (40)
                                                   E(a, r, q) = (1–q)r;                               (41)
                                                   E(a, r, q) = r –a;                                 (42)
                                                   E(a, r, q) = (1–q)r –qa.                           (43)

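    The indicators of the game in criterion (40) are risks, and thus it turns into a minimin criterion for
risks.
    Statement 2. The maximax criteria (35) and (36) are equivalent to the minimin criteria (42) and
(43), respectively.
    The proof is similar to that of Statement 1. Namely, for criteria (35) and (42) we have E = –M and,
therefore, Eij = –Mij, whence

                    $E_i = \min_{1 \le j \le n} E_{ij} = \min_{1 \le j \le n} (-M_{ij}) = -\max_{1 \le j \le n} M_{ij} = -M_i,$   (44)
therefore

                    $\min_{1 \le i \le m} E_i = \min_{1 \le i \le m} (-M_i) = -\max_{1 \le i \le m} M_i,$   (45)

and the equivalence (35) ⇔ (42) is proved. The equivalence (36) ⇔ (43) is proved in the same way.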
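
The maximax and minimin indicators, and the relation E = –M behind Statement 2, can be illustrated on a small example. In the Python sketch below, the payoff matrix, the probabilities, and the helper name `indicators` are assumptions made for illustration.

```python
def indicators(payoffs, probs, game_fn, inner):
    """Strategy indicators: inner (min or max) over j of game_fn(a_ij, r_ij, q_j)."""
    beta = [max(column) for column in zip(*payoffs)]
    return [inner(game_fn(a, b - a, q) for a, b, q in zip(row, beta, probs))
            for row in payoffs]


payoffs = [[4, 3, 6], [2, 7, 5], [8, 1, 3]]   # hypothetical payoff matrix
probs = [0.5, 0.25, 0.25]                      # hypothetical probabilities

# Maximax indicators M_i for (35), (36) and minimin indicators E_i for (42), (43).
m35 = indicators(payoffs, probs, lambda a, r, q: a - r, max)
e42 = indicators(payoffs, probs, lambda a, r, q: r - a, min)
m36 = indicators(payoffs, probs, lambda a, r, q: q * a - (1 - q) * r, max)
e43 = indicators(payoffs, probs, lambda a, r, q: (1 - q) * r - q * a, min)

print(m35, e42)  # [6, 7, 8] [-6, -7, -8]
print(m36, e43)  # [1.5, 1.75, 4.0] [-1.5, -1.75, -4.0]

# The maximal M_i and the minimal E_i are attained by the same strategy (A3).
print(m35.index(max(m35)), e42.index(min(e42)))  # 2 2
print(m36.index(max(m36)), e43.index(min(e43)))  # 2 2

# Criterion (40), E = r: the minimin indicator of each strategy.
e40 = indicators(payoffs, probs, lambda a, r, q: r, min)
print(e40)  # [0, 0, 0]
```

The last line shows a property of criterion (40) worth keeping in mind: any strategy that is best for at least one state of nature has a zero risk there, so in this example all three strategies tie and the criterion does not discriminate between them.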

    To make conditions (13), (21), (30), and (37) easier to compare, the non-increasing (↓) or
non-decreasing (↑) behavior of the game functions with respect to the payoffs a, the risks r, and the
probabilities q of states of nature is summarized in Table 4.

Table 4
Comparison of methods
   Arguments                                    Game functions and criteria
 of game functions    W(a, r, q)               S(a, r, q)         M(a, r, q)              E(a, r, q)
                      max min                  min max            max max                 min min
        a                ↑                        ↓                   ↑                      ↓
        r                ↓                        ↑                   ↓                      ↑
        q                ↓                        ↑                   ↑                      ↓

   It can be seen from this table that the arrows in the first row, which denote the behavior of the game
functions depending on the winnings a, correspond to the first operator in the name of the criterion:
max - ↑, min - ↓, max - ↑, min - ↓. The arrows in the second row, which indicate the behavior of the
game functions depending on the risks r, are opposite to the arrows in the first row.
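
The monotonicity pattern of Table 4 can be reproduced numerically for the game functions (20), (27), (36), and (43). The Python sketch below compares function values on a small grid; the grid values and the helper name `direction` are assumptions of this example, and a grid check is only an illustration, not a proof.

```python
from itertools import product


def direction(fn, which, grids):
    """Classify fn as 'up' (non-decreasing) or 'down' (non-increasing)
    in argument number `which`, using all pairs of points on the grid
    that differ only in that argument."""
    up = down = True
    for point in product(*grids):
        for value in grids[which]:
            if value <= point[which]:
                continue
            moved = list(point)
            moved[which] = value          # increase only one argument
            delta = fn(*moved) - fn(*point)
            up = up and delta >= 0
            down = down and delta <= 0
    if up and not down:
        return "up"
    if down and not up:
        return "down"
    return "constant" if up else "mixed"


# Grid of payoffs a, risks r, and probabilities q (chosen arbitrarily).
grids = ([0, 1, 2, 4], [0, 1, 2, 4], [0.0, 0.25, 0.5, 0.75, 1.0])

functions = {
    "W": lambda a, r, q: (1 - q) * a - q * r,   # (20), max min
    "S": lambda a, r, q: q * r - (1 - q) * a,   # (27), min max
    "M": lambda a, r, q: q * a - (1 - q) * r,   # (36), max max
    "E": lambda a, r, q: (1 - q) * r - q * a,   # (43), min min
}

table = {name: [direction(fn, k, grids) for k in range(3)]
         for name, fn in functions.items()}
for name, row in table.items():
    print(name, row)   # directions with respect to a, r, q
```

Each printed row lists the directions with respect to a, r, and q, so the columns of Table 4 appear here as rows.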

4. Conclusion
   In this paper, the theory of games and its types were described, as well as the concept of games with
nature. Methods of decision-making in conditions of uncertainty were described,
such as the Wald criterion, the optimism criterion, the pessimism criterion, and the Savage criterion. For
each criterion, its distinguishing feature was described.

5. References
[1] S. Jin-yu, F. Zhi-geng: Maximum entropy grey game model between human and nature.
    Proceedings of 2013 IEEE International Conference on Grey systems and Intelligent Services
    (GSIS) (2013) 436-439. doi: 10.1109/GSIS.2013.6714821.
[2] H. Shah, M. Gopal: Markov game based control: Worst case design strategies for games against
    nature. 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems
    (2010) 339-343. doi: 10.1109/ICICISYS.2010.5658687.
[3] S. Laskowski: Criteria of Choosing Strategy in Games Against Nature. EUROCON 2007 - The
    International Conference on "Computer as a Tool" (2007) 2323-2328, doi:
    10.1109/EURCON.2007.4400384.
[4] C. Grappiolo, et al.: Towards Player Adaptivity in a Serious Game for Conflict Resolution.
     2011 Third International Conference on Games and Virtual Worlds for Serious Applications.
    (2011) 192-198. doi: 10.1109/VS-GAMES.2011.39.
[5] D. Chumachenko, I. Meniailov, K. Bazilevych, T. Chumachenko: On Intelligent Decision
    Making in Multiagent Systems in Conditions of Uncertainty. 2019 11th International Scientific
    and Practical Conference on Electronics and Information Technologies, ELIT 2019 –
    Proceedings. (2019) 150-153. doi: 10.1109/ELIT.2019.8892307
[6] P. Piletskiy, et al.: Development and Analysis of Intelligent Recommendation System Using
    Machine Learning Approach. Advances in Intelligent Systems and Computing 1113 (2020) 186-
    197. doi: 10.1007/978-3-030-37618-5_17
[7] N. Dotsenko, et al.: Project-oriented management of adaptive teams' formation resources in
     multi-project environment. CEUR Workshop Proceedings 2353 (2019) 911-923.
[8] N. Dotsenko, et al.: Modeling of the processes of stakeholder involvement in command
     management in a multi-project environment. 2018 IEEE 13th International Scientific and
     Technical Conference on Computer Sciences and Information Technologies, CSIT 2018 –
     Proceedings. 1 (2018) 29-32. doi: 10.1109/STC-CSIT.2018.8526613
[9] S. M. Lucas: Game AI Research with Fast Planet Wars Variants. 2018 IEEE Conference on
     Computational Intelligence and Games (CIG) (2018) 1-4, doi: 10.1109/CIG.2018.8490377.
[10] F. Yu, F. Chengcheng, S. Yuqiang: Data Analysis between Numerical Simulation and High
     Frequency Ground Wave Radar during a Gale Weather Process. 2019 International Conference
     on Meteorology Observations (ICMO) (2019) 1-4. doi: 10.1109/ICMO49322.2019.9025979.
[11] F. Xue, G. Yan, X. Zhou, S. Xu: Evolutionary Game Analysis of Green Building Promotion
     Mechanism Based on SD. 2019 International Conference on Economic Management and Model
     Engineering (ICEMME) (2019) 356-359. doi: 10.1109/ICEMME49371.2019.00077.
[12] M. Mazorchuck, et al.: Web-application development for tasks of prediction in medical domain.
     2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and
     Information Technologies, CSIT 2018 – Proceedings. 1 (2018) 5-8. doi: 10.1109/STC-
     CSIT.2018.8526684
[13] D. Chumachenko, O. Sokolov, S. Yakovlev: Fuzzy recurrent mappings in multiagent simulation
     of population dynamics systems. International Journal of Computing 19 (2) (2020) 290-297.
[14] L. Huang, H. Chen: Simulation and Analysis of Centralized Bidding Market Clearing Method
     Based on Intelligent Algorithm. 2019 IEEE Innovative Smart Grid Technologies - Asia (ISGT
     Asia) (2019) 2963-2967. doi: 10.1109/ISGT-Asia.2019.8881105.
[15] F. Wang, X. Feng, Lu Tang: Microeconomic Modeling and Simulation of Exchange Rate with
     Heterogeneous Strategies. 2007 International Conference on Machine Learning and Cybernetics.
     (2007) 2351-2356. doi: 10.1109/ICMLC.2007.4370538.
[16] K. Bazilevych, et al.: Stochastic modelling of cash flow for personal insurance fund using the
     cloud data storage. International Journal of Computing 17 (3) (2018) 153-162.
[17] B. Pittl, W. Mach, E. Schikuta: CloudTax: A CloudSim-Extension for Simulating Tax Systems
     on Cloud Markets. 2016 IEEE International Conference on Cloud Computing Technology and
     Science (CloudCom) (2016) 35-42. doi: 10.1109/CloudCom.2016.0021.
[18] D. Chumachenko, et al.: On-Line Data Processing, Simulation and Forecasting of the
     Coronavirus Disease (COVID-19) Propagation in Ukraine Based on Machine Learning
     Approach. Communications in Computer and Information Science 1158 (2020) 372-382. doi:
     10.1007/978-3-030-61656-4_25
[19] S. Yakovlev, et al.: The concept of developing a decision support system for the epidemic
     morbidity control. CEUR Workshop Proceedings 2753 (2020) 265-274.
[20] J. Tomalá-Gonzáles, et al.: Serious Games: Review of methodologies and Games engines for
     their development. 2020 15th Iberian Conference on Information Systems and Technologies
     (CISTI) (2020) 1-6. doi: 10.23919/CISTI49556.2020.9140827.
[21] A. A. Rafik: Decision making theory with imprecise probabilities. 2009 Fifth International
     Conference on Soft Computing, Computing with Words and Perceptions in System Analysis,
     Decision and Control (2009) 1-1. doi: 10.1109/ICSCCW.2009.5379425.
[22] A. Alothman, A. Alqahtani: Analyzing Competitive Firms In An Oligopoly Market Structure
     Using Game Theory. 2020 Industrial & Systems Engineering Conference (ISEC). (2020) 1-5.
     doi: 10.1109/ISEC49495.2020.9230335.
[23] Y. Pan: Optimization of Investment, Consumption and Proportional Reinsurance with Model
     Uncertainty. 2020 Chinese Control And Decision Conference (CCDC) (2020) 826-831. doi:
     10.1109/CCDC49329.2020.9164859.
[24] J. Liu, M. Gao, J. Zheng, J. Wang: Model-Based Wald Test for Adaptive Range-Spread Target
     Detection. IEEE Access 8 (2020) 73259-73267. doi: 10.1109/ACCESS.2020.2988066.