=Paper=
{{Paper
|id=Vol-3003/short7
|storemode=property
|title=Intelligent Decision-Making in Conditions of Uncertainty in Games with Nature
|pdfUrl=https://ceur-ws.org/Vol-3003/short7.pdf
|volume=Vol-3003
|authors=Pavlo Pyrohov,Ievgen Meniailov
|dblpUrl=https://dblp.org/rec/conf/profitai/PyrohovM21
}}
==Intelligent Decision-Making in Conditions of Uncertainty in Games with Nature==
Pavlo Pyrohov, Ievgen Meniailov

National Aerospace University “Kharkiv Aviation Institute”, Chkalow str., 17, Kharkiv, Ukraine

International Workshop of IT-professionals on Artificial Intelligence (ProfIT AI 2021), September 20-21, 2021, Kharkiv, Ukraine
EMAIL: pavel.vogorip@gmail.com (P. Pyrohov); evgenii.menyailov@gmail.com (I. Meniailov)
ORCID: 0000-0002-6100-4406 (P. Pyrohov); 0000-0002-9440-8378 (I. Meniailov)

Abstract
This paper considers games with nature. The theory of games with nature and its basic concepts are described. Artificial intelligence methods are applied to develop methods of decision-making under uncertainty. The Wald criterion, the optimism criterion, the pessimism criterion and the Savage criterion are described, and the developed methods are compared. The direction in which each game function varies with the payoff a corresponds to the first word in the name of the criterion.

Keywords
Game with nature, artificial intelligence, decision-making, conditions of uncertainty.

1. Introduction

Games with nature are mathematical models in which the choice of a decision depends on objective reality, for example, customer demand, the state of nature, etc. [1-2]. “Nature” is a generalized concept of an adversary who does not pursue his own goals in a given conflict [3]. To apply this theory, it is necessary to be able to represent conflicts in the form of games [4]. A characteristic feature of any conflict is that none of the parties involved knows in advance exactly and completely all of its own possible decisions, or those of the other parties and their future behavior; therefore, each party is forced to act under uncertainty [5]. The uncertainty of the outcome can be due both to the conscious actions of active opponents and to unconscious, passive manifestations, for example, of the elemental forces of nature: rain, sun, wind, avalanches, etc. In such cases, an accurate prediction of the outcome is impossible [6].

In some conflicts, the opposing side is a consciously and purposefully acting active adversary who is interested in our defeat, deliberately prevents success, and strives for victory by any means [7-8]. In other conflicts there is no such conscious enemy, and only the so-called "blind forces of nature" operate [9]: weather conditions [10], the state of trade equipment at an enterprise [11], illness of employees [12], instability of the economic situation [13], market conditions [14], the dynamics of exchange rates [15], the level of inflation [16], tax policy [17], changing purchasing demand, etc. During the global COVID-19 pandemic [18], games with nature can be used for decision-making on prevention and control measures that suppress the epidemic dynamics [19]. In such cases nature is not malicious and acts passively, sometimes to the detriment of man and sometimes to his benefit, but its state and manifestations can significantly affect the result of the activity [20].

In such games, a person tries to act prudently, for example, by using a strategy that minimizes the loss. The second player (nature) acts unintentionally, completely at random, but its possible strategies (the states of nature) are known. Such situations are investigated using the theory of statistical decisions [21]. There may also be situations in which nature really acts as a player, for example, circumstances associated with weather conditions or with natural elemental forces. Man's play with nature also reflects a conflict situation that arises when interests clash in choosing a solution.
But “the elemental forces of nature” cannot be attributed to reasonable actions directed against a person, still less to any “malicious intent” [22]. Thus, it is more correct to speak of a conflict situation caused by a clash of human interests with the uncertainty of nature's actions, but without an obvious antagonistic coloration [23]. Situations in which the risk is associated not with the conscious opposition of the opposite side (the environment) but with the decision-maker's insufficient awareness of its behavior or state are investigated using the theory of statistical decisions.

The aim of this research is to investigate methods of decision-making under uncertainty in games with nature.

2. Materials and methods

The matrix of a game with nature is

$A = \|a_{ij}\|,$ (1)

where $a_{ij}$ is the payoff of player 1 when he uses his pure strategy i and player 2 (nature) uses its pure strategy j (i = 1, ..., m; j = 1, ..., n). All possible states P_1, P_2, ..., P_n of nature P are considered; nature assumes one of them at random, regardless of the actions of player A and without malicious opposition to his strategies. Nature can be in only one of these states, but in which one is unknown, although in some cases the probabilities of these states may be known:

$q_j = P(P_j) \ge 0, \quad j = 1, \dots, n,$ (2)

$\sum_{j=1}^{n} q_j = 1.$ (3)

The possible strategies A_1, A_2, ..., A_m of player A and his payoffs $a_{ij} \ge 0$ for each strategy and each state of nature P_j are also known. These payoffs can be shown in the form of a payoff matrix (Table 1).

Table 1
Payoff matrix

        P_1    P_2    ...    P_n
A_1     a_11   a_12   ...    a_1n
A_2     a_21   a_22   ...    a_2n
...     ...    ...    ...    ...
A_m     a_m1   a_m2   ...    a_mn
q_j     q_1    q_2    ...    q_n

The bottom row of the matrix shows the probabilities q_j of the states of nature P_j, j = 1, ..., n.

Suppose that player A, not knowing the state of nature, chose strategy A_i. If nature assumes the state P_j, the payoff of player A will be $a_{ij}$. But if player A knew in advance that nature would take the state P_j, he would choose the strategy $A_{i_0}$ that achieves the greatest payoff in that column:

$\beta_j = a_{i_0 j} = \max_{1 \le i \le m} a_{ij}.$ (4)

The difference

$r_{ij} = \beta_j - a_{ij}$ (5)

between the payoff $\beta_j$ of player A under the known state of nature P_j and the payoff $a_{ij}$ when player A does not know the state of nature is called the risk under the strategy A_i and the state of nature P_j. Thus, the risk $r_{ij}$ is that part of the greatest payoff $\beta_j$ in the state of nature P_j which player A fails to win by applying strategy A_i in ignorance of the state of nature.

Table 2
Risk matrix

        P_1    P_2    ...    P_n
A_1     r_11   r_12   ...    r_1n
A_2     r_21   r_22   ...    r_2n
...     ...    ...    ...    ...
A_m     r_m1   r_m2   ...    r_mn
q_j     q_1    q_2    ...    q_n

The last row again shows the probabilities q_j of the states of nature, j = 1, ..., n. Since $0 \le a_{ij} \le \beta_j$ (the right inequality follows from (4)), we obtain from (5) that $0 \le r_{ij} \le \beta_j$. The probability q_j of the state of nature P_j is obviously the probability of the payoff $a_{ij}$ and of the risk $r_{ij}$ for each strategy A_i, i = 1, ..., m. Therefore, each strategy A_i can be interpreted as a discrete random variable taking values equal to the payoffs $a_{i1}, \dots, a_{in}$ or to the risks $r_{i1}, \dots, r_{in}$ with the corresponding probabilities $q_1, \dots, q_n$. Player A's task is to choose the optimal strategy among the possible strategies A_1, ..., A_m.
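As an illustration of the setup above, here is a minimal Python sketch (assuming NumPy) of constructing the risk matrix (5) from a payoff matrix via (4); the payoff values are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Payoff matrix a[i][j]: rows = strategies A_i, columns = states of nature P_j.
# Illustrative values only.
a = np.array([
    [5.0, 2.0, 8.0],
    [3.0, 6.0, 4.0],
    [7.0, 1.0, 5.0],
])

# beta_j = max_i a_ij: the best payoff attainable if state P_j were known, cf. (4).
beta = a.max(axis=0)

# r_ij = beta_j - a_ij: the part of beta_j lost through ignorance of the state, cf. (5).
r = beta - a

print("beta_j:", beta)          # column maxima
print("risk matrix r_ij:\n", r)
```

Every entry of r is non-negative and bounded by the corresponding beta_j, matching the inequality $0 \le r_{ij} \le \beta_j$ derived above.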
The optimality of a strategy is understood in various senses and is assessed according to various criteria. The result of the game generally depends on three numerical parameters: the payoffs a of player A, the risks r that arise when player A chooses a particular strategy, and the probabilities q of the states of nature. The desire to “fold” these three parameters into one indicator leads to a numerical function of these three parameters, which we denote G(a, r, q) and call the game function. The manner in which the game function G depends on a, r and q is motivated by the logic of the applied criterion. The values

$G_{ij} = G(a_{ij}, r_{ij}, q_j)$ (6)

of the game function will be called the indicators of the game. These indicators form the matrix of the game (Table 3).

Table 3
Game matrix

        P_1    P_2    ...    P_n
A_1     G_11   G_12   ...    G_1n
A_2     G_21   G_22   ...    G_2n
...     ...    ...    ...    ...
A_m     G_m1   G_m2   ...    G_mn

A criterion of vector argument assumes the assignment of some numerical function

$\Phi(x_1, \dots, x_n),$ (7)

whose value

$G_i = \Phi(G_{i1}, \dots, G_{in})$ (8)

will be called the indicator of the strategy A_i. Then, among the indicators G_i of the strategies A_i, an extreme one is selected; for some criteria this is the maximum value (Ext = max), for others the minimum (Ext = min). If Ext = max, then G_i is called the optimality indicator of the strategy A_i; if Ext = min, then G_i is called the non-optimality indicator of the strategy A_i. The optimal strategy $A_{i_0}$ is the one for which

$G_{i_0} = \operatorname{Ext}_{1 \le i \le m} G_i.$ (9)

Applying the described scheme, we will form some classes of criteria.

3. Results

For maximin criteria (extreme pessimism),

$\operatorname{Ext} = \max,$ (10)

and the indicators of the strategies A_i are determined as

$G_i = \min_{1 \le j \le n} G_{ij};$ (11)

by (10), they are indicators of the optimality of the strategies. Thus, G_i is the worst indicator of the game under the strategy A_i. Hence it follows that the game function G(a, r, q) should be non-decreasing in the payoff a and non-increasing in the risk r. The indicators of the game are also influenced by the probabilities q of the states of nature. For example, if the worst (smallest) payoff $a_{ij}$ under strategy A_i has a sufficiently small probability q_j, then it is no longer advisable to treat it as the smallest; for this payoff to remain practically the smallest, it should have a sufficiently high probability. With risks the opposite is true: for the worst (greatest) risk $r_{ij}$ under strategy A_i to remain practically the greatest, its probability should also be large enough. This suggests that the game function should not increase in the probability q. So the logic of the maximin criterion determines the behavior of the game function with respect to the payoff a, the risk r and the probability q:

G(a, r, q): ↑ in a; ↓ in r; ↓ in q, (12)

where ↑ denotes non-decreasing and ↓ non-increasing dependence. For convenience, in what follows we denote, for the maximin criterion, the game function G by W, the indicators of the game $G_{ij}$ by $W_{ij}$, and the optimality indicators $G_i$ of the strategies A_i by $W_i$. Thus, for the maximin criterion the game function satisfies

W(a, r, q): ↑ in a; ↓ in r; ↓ in q. (13)

The indicators of the game are

$W_{ij} = W(a_{ij}, r_{ij}, q_j),$ (14)

the optimality indicators of the strategies are

$W_i = \min_{1 \le j \le n} W_{ij},$ (15)

and optimal according to the maximin criterion is the strategy $A_{i_0}$ for which

$W_{i_0} = \max_{1 \le i \le m} W_i = \max_{1 \le i \le m} \min_{1 \le j \le n} W_{ij}.$ (16)

The maximin criterion is a criterion of extreme pessimism of the person choosing a strategy, since it orients him to the worst possible manifestation of the state of nature and, as a consequence, to very cautious behavior in decision-making. The specific game function W(a, r, q) can be chosen in different ways, subject to the indispensable requirement of possessing properties (13).
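As a sketch of the maximin scheme (14)-(16), the following Python fragment uses the game function of criterion (20) introduced below, which possesses properties (13); the payoff matrix and state probabilities are illustrative assumptions, not data from the paper.

```python
import numpy as np

a = np.array([[5.0, 2.0, 8.0],
              [3.0, 6.0, 4.0],
              [7.0, 1.0, 5.0]])          # payoffs a_ij (illustrative)
q = np.array([0.5, 0.3, 0.2])            # probabilities q_j of states P_j (illustrative)
r = a.max(axis=0) - a                    # risk matrix, cf. (5)

def W(a, r, q):
    # Game function of criterion (20): non-decreasing in a,
    # non-increasing in r and in q, i.e. it has properties (13).
    return (1.0 - q) * a - q * r

Wij = W(a, r, q)                         # indicators of the game, cf. (14)
Wi = Wij.min(axis=1)                     # optimality indicators, cf. (15)
i0 = int(Wi.argmax())                    # maximin-optimal strategy, cf. (16)
print("W_i =", Wi, "-> optimal strategy A%d" % (i0 + 1))
```

Replacing the body of W with any other function satisfying (13), e.g. W(a, r, q) = a for Wald's criterion (17), changes only the game indicators; the min/argmax selection stays the same.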
Examples of maximin criteria with specific game functions W(a, r, q) are the following:

$W(a, r, q) = a;$ (17)

$W(a, r, q) = (1 - q)a;$ (18)

$W(a, r, q) = a - r;$ (19)

$W(a, r, q) = (1 - q)a - qr.$ (20)

That each of these functions possesses properties (13) can be checked by the signs of the partial derivatives. In criterion (17) the indicators of the game are the payoffs, $W_{ij} = a_{ij}$, so it takes into account neither the risks nor the probabilities of the states of nature. Criterion (17) is Wald's criterion, which justifies the choice of a decision under complete uncertainty, when the probabilities of the states of nature are unknown [24]. Criterion (18) takes into account the payoffs and the probabilities of the states of nature, but not the risks. Criterion (19) takes into account the payoffs and the risks without considering the probabilities of the states of nature. Criterion (20) takes into account the payoffs, the risks and the probabilities of the states of nature.

For minimax criteria (extreme pessimism), we denote the game function by S(a, r, q). It should be non-increasing in the payoff a and non-decreasing in the risk r and the probability q of the states of nature:

S(a, r, q): ↓ in a; ↑ in r; ↑ in q. (21)

Then $S_{ij} = S(a_{ij}, r_{ij}, q_j)$ are the indicators of the game. The indicators of the strategies are defined as

$S_i = \max_{1 \le j \le n} S_{ij},$ (22)

and the optimal strategy $A_{i_0}$ is the one for which

$S_{i_0} = \min_{1 \le i \le m} S_i.$ (23)

By virtue of (23), the indicators $S_i$ are indicators of the non-optimality of the strategies A_i. The game function S(a, r, q) must have properties (21) in view of (22) and (23). Some minimax criteria with specific game functions S(a, r, q) satisfying conditions (21) are:

$S(a, r, q) = r;$ (24)

$S(a, r, q) = qr;$ (25)

$S(a, r, q) = r - a;$ (26)

$S(a, r, q) = qr - (1 - q)a.$ (27)

Criterion (24), in which the indicators of the game are the risks, takes into account neither the payoffs nor the probabilities of the states of nature. This is the Savage criterion.

Comparing the maximin and minimax criteria, we can state the following.

Statement 1. The maximin criteria (19) and (20) are equivalent to the minimax criteria (26) and (27), respectively.

The first of these equivalences means that a strategy A_i is optimal according to criterion (19) if and only if it is optimal according to criterion (26); a similar statement holds for the second equivalence.

Proof. Let us first prove the equivalence (19) ⇔ (26). Since the game functions W and S of criteria (19) and (26) satisfy the equality S = -W, the game indicators satisfy the analogous equality $S_{ij} = -W_{ij}$. Then

$S_i = \max_{1 \le j \le n} S_{ij} = \max_{1 \le j \le n} (-W_{ij}) = -\min_{1 \le j \le n} W_{ij} = -W_i,$ (28)

whence

$\min_{1 \le i \le m} S_i = \min_{1 \le i \le m} (-W_i) = -\max_{1 \le i \le m} W_i.$ (29)

Thus, $S_i$ is minimal for the same index i for which $W_i$ is maximal, and the equivalence (19) ⇔ (26) is proved. The equivalence (20) ⇔ (27) is proved in the same way.
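A minimal sketch of the Savage criterion (24) in Python follows: the game indicators are the risks themselves, the strategy indicators are taken by (22), and the optimal strategy minimizes them by (23). The payoff values are illustrative assumptions.

```python
import numpy as np

a = np.array([[5.0, 2.0, 8.0],
              [3.0, 6.0, 4.0],
              [7.0, 1.0, 5.0]])          # payoffs a_ij (illustrative)
r = a.max(axis=0) - a                    # risk matrix, cf. (5)

Si = r.max(axis=1)                       # non-optimality indicators, cf. (22): worst risk per strategy
i0 = int(Si.argmin())                    # minimax-risk optimal strategy, cf. (23)
print("S_i =", Si, "-> optimal strategy A%d" % (i0 + 1))
```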
For maximax criteria (extreme optimism), the game function, which we denote by M(a, r, q), should be non-decreasing in the payoff a and in the probability q of the states of nature and non-increasing in the risk r:

M(a, r, q): ↑ in a; ↓ in r; ↑ in q. (30)

The indicators of the game are $M_{ij} = M(a_{ij}, r_{ij}, q_j)$, and the optimality indicators of the strategies are

$M_i = \max_{1 \le j \le n} M_{ij}.$ (31)

The optimal strategy is the strategy $A_{i_0}$ for which

$M_{i_0} = \max_{1 \le i \le m} M_i.$ (32)

The maximax criteria are criteria of extreme optimism, since they assume that nature will be in the state most favorable for player A; accordingly, the strategy chosen as optimal is the one whose maximum indicator of the game (its optimality indicator) is the greatest among the maximum indicators of all strategies. As maximax criteria with specific game functions M(a, r, q) possessing properties (30), we can take, for example:

$M(a, r, q) = a;$ (33)

$M(a, r, q) = qa;$ (34)

$M(a, r, q) = a - r;$ (35)

$M(a, r, q) = qa - (1 - q)r.$ (36)

In criterion (33) the indicators of the game are the payoffs, $M_{ij} = a_{ij}$.

For minimin criteria (extreme optimism), the game function, which we denote by E(a, r, q), is chosen non-increasing in the payoff a and in the probability q of the states of nature and non-decreasing in the risk r:

E(a, r, q): ↓ in a; ↑ in r; ↓ in q. (37)

As the non-optimality indicators of the strategies A_i we take

$E_i = \min_{1 \le j \le n} E_{ij},$ (38)

where $E_{ij} = E(a_{ij}, r_{ij}, q_j)$ are the indicators of the game. The optimal strategy is the strategy $A_{i_0}$ that minimizes the non-optimality indicator $E_i$:

$E_{i_0} = \min_{1 \le i \le m} E_i.$ (39)

Minimin criteria are also criteria of extreme optimism, since the optimal strategy is understood as the strategy whose non-optimality indicator is the minimum among the non-optimality indicators of all strategies. Examples of minimin criteria with game functions E(a, r, q) possessing properties (37) are:

$E(a, r, q) = r;$ (40)

$E(a, r, q) = (1 - q)r;$ (41)

$E(a, r, q) = r - a;$ (42)

$E(a, r, q) = (1 - q)r - qa.$ (43)

The indicators of the game in criterion (40) are the risks, so it turns into a minimin criterion for risks.

Statement 2. The maximax criteria (35) and (36) are equivalent to the minimin criteria (42) and (43), respectively.

The proof is similar to that of Statement 1: for criteria (35) and (42) we have E = -M and therefore $E_{ij} = -M_{ij}$, whence

$E_i = \min_{1 \le j \le n} E_{ij} = \min_{1 \le j \le n} (-M_{ij}) = -\max_{1 \le j \le n} M_{ij} = -M_i,$ (44)

and therefore

$\min_{1 \le i \le m} E_i = -\max_{1 \le i \le m} M_i,$ (45)

so the equivalence (35) ⇔ (42) is proved.

To visualize better the requirements (13), (21), (30) and (37) on the non-increasing or non-decreasing behavior of the game functions with respect to the payoffs a, the risks r and the probabilities q of the states of nature, we summarize them in Table 4.

Table 4
Comparison of methods

Argument   W(a, r, q)    S(a, r, q)    M(a, r, q)    E(a, r, q)
           (maximin)     (minimax)     (maximax)     (minimin)
a          ↑             ↓             ↑             ↓
r          ↓             ↑             ↓             ↑
q          ↓             ↑             ↑             ↓

It can be seen from this table that the arrows in the first row, which show the behavior of the game functions with respect to the payoffs a, correspond to the first word in the name of each criterion: max: ↑, min: ↓, max: ↑, min: ↓. The arrows in the second row, which show the behavior of the game functions with respect to the risks r, are opposite to the arrows in the first row.
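To make the comparison in Table 4 concrete, the following Python sketch applies the probability-free representatives of the four criterion families, (17), (24), (33) and (40), to one payoff matrix; the numbers are illustrative assumptions, not data from the paper.

```python
import numpy as np

a = np.array([[5.0, 2.0, 8.0],
              [3.0, 6.0, 4.0],
              [7.0, 1.0, 5.0]])          # payoffs a_ij (illustrative)
r = a.max(axis=0) - a                    # risk matrix, cf. (5)

choices = {
    "maximin / Wald (17)":   a.min(axis=1).argmax(),  # max_i min_j a_ij, cf. (15)-(16)
    "minimax / Savage (24)": r.max(axis=1).argmin(),  # min_i max_j r_ij, cf. (22)-(23)
    "maximax (33)":          a.max(axis=1).argmax(),  # max_i max_j a_ij, cf. (31)-(32)
    "minimin (40)":          r.min(axis=1).argmin(),  # min_i min_j r_ij, cf. (38)-(39)
}
for name, i in choices.items():
    print("%-22s -> A%d" % (name, i + 1))
```

The pessimistic criteria aggregate over the worst column for each strategy, the optimistic ones over the best, which is exactly the inner min/max pattern recorded in the names in Table 4.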
4. Conclusion

In this paper, the theory of games and its types were described, together with the concept of games with nature. Methods of decision-making under uncertainty were described: the Wald criterion, the optimism criterion, the pessimism criterion and the Savage criterion. For each criterion, its distinctive feature was described.

5. References

[1] S. Jin-yu, F. Zhi-geng: Maximum entropy grey game model between human and nature. Proceedings of 2013 IEEE International Conference on Grey Systems and Intelligent Services (GSIS) (2013) 436-439. doi: 10.1109/GSIS.2013.6714821.
[2] H. Shah, M. Gopal: Markov game based control: Worst case design strategies for games against nature. 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems (2010) 339-343. doi: 10.1109/ICICISYS.2010.5658687.
[3] S. Laskowski: Criteria of Choosing Strategy in Games Against Nature. EUROCON 2007 - The International Conference on "Computer as a Tool" (2007) 2323-2328. doi: 10.1109/EURCON.2007.4400384.
[4] C. Grappiolo, et al.: Towards Player Adaptivity in a Serious Game for Conflict Resolution. 2011 Third International Conference on Games and Virtual Worlds for Serious Applications (2011) 192-198. doi: 10.1109/VS-GAMES.2011.39.
[5] D. Chumachenko, I. Meniailov, K. Bazilevych, T. Chumachenko: On Intelligent Decision Making in Multiagent Systems in Conditions of Uncertainty. 2019 11th International Scientific and Practical Conference on Electronics and Information Technologies, ELIT 2019 - Proceedings (2019) 150-153. doi: 10.1109/ELIT.2019.8892307.
[6] P. Piletskiy, et al.: Development and Analysis of Intelligent Recommendation System Using Machine Learning Approach. Advances in Intelligent Systems and Computing 1113 (2020) 186-197. doi: 10.1007/978-3-030-37618-5_17.
[7] N. Dotsenko, et al.: Project-oriented management of adaptive teams' formation resources in multi-project environment. CEUR Workshop Proceedings 2353 (2019) 911-923.
[8] N. Dotsenko, et al.: Modeling of the processes of stakeholder involvement in command management in a multi-project environment. 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2018 - Proceedings 1 (2018) 29-32. doi: 10.1109/STC-CSIT.2018.8526613.
[9] S. M. Lucas: Game AI Research with Fast Planet Wars Variants. 2018 IEEE Conference on Computational Intelligence and Games (CIG) (2018) 1-4. doi: 10.1109/CIG.2018.8490377.
[10] F. Yu, F. Chengcheng, S. Yuqiang: Data Analysis between Numerical Simulation and High Frequency Ground Wave Radar during a Gale Weather Process. 2019 International Conference on Meteorology Observations (ICMO) (2019) 1-4. doi: 10.1109/ICMO49322.2019.9025979.
[11] F. Xue, G. Yan, X. Zhou, S. Xu: Evolutionary Game Analysis of Green Building Promotion Mechanism Based on SD. 2019 International Conference on Economic Management and Model Engineering (ICEMME) (2019) 356-359. doi: 10.1109/ICEMME49371.2019.00077.
[12] M. Mazorchuck, et al.: Web-application development for tasks of prediction in medical domain. 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2018 - Proceedings 1 (2018) 5-8. doi: 10.1109/STC-CSIT.2018.8526684.
[13] D. Chumachenko, O. Sokolov, S. Yakovlev: Fuzzy recurrent mappings in multiagent simulation of population dynamics systems. International Journal of Computing 19 (2) (2020) 290-297.
[14] L. Huang, H. Chen: Simulation and Analysis of Centralized Bidding Market Clearing Method Based on Intelligent Algorithm. 2019 IEEE Innovative Smart Grid Technologies - Asia (ISGT Asia) (2019) 2963-2967. doi: 10.1109/ISGT-Asia.2019.8881105.
[15] F. Wang, X. Feng, L. Tang: Microeconomic Modeling and Simulation of Exchange Rate with Heterogeneous Strategies. 2007 International Conference on Machine Learning and Cybernetics (2007) 2351-2356. doi: 10.1109/ICMLC.2007.4370538.
[16] K. Bazilevych, et al.: Stochastic modelling of cash flow for personal insurance fund using the cloud data storage. International Journal of Computing 17 (3) (2018) 153-162.
[17] B. Pittl, W. Mach, E. Schikuta: CloudTax: A CloudSim-Extension for Simulating Tax Systems on Cloud Markets. 2016 IEEE International Conference on Cloud Computing Technology and Science (CloudCom) (2016) 35-42. doi: 10.1109/CloudCom.2016.0021.
[18] D. Chumachenko, et al.: On-Line Data Processing, Simulation and Forecasting of the Coronavirus Disease (COVID-19) Propagation in Ukraine Based on Machine Learning Approach. Communications in Computer and Information Science 1158 (2020) 372-382. doi: 10.1007/978-3-030-61656-4_25.
[19] S. Yakovlev, et al.: The concept of developing a decision support system for the epidemic morbidity control. CEUR Workshop Proceedings 2753 (2020) 265-274.
[20] J. Tomalá-Gonzáles, et al.: Serious Games: Review of methodologies and Games engines for their development. 2020 15th Iberian Conference on Information Systems and Technologies (CISTI) (2020) 1-6. doi: 10.23919/CISTI49556.2020.9140827.
[21] A. A. Rafik: Decision making theory with imprecise probabilities. 2009 Fifth International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control (2009) 1-1. doi: 10.1109/ICSCCW.2009.5379425.
[22] A. Alothman, A. Alqahtani: Analyzing Competitive Firms In An Oligopoly Market Structure Using Game Theory. 2020 Industrial & Systems Engineering Conference (ISEC) (2020) 1-5. doi: 10.1109/ISEC49495.2020.9230335.
[23] Y. Pan: Optimization of Investment, Consumption and Proportional Reinsurance with Model Uncertainty. 2020 Chinese Control And Decision Conference (CCDC) (2020) 826-831. doi: 10.1109/CCDC49329.2020.9164859.
[24] J. Liu, M. Gao, J. Zheng, J. Wang: Model-Based Wald Test for Adaptive Range-Spread Target Detection. IEEE Access 8 (2020) 73259-73267. doi: 10.1109/ACCESS.2020.2988066.