Method “Mean – Risk” for Comparing Poly-Interval Objects in Intelligent Systems

Gennady Shepelev 1[0000-0002-1037-8977], Nina Khairova 2[0000-0002-9826-0286], Zoia Kochueva 2[0000-0002-4300-3370]

1 Institute for Systems Analysis of Federal Research Center “Computer Science and Control” of Russian Academy of Sciences, Moscow, Russia
2 National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine

gis@isa.ru, khairova@kpi.kharkov.ua, kochueva@kochuev.com

Abstract. Problems of comparing poly-interval alternatives under risk in the framework of intelligent computer systems are considered. Such problems are common in economics, engineering and other domains. The “mean-risk” approach is chosen as the comparison tool. A method is proposed for calculating both main indicators of the “mean-risk” approach, the mean and the semideviation, for poly-interval alternatives. The method makes it possible to calculate these indicators for interval alternatives represented as fuzzy objects and as generalized interval estimates.

Keywords: interval alternatives, risk estimating techniques, “mean-risk” approach, fuzzy poly-interval objects, generalized interval estimates

1. Introduction

Problems of comparing alternatives by their effectiveness in achieving given goals play an important role in the study and use of intelligent computer systems, an essential part of artificial intelligence research. Many practical tasks are analyzed under uncertainty, and their parameters, when measured on numerical scales, receive interval representations because of that uncertainty. In the future, after the relevant project is completed and the uncertainty is removed, the interval estimates will take definite one-numeric (point) values.

It is assumed throughout the paper that the interval estimates of task parameters contain all their possible point realizations, though with different chances. This assumption is controversial.
In mathematical statistics its weakness is overcome by specifying, together with the interval estimate, the chances, determined from sample data, that the estimate contains the unknown point value of the analyzed quantity. When task parameters are defined primarily on the basis of expert judgments that are not supported by statistics of the required volume, this difficulty is addressed by switching from mono-interval to poly-interval estimates.

The need for a poly-interval approach also arises because in some cases it is difficult for an expert to express her or his knowledge about the analyzed parameter through a single interval estimate: an excessively wide estimate reduces the usefulness of expert knowledge, and a narrow interval often leads to prediction errors. It is therefore advisable not to limit the expert to mono-interval estimates, but to allow the expert to describe the task parameters with a set of intervals that characterizes the uncertainty of expert knowledge about the length and position of the estimate interval of each characteristic of the task.

If, as is common in practice, the available solution options (alternatives) must be compared by their effectiveness in achieving the goals set, some resulting indicators of the problem can, for substantive reasons, be treated as quality indicators, i.e. as comparison criteria. Alternatives with interval quality indicators are called interval alternatives (IAs) in this paper. Comparing IAs carries an inherent risk of making the wrong decision when choosing an effective object.
Indeed, even for two compared IAs whose interval estimates of the quality indicators have a non-empty intersection, either alternative may turn out to be the effective one in the future, although with different degrees of confidence in this statement. Thus, the tasks of comparing IAs by effectiveness are at least two-criterial: along with the criterion associated with the quality indicator, which evaluates the alternative by its preference, the indicator characterizing the risk of making the wrong decision about the "best" alternative must be taken into account on an equal basis.

There are risks of different nature in comparing IAs by effectiveness. First, there is the risk of incurring losses, or of obtaining a real outcome that differs from the desired predicted result. Second, there is the risk that an IA that appears effective within the presented set at the time of comparison will not actually be effective once the uncertainty is removed, and some other alternative will be effective instead. Both the preference of an IA and the risk of the decision about preference are most commonly evaluated with the tools of distribution functions, as in probability theory.

At present there are two main approaches to evaluating the measures of preference and risk. In the first, the “mean-risk” method [1], the compared alternatives are treated as isolated, non-interacting objects. The criterion of preference in this method is the mathematical expectation of a random variable defined on the interval estimate of the quality indicator. Indicators such as the variance, the left and right semi-variance, the mean semi-deviation and others can serve as the risk criterion.
In the second, the method of “collective risk estimating” [2], IAs are treated as interacting objects that form a system of compared objects. Here the calculated risks depend on the number of compared objects: the value of risk increases with the number of objects.

At present, no approach or method for choosing the best interval alternatives is superior to the others in terms of the quality of its recommendations. Each has its advantages and disadvantages, and their different interpretations of risk and methods of calculating its indicators complement each other. While the problem of comparing mono-intervals can be considered resolved to a certain extent [3-5], this is not so for poly-intervals.

The purpose of this paper is to extend the “mean-risk” method to the case of poly-interval estimates, as the first stage of studying the problem of comparing IAs. This is essential because the information needed for comparison can be obtained within both the mono-interval and the poly-interval pictures, and for comparability the results should be obtained by the same method. A study of the method of collective risk estimating for this case will be carried out later.

2. The main features of the method "mean-risk"

Deviations of the quality indicator from the desired value toward the better side are difficult to associate with risk. Therefore, starting from the long-standing paper [6], risk is associated with indicators that predict the possibility of deviations for the worse (downside risk measures), i.e. the risk of incurring losses. This concept was developed in [1, 7], and afterwards, as papers [8, 9] showed, risk measures of losses became the de facto standard in risk management.
The mean semi-deviation SI [10, 11], which gives the average deviation of a random variable defined on the interval estimate [L, R] from its mathematical expectation E, now occupies the central place among these risk measures.

It is customary to distinguish between the left SIl and the right SIr semi-deviation indicators. If f(x) is the distribution density of a random variable X, then, by definition,

$$ S_{Il} = \int_L^E (E - x) f(x)\,dx, \qquad S_{Ir} = \int_E^R (x - E) f(x)\,dx. \tag{1} $$

It can be shown that SIl = SIr = SI for any distribution, not necessarily symmetric. Indeed, by the definition of E the integral of (E – x)f(x) over [L, R] vanishes, so that

$$ \int_L^R (E - x) f(x)\,dx = \int_L^E (E - x) f(x)\,dx - \int_E^R (x - E) f(x)\,dx = 0, \tag{2} $$

and hence SIl = SIr = SI. Conveniently, SI has the same dimension as E.

Thus, in the version of the “mean – risk” method considered here, the mathematical expectation E of the random variable X defined on the interval estimate of the quality indicator of the IA is chosen as the measure of preference, and the mean semi-deviation as the risk measure.

The behavior of SI for uniform and triangular distributions of chances on mono-intervals is studied in [12]. The uniform distribution implements the principle of maximum entropy in the absence of any a priori information about the quantity under study. The triangular distribution is quite often used by experts in their practical work as a simple approximation of other unimodal distributions. Let us turn to the properties of this method for poly-interval alternatives.

Two main directions can be distinguished within the poly-interval approach: description by means of the apparatus of fuzzy sets [13], and the formalism of generalized interval estimations (GIE) [14]. Consider first the case of fuzzy poly-interval objects.

3. Method "mean - risk" for poly-interval alternatives: fuzzy objects

Suppose first that we work with triangular membership functions of fuzzy objects, the most common in practice.
The use of the indicators of preference (mean) and risk (mean semi-deviation) in the “mean-risk” method now requires clarification. The simplest way to obtain the desired indicators, in which the membership function is treated as a distribution density, can hardly be considered reasonable. The point of view of [15], according to which numerical characteristics of fuzzy objects that are analogous to the corresponding characteristics of probability theory ought to have interval values, seems more consistent. Since it is desirable to communicate with expert practitioners in their usual language, it is necessary to move from such interval values to one-numeric estimates that characterize them. Defuzzification procedures will be used for this transition.

The triangular membership function is given by a triple (L, T, R) such that L < T < R. On each α-level, 0 ≤ α ≤ 1, it defines the interval I(α) = [L(α), R(α)] with L(α) = L + α(T – L) and R(α) = R – α(R – T). Treating I(α) as a uniform interval estimate, the mean on the α-level is

$$ E(\alpha) = \frac{L(\alpha) + R(\alpha)}{2} = (1 - \alpha) E_U + \alpha T, \tag{3} $$

where EU = (L + R)/2 is the mean of the uniform distribution on [L, R]. The possible values of E(α) lie in the interval [T, EU] if EU > T and in [EU, T] if EU < T.

We will use two methods for defuzzification of E(α). In the first, all I(α) contribute equally to the one-numeric characteristic EF; in the second, the center of gravity method, the contribution of I(α) to the one-numeric characteristic EF1 grows in proportion to α. Then

$$ E_F = \int_0^1 E(\alpha)\,d\alpha, \qquad E_{F1} = 2\int_0^1 \alpha\,E(\alpha)\,d\alpha. \tag{4} $$

Here 2 is the normalizing factor. Then EF = (EU + T)/2 and EF1 = (EU + 2T)/3. Since EF – EF1 = (EU – T)/6, EF is greater than EF1 for EU > T, less for EU < T, and the two coincide for EU = T.

One-numeric characteristics for the mean semi-deviation can be obtained similarly. For each interval estimate on the α-level the mean semi-deviation Sl(α) is

$$ S_l(\alpha) = \int_{L(\alpha)}^{E(\alpha)} \frac{E(\alpha) - x}{\Delta(\alpha)}\,dx = \frac{[E(\alpha) - L(\alpha)]^2}{2\Delta(\alpha)} = \frac{(1 - \alpha)(E_U - L)^2}{2(R - L)} = \frac{(1 - \alpha)(R - L)}{8}, \tag{5} $$

where Δ(α) = R(α) – L(α) = (1 – α)(R – L). It can be seen that the possible values of Sl(α) lie in the interval [0, (R – L)/8].
Defuzzification by the first method now gives for the mean semi-deviation SlF = (R – L)/16, and by the center of gravity method SlF1 = (R – L)/24, i.e. SlF = 3SlF1/2. Thus, while EF can be either greater or less than EF1, SlF is always greater than SlF1. Moreover, while both EF and EF1 depend on the position of the vertex T of the membership function, the mean semi-deviation indicators do not.

The proposed method for constructing one-numeric estimates of interval fuzzy values can also be applied to other one-numeric characteristics, for example to an analogue of the variance of a random variable. For a triangular membership function the variance Var(α) on an α-level is

$$ \mathrm{Var}(\alpha) = \int_{L(\alpha)}^{R(\alpha)} \frac{[x - E(\alpha)]^2}{\Delta(\alpha)}\,dx = \frac{(1 - \alpha)^2 (R - L)^2}{12}. \tag{6} $$

Thus, the possible values of the variance lie in the interval [0, (R – L)²/12]. Defuzzification by the first method gives the one-numeric variance estimate VarF = (R – L)²/36, and by the center of gravity method VarF1 = (R – L)²/72, so VarF = 2VarF1.

This method of constructing one-numeric estimates of interval fuzzy values can be used for other types of membership functions, in particular trapezoidal ones. The trapezoidal membership function is given by a quadruple (L, T1, T2, R) such that L < T1 < T2 < R. On the α-level the interval is I(α) = [L + α(T1 – L), R – α(R – T2)], its length is Δ(α) = (1 – α)(R – L) + α(T2 – T1), and its mean is E(α) = (1 – α)EU + αET, where EU = (L + R)/2 and ET = (T1 + T2)/2. Defuzzification then gives

$$ E_F = \frac{E_U + E_T}{2}, \qquad E_{F1} = \frac{E_U + 2E_T}{3}, \tag{7} $$

so that EF is greater than EF1 for EU – ET > 0 and less otherwise.

Just as above, the possible values of the mean semi-deviations Sl(α) on α-levels lie in the interval [(T2 – T1)/8, (R – L)/8], and the one-numeric estimates are SlF = (R – L + T2 – T1)/16 and SlF1 = [R – L + 2(T2 – T1)]/24. Here the mean semi-deviation indicators depend on the position of the upper corner points of the membership function.
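The two defuzzification rules can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard α-cut of a trapezoid, [L + α(T1 – L), R – α(R – T2)], with a triangle as the special case T1 = T2 = T, a uniform distribution on each α-cut, and a midpoint-rule quadrature. All function names and the sample shapes are hypothetical.

```python
"""Sketch: defuzzified mean, semi-deviation and variance of a fuzzy interval."""


def alpha_cut(alpha, L, T1, T2, R):
    # Assumed alpha-cut of a trapezoidal membership function (T1 == T2: triangle).
    return L + alpha * (T1 - L), R - alpha * (R - T2)


def level_mean(alpha, L, T1, T2, R):
    lo, hi = alpha_cut(alpha, L, T1, T2, R)
    return 0.5 * (lo + hi)


def level_semidev(alpha, L, T1, T2, R):
    # Mean semi-deviation of a uniform distribution on the cut: length / 8.
    lo, hi = alpha_cut(alpha, L, T1, T2, R)
    return (hi - lo) / 8.0


def level_var(alpha, L, T1, T2, R):
    # Variance of a uniform distribution on the cut: length^2 / 12.
    lo, hi = alpha_cut(alpha, L, T1, T2, R)
    return (hi - lo) ** 2 / 12.0


def defuzzify(indicator, shape, weighted=False, n=2000):
    """Midpoint rule for the integral of g(alpha) over [0, 1] (first method)
    or of 2 * alpha * g(alpha) (center of gravity method)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        alpha = (k + 0.5) * h
        weight = 2.0 * alpha if weighted else 1.0
        total += weight * indicator(alpha, *shape)
    return total * h


triangle = (0.0, 3.0, 3.0, 10.0)   # hypothetical (L, T, T, R)
trapezoid = (0.0, 2.0, 4.0, 10.0)  # hypothetical (L, T1, T2, R)

for name, g in [("mean", level_mean), ("semi-deviation", level_semidev),
                ("variance", level_var)]:
    print(name, defuzzify(g, triangle), defuzzify(g, triangle, weighted=True))
print("trapezoid semi-deviation",
      defuzzify(level_semidev, trapezoid),
      defuzzify(level_semidev, trapezoid, weighted=True))
```

For the triangle the printed values can be compared with (EU + T)/2, (EU + 2T)/3, (R – L)/16, (R – L)/24, (R – L)²/36 and (R – L)²/72; the quadrature is used only so that the same routine works for other level indicators.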
The sign of the difference SlF – SlF1 = [(R – L) – (T2 – T1)]/48 is determined by the sign of R – L – (T2 – T1), which is positive because T2 – T1 < R – L; therefore here, as for triangular membership functions, SlF is always greater than SlF1. As above, for the variance Var(α) on the α-level we have:

$$ \mathrm{Var}(\alpha) = \int_{L(\alpha)}^{R(\alpha)} \frac{[x - E(\alpha)]^2}{\Delta(\alpha)}\,dx = \frac{[(1 - \alpha)(R - L) + \alpha (T_2 - T_1)]^2}{12}. \tag{8} $$

Here the possible values of the variance lie in the interval [(T2 – T1)²/12, (R – L)²/12]. Integrating (8) over α, the first method of defuzzification gives the one-numeric variance estimate VarF = [(R – L)² + (R – L)(T2 – T1) + (T2 – T1)²]/36, and the center of gravity method gives VarF1 = [(R – L)² + 2(R – L)(T2 – T1) + 3(T2 – T1)²]/72. Their difference is VarF – VarF1 = [(R – L)² – (T2 – T1)²]/72 > 0, so VarF > VarF1. Note that all relations obtained for the trapezoidal case reduce to the corresponding relations for the triangular membership function at T1 = T2 = T.

4. Method "mean - risk" for poly-interval alternatives: generalized interval estimations

The approach of generalized interval estimations (GIE) is a direct extension of the mono-interval approach to the poly-interval case. In the mono-interval approach the initial point estimate of the analyzed parameter is “blurred”, not necessarily symmetrically, to account for the uncertainty of knowledge about it, and fills a certain interval of possible values of the parameter. The chances of the possible point realizations x of the parameter are described by a distribution function, specified on the carrier interval by the density f(x). In the GIE approach the initial estimate is the interval Iu = [Lu, Ru], and it is blurred in turn, again not necessarily symmetrically, generating as the final parameter estimate a system of intervals with an interval of maximum length Id = [Ld, Rd].
Which intervals are included in the resulting system, delimited by Iu and Id, is determined by the form of the so-called poly-interval estimate (PIE), a curvilinear trapezium containing all the intervals of the system. To specify the chances of realization of the intervals forming the system, a random variable β is introduced. It is placed on the ordinate axis of the two-dimensional plane and has the distribution density f1(β). The value of β serves as a label for the intervals of the system. On each interval labeled β, the chances of the possible point realizations x, placed on the abscissa axis, are described by a conditional distribution with density f2(x|β). Thus a GIE is a PIE on which the density of a joint distribution, f(β, x) = f1(β)f2(x|β), is given. Hence the GIE formalism allows experts to distinguish the mono-intervals included in a PIE by their degree of membership in the system not only through the form of the PIE, which is a certain analogue of the membership function of fuzzy sets, but also through the distribution of chances of their inclusion in the GIE. Note that the distribution of chances on the mono-intervals forming the PIE can be arbitrary, not only uniform.

We will further assume that the sides of the PIE are straight lines and that the estimates are normalized so that 0 ≤ β ≤ 1, with the label β = 0 corresponding to the interval [Ld = L, Rd = R], β = 1 to the interval [Lu, Ru], and Ld < Lu < Ru < Rd. Such configurations arise most often when expert knowledge about the parameters of the problems being solved is represented by GIE.

Triangular PIEs are used quite often in practice. They correspond to the situation in which the initial point estimate T is replaced by the interval system. As above, a triangular PIE is defined by a triple of corner points L, T, R with L < T < R.
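The two-stage construction f(β, x) = f1(β)f2(x|β) on a triangular PIE can be illustrated with a small sampler. This is only a sketch under stated assumptions: both f1 and f2 are taken to be uniform (the formalism allows other choices), the sides of the PIE are straight, and the function name and the sample corner points are hypothetical.

```python
import random


def sample_gie_triangular(L, T, R, n, seed=0):
    """Draw n point realizations x from a triangular PIE with corners L < T < R.

    Assumptions of this sketch: the label beta is uniform on [0, 1) and x is
    uniform on the interval carrying that label.
    """
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        beta = rng.random()           # label of the interval, drawn from f1
        lo = L + beta * (T - L)       # left end of the interval labeled beta
        hi = R - beta * (R - T)       # right end of the interval labeled beta
        xs.append(rng.uniform(lo, hi))  # point realization, drawn from f2(x | beta)
    return xs


# Hypothetical corner points; beta = 0 gives [L, R], beta -> 1 shrinks toward T.
points = sample_gie_triangular(0.0, 3.0, 10.0, n=10_000)
print(min(points), max(points), sum(points) / len(points))
```

Sampling β first and then x mirrors the factorization of the joint density; replacing either `rng.random()` or `rng.uniform` with another generator would model non-uniform chances on the corresponding axis.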
Let, for simplicity, the distributions of chances along both coordinate axes of the PIE be uniform. Then, integrating the joint density over all β on a triangular PIE, we obtain on [L, R], the interval with the label β = 0, the density f(x) of the marginal distribution, namely the density of the generalized uniform distribution (GUD). The GUD on [L, R] is a probabilistic mixture of the uniform distributions f2(x|β) with the mixing density f1(β), which is also uniform. The GUD properties for trapezoidal and, as a special case, triangular PIEs were studied in [5].

Using the results obtained there, the density f(x) of the GUD on the triangular PIE is f(x) = fl(x) for L < x < T, where fl(x) = ln[(T – L)/(T – x)]/(R – L) is the left branch, and f(x) = fr(x) for T < x < R, where fr(x) = ln[(R – T)/(x – T)]/(R – L) is the right branch. Proceeding in the usual way, we get for the mathematical expectation EGU of the GUD: EGU = (L + 2T + R)/4. For a symmetric PIE, when T = (L + R)/2 = EU, EGU = EU. It can be seen that EGU > T for T < EU and EGU < T for T > EU. With this in mind we obtain for T < EU (and, therefore, for EGU > T)

$$ S_I = S_{Ir} = \int_{E_{GU}}^{R} (x - E_{GU})\, f_r(x)\,dx, \tag{9} $$

and for T > EU (and, therefore, for EGU < T)

$$ S_I = S_{Il} = \int_{L}^{E_{GU}} (E_{GU} - x)\, f_l(x)\,dx. \tag{10} $$

Integrating, we get for T < EU

$$ S_I = \frac{(E_{GU} - T)^2 \ln\dfrac{R - T}{E_{GU} - T} + \dfrac{R - E_{GU}}{2}\,[R - E_{GU} - 2(E_{GU} - T)]}{2(R - L)}, \tag{11} $$

and for T > EU

$$ S_I = \frac{(T - E_{GU})^2 \ln\dfrac{T - L}{T - E_{GU}} + \dfrac{E_{GU} - L}{2}\,[E_{GU} - L - 2(T - E_{GU})]}{2(R - L)}. \tag{12} $$

It can be shown that SI(T), as a function of the upper corner point T, is convex downward, decreases monotonically on the interval [L, (L + R)/2] and increases monotonically on the interval [(L + R)/2, R].
The function is symmetric about the vertical axis T = (L + R)/2 = EU; its minimum SImin is reached at the point T = EU, and its maximum SImax at the points T = L and T = R:

$$ S_{I\min} = \frac{R - L}{16}, \qquad S_{I\max} = S_I(L) = S_I(R) = \frac{(R - L)(\ln 4 + 3/2)}{32}. \tag{13} $$

Note that SImin is half the mean semi-deviation SIU = (R – L)/8 of the uniform distribution on the interval estimate of maximum length in the PIE system. When choosing distribution functions to describe their knowledge of interval estimates of quality indicators, experts should keep the following in mind. The transition from the uniform distribution to others, for example to the GUD, means having more knowledge about the object. This leads to a decrease in the risk indicator: even SImax < SIU. Choosing the upper corner point T of the PIE equal to the mean EU of the corresponding uniform distribution results in the lowest value of the risk indicator. Deviations of T from EU in either direction increase the risk indicator. However, from the standpoint of comparing alternatives these deviations are not equivalent. Deviations of T to the right of EU increase the mean EGU, the indicator of preference in the mean-risk approach, and deviations to the left of EU decrease it.

We now present the minimum Kmin and maximum Kmax values of the coefficient of the length parameter R – L in the risk indicator SI for the cases considered here. This information may be useful to experts working with interval estimates. Generalized uniform distributions: Kmin = 1/16 = 0.0625; Kmax = (ln 4 + 3/2)/32 ≈ 0.090. One-numeric estimates for triangular membership functions: for the first method of defuzzification Kmin = Kmax = 1/16 = 0.0625; for defuzzification by the center of gravity method Kmin = Kmax = 1/24 ≈ 0.042. Thus, in the framework of the “mean-risk” method the GIE approach leads to more cautious risk estimates than the “fuzzy” approach.
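The closed forms (11) to (13) can be cross-checked against direct numerical integration of the GUD density. The sketch below is illustrative only: the function names are hypothetical, the quadrature is a plain midpoint rule, and the integration is carried out on the side of EGU that does not contain the logarithmic singularity at x = T, as in (9) and (10).

```python
import math


def gud_mean(L, T, R):
    # Mathematical expectation of the GUD on a triangular PIE.
    return (L + 2.0 * T + R) / 4.0


def gud_density(x, L, T, R):
    # Left and right branches of the GUD density; it is unbounded at x = T.
    if x < T:
        return math.log((T - L) / (T - x)) / (R - L)
    if x > T:
        return math.log((R - T) / (x - T)) / (R - L)
    return math.inf


def gud_semidev(L, T, R):
    """Closed-form mean semi-deviation, formulas (11) and (12)."""
    E = gud_mean(L, T, R)
    if T <= (L + R) / 2.0:
        e, c, tail = E - T, R - T, R - E   # right semi-deviation, (11)
    else:
        e, c, tail = T - E, T - L, E - L   # left semi-deviation, (12)
    log_term = e * e * math.log(c / e) if e > 0 else 0.0
    return (log_term + 0.5 * tail * (tail - 2.0 * e)) / (2.0 * (R - L))


def gud_semidev_numeric(L, T, R, n=100_000):
    """Midpoint-rule evaluation of (9) or (10), for comparison only."""
    E = gud_mean(L, T, R)
    a, b = (E, R) if T <= (L + R) / 2.0 else (L, E)
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += abs(x - E) * gud_density(x, L, T, R)
    return total * h


L, R = 0.0, 10.0  # hypothetical interval of maximum length
for T in (0.0, 2.0, 5.0, 8.0, 10.0):
    print(T, gud_semidev(L, T, R) / (R - L))  # coefficient K of R - L
print(gud_semidev(L, 2.0, R), gud_semidev_numeric(L, 2.0, R))
```

Dividing by R – L gives the coefficient K directly, so the printed values at T = EU and at T = L or T = R can be compared with Kmin and Kmax.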

5. Conclusion

In real problems it is necessary to be able to compare alternatives whose quality indicators are presented as point estimates (the deterministic case), as mono-intervals, or as poly-intervals. Numerous methods have been proposed for comparing fuzzy alternatives; their advantages and disadvantages are analyzed in [16]. However, for comparability of the comparison results, it is advisable to use methods suitable for alternatives of all the above types. The “mean-risk” method and the collective risk estimating method are such methods. Their applicability to poly-interval alternatives was hindered by the lack of methods for calculating one-numeric estimates of their criteria.

New methods of constructing one-numeric estimates for interval quantities of fuzzy objects, which are analogs of such characteristics of probability theory as the mathematical expectation, variance, mean semideviation and others, are proposed in the paper. This makes it possible to use the “mean-risk” method for comparing poly-interval alternatives represented as fuzzy alternatives; so far, there has been no justification for doing so. The method is also extended to the case of generalized interval estimates, a new direction in the representation of knowledge and objects of comparison in the form of poly-interval alternatives.

The advantage of the “mean-risk” method is that both main criterion indicators needed for evaluating such objects, namely the indicator of preference of an alternative and the indicator of the accompanying risk (in particular, the mean semideviation), can be calculated for each individual IA. However, the dependence of risk on the context, i.e. on the fact that other alternatives are present in the compared group and affect the magnitude of the risk, is not taken into account. This is a disadvantage of the method.
Another drawback should be mentioned, one inherent in all downside risk methods whose indicators take into account only the left “tail” of the distribution of chances. Specifically, no comparison is made with the chances of obtaining benefits (the right “tail”), that is, the risk of possible loss of profits is not taken into account.

The disadvantage of the “mean-risk” method associated with the need to take into account collective effects in the group of compared IAs is overcome in the method of “collective risk” estimating. However, this method is not without flaws either. It estimates only the relative effectiveness of an IA, so an alternative recognized as effective in such a comparison may in itself be ineffective (unprofitable). This cannot happen when the “mean-risk” method is used.

Thus, at present no approach or method for choosing the best interval alternatives is superior to the others in terms of the quality of its recommendations for decision-makers. Each has its advantages and disadvantages, and their different interpretations of risk and methods of calculating its indicators complement each other. In this regard, it seems reasonable to attempt to create a comprehensive method for evaluating interval alternatives that combines the merits of the particular methods. At the first stage of the comprehensive method the “mean-risk” method is used to select acceptable IAs, each IA being considered separately from the others in the compared group. At the second stage effective IAs are chosen by the collective risk estimating method within the subgroup of IAs narrowed in this way. To implement such a comprehensive method for poly-interval objects, the method of “collective risk” estimating has to be developed for them. This will be done later.

The paper is partially supported by the Russian Foundation for Basic Research (projects No. 16-29-12864, 17-07-00512, 17-29-07021, 18-07-00280).

References

1. Fishburn, P.C.: Mean-risk analysis with risk associated with below-target returns. American Economic Review, V. 67, pp. 116–126 (1977)
2. Shepelev, G.: Risk behaviour in a set of interval alternatives. International Journal “Information models and analyses”, V. 4, pp. 303–323 (2015)
3. Diligensky, N., Dymova, N., Sevastiyanov, P.: Nechetkoe modelirovanie i mnogokriterialnaya optimizatsiya proizvodstvennykh system v usloviyakh neopredelennosti: tekhnologiya, ekonomika, ekologiya [Fuzzy modeling and multi-criteria optimization of production systems under uncertainty: technology, economy, ecology], Moscow: Engineering-1 Publs. 397 p. (in Russian) (2004)
4. Podinovsky, V.: Chislovye mery riska kak kriterii vybora pri veroyatnostnoi neopredelennosti [Numerical risk measures as selection criteria for probabilistic uncertainty]. In: Iskusstvenny intellekt i prinyatie resheny [Artificial intelligence and decision-making] 2: 60–74 (in Russian) (2015)
5. Shepelyov, G., Sternin, M.: Methods for comparison of alternatives described by interval estimations. International Journal of Business Continuity and Risk Management, V. 2, Issue 1, pp. 56–69 (2011)
6. Roy, A.D.: Safety first and the holding of assets. In: Econometrica, V. 20(3), pp. 431–449 (1952)
7. Bawa, V.S.: Optimal Rules For Ordering Uncertain Prospects. In: Journal of Financial Economics, V. 2(1), pp. 95–121 (1975)
8. Sortino, F.A., van der Meer, R.: Downside Risk. In: Journal of Portfolio Management, V. 17(4), pp. 27–31 (1991)
9. Nawrocki, D.: A Brief History of Downside Risk Measures. In: The Journal of Investing, V. 8(3), pp. 9–25 (1999)
10. Ogryczak, W., Ruszczyński, A.: From stochastic dominance to mean-risk models: semideviations as risk measures. In: European Journal of Operational Research, V. 116, pp. 33–50 (1999)
11. Grechuk, B., Molyboha, A., Zabarankin, M.: Mean-deviation analysis in the theory of choice. In: Risk Analysis, V. 32, pp. 1277–1292 (2012)
12. Shepelev, G.: Decision-making in groups of interval alternatives. In: International Journal “Information theories and applications”, V. 23(4), pp. 303–320 (2016)
13. Dubois, D., Prade, H.: Fuzzy Sets and Systems, Academic Press, New York (1988)
14. Sternin, M., Shepelev, G.: Generalized interval expert estimates in decision making. In: Doklady Mathematics, V. 81(3), pp. 1–2 (2010)
15. Dubois, D., Prade, H.: Théorie des possibilités, Paris: Masson (1988)
16. Dorohonceanu, B., Marin, B.: A simple method for comparing fuzzy numbers. Rutgers University, Piscataway, CAIP Center (2002)