Method for Shortening Dimensionality of Qualitative Attribute Space

Alexey B. Petrovsky (a, b, c, d)

a Federal Research Center “Computer Science and Control”, Russian Academy of Sciences, Moscow, Russia
b V.G. Shukhov Belgorod State Technological University, Belgorod, Russia
c Belgorod State National Research University, Belgorod, Russia
d Volgograd State Technical University, Volgograd, Russia

Abstract
The paper presents a new method, SOCRATES (ShOrtening CRiteria and ATtributES), for reducing the dimensionality of attribute space. In the method, a large number of initial numerical and/or verbal characteristics of objects are aggregated into a single integral index or several composite indicators with small scales of qualitative estimates. Aggregation of indicators includes various methods for transforming attributes and their scales. Multi-attribute objects are represented as multisets of object properties. Reducing the dimensionality of attribute space allows us to simplify the solution of applied problems, in particular, problems of multiple criteria choice, and to explain the obtained results. An illustrative example is given.

Keywords
Attribute space, dimensionality reduction, attributes’ aggregation, composite indicator, integral index, multi-attribute object, multiset, multiple criteria choice

1. Introduction
Tasks of strategic and unique choice, in which there are very few objects and the number of features characterizing their properties is large (tens or hundreds), are among the most difficult. Examples of such objects are places for the construction of an airport or power plant, routes of a gas or oil pipeline, schemes of a transportation network, configurations of a complex technical system, and the like.
For decision-makers (DMs) and experts in real situations, it is very difficult to select the best object, or to rank or classify objects that are described by numerous attributes, because, as a rule, many objects are formally incomparable in their characteristics. Additional difficulties arise in the case of ill-structured problems combining quantitative and qualitative dependencies, the modeling of which is either impossible in principle or very hard. The known decision-making methods [2-7] are poorly suited to multi-criteria selection problems of large dimension because they require a lot of labor and time to obtain and process large amounts of data about objects, DM preferences and/or expert knowledge.

Two approaches facilitate the choice in a large space of attributes and reduce information loss: the use of psychologically correct operations for obtaining information from decision-makers and experts, and the reduction of the dimension of attribute space. It has been experimentally established that it is easier for a person, due to the peculiarities of human memory, to operate with small amounts of data and to compare objects according to a small number of indicators. To do this, it is enough to describe objects with three to seven indicators. At the same time, a person makes fewer mistakes when indicators have verbal rather than numerical scales [4, 5]. Shortening the dimension of attribute space by reducing the number of variables simplifies the solution of problems of individual and group multi-criteria choice.

Russian Advances in Artificial Intelligence: selected contributions to the Russian Conference on Artificial intelligence (RCAI 2020), October 10-16, 2020, Moscow, Russia
EMAIL: pab@isa.ru
ORCID: 0000-0002-5071-0161
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
Almost all applied methods for shortening the space dimension work with numerical attributes [1, 2, 11]. Procedures for shortening the dimension of spaces of verbal attributes are described in [10]. This work describes a new method, SOCRATES (ShOrtening CRiteria and ATtributES), in which a large number of initial characteristics of objects are aggregated into a single integral index or several indicators with small scales of verbal estimates. Representing multi-attribute objects by multisets and aggregating attributes can significantly reduce the complexity of solving the original multi-criteria choice problem and help to explain the results.

2. Representation and comparison of multi-attribute objects
A multiset, or a set with repetitions, is a convenient mathematical model for representing and comparing objects that are defined by many numerical and/or verbal attributes and are presented in several exemplars (versions, copies) that differ in the values of their characteristics [7-10]. This model allows us to take into account simultaneously heterogeneous attributes, possible combinations of attribute values, and the presence of different exemplars of objects.

Let objects O1,...,Oq exist in single copies and be described by attributes K1,…,Kn with numerical and/or verbal rating scales. If each of the attributes K1,…,Kn has the same rating scale X = {x1,...,xh}, then we associate the object Op, p = 1,…,q with a multiset of estimates
Ap = {kAp(x1)◦x1,...,kAp(xh)◦xh} (1)
over the set X = {x1,...,xh} of scale gradations. Here, the value of the multiplicity function kAp(xe) shows how many times the estimate xe∈X, e = 1,…,h is present in the description of the object Op.
If each attribute Ki has its own rating scale Xi = {xi1,…,xihi}, i = 1,…,n, we introduce a single general scale (hyperscale) of attributes, the set X = X1∪...∪Xn = {x11,…,x1h1;…; xn1,…,xnhn}, which consists of n groups of gradations, one per attribute, and combines all the estimate gradations on the scales of all attributes. Then the object Op will correspond to a multiset of estimates
Ap = {kAp(x11)◦x11,…,kAp(x1h1)◦x1h1;…; kAp(xn1)◦xn1,…,kAp(xnhn)◦xnhn}. (2)
Here, the value of the multiplicity function kAp(xiei) shows how many times the estimate xiei∈Xi, ei = 1,…,hi on the attribute Ki is present in the description of the object Op. Expression (2) is easily written in the “usual” form (1) by making the following change of variables in the set X = {x11,…,x1h1;…; xn1,…,xnhn}: x11 = x1,…, x1h1 = xh1, x21 = xh1+1,…, x2h2 = xh1+h2,…, xnhn = xh, where h = h1+...+hn. Although the representation of multi-attribute objects by multisets seems cumbersome, such notation is very convenient when performing operations on objects, since calculations are performed in parallel for all elements of all multisets.

The situation is more complicated when the object Op is present in several copies Op<s>, p = 1,…,q, s = 1,…,t, which differ in the values of the attributes K1,…,Kn. Different versions of the description of the object Op arise, for example, when the object is evaluated by t experts according to many criteria K1,…,Kn, or the characteristics of the object are calculated t times by several methods K1,…,Kn, or measured t times by several tools K1,…,Kn. A variety of operations on multisets provides the ability to group multi-attribute objects in different ways [9, 10].
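As a concrete reading of expression (2) and of the change of variables that turns it into form (1), the following minimal Python sketch stores a multiset as a flat vector of multiplicities over the hyperscale. The function names and the three-attribute object are hypothetical and serve only as an illustration; they are not part of the method's description.

```python
from itertools import accumulate


def hyperscale_offsets(scale_sizes):
    """Position in the flat hyperscale X where the block of each attribute starts.

    With scale sizes h1, ..., hn the blocks start at 0, h1, h1+h2, ...,
    which is the change of variables x_i,e -> x_(h1+...+h(i-1)+e).
    """
    return [0] + list(accumulate(scale_sizes))[:-1]


def to_multiset(estimates, scale_sizes):
    """Multiplicity vector (k(x_1), ..., k(x_h)) of a single-copy object.

    estimates[i] is the 1-based gradation number e_i chosen on the scale
    of attribute K_(i+1).
    """
    offsets = hyperscale_offsets(scale_sizes)
    counts = [0] * sum(scale_sizes)
    for i, e in enumerate(estimates):
        if not 1 <= e <= scale_sizes[i]:
            raise ValueError(f"gradation {e} is outside the scale of attribute {i + 1}")
        counts[offsets[i] + e - 1] += 1
    return counts


# Hypothetical object: three attributes whose scales have 2, 3 and 2 gradations (h = 7).
sizes = [2, 3, 2]
A = to_multiset((1, 3, 2), sizes)
print(hyperscale_offsets(sizes))  # [0, 2, 5]
print(A)                          # [1, 0, 0, 0, 1, 0, 1]
```

Storing multiplicities in a fixed order over the hyperscale is one possible design; it makes element-wise operations on several multisets straightforward.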
A group of objects can be formed by defining the multiset J that represents the group as the sum J = ∑s As, kJ(xe) = ∑s kAs(xe); as the union J = ∪s As, kJ(xe) = maxs kAs(xe); as the intersection J = ∩s As, kJ(xe) = mins kAs(xe) of the multisets As that describe the grouped objects; or as one of the linear combinations of operations on multisets As, such as J = ∑s csAs, J = ∪s csAs, J = ∩s csAs, where cs > 0 is an integer. When we add multisets, all properties (all values of all attributes) of the individual objects in the group are aggregated. When we unite or intersect multisets, the best properties (maximum values of all attributes) or, correspondingly, the worst properties (minimum values of all attributes) of the grouped objects are strengthened.

If there are several versions of the object Op, then this object is represented by a group of all its copies Op<s>, s = 1,…,t. We associate the object Op with the multiset Ap = {kAp(x1)◦x1,..., kAp(xh)◦xh} of the form (1), (2), and the version Op<s> with the multiset Ap<s> = {kAp<s>(x1)◦x1,..., kAp<s>(xh)◦xh} over the set of estimates X = {x1,...,xh} or X = {x11,…,x1h1;…; xn1,…,xnhn}. We form the multiset Ap as a weighted sum of the multisets describing the versions of the object: Ap = c<1>Ap<1> +…+ c<t>Ap<t>, where the multiplicity function of the multiset Ap is calculated by the rule kAp(xe) = ∑s c<s>kAp<s>(xe), and the coefficient c<s> characterizes the significance (expert competence, measurement accuracy) of the exemplar Op<s>.

3. Reduction of attribute scales
The reduction of the dimension of the object description is a decrease in the number of indicators that characterize the properties, state or functioning of objects. It is achieved by transformations of the initial data, during which the set of initial attributes K1,…,Kn is aggregated into smaller sets of new intermediate attributes L1,…,Lm,… and final attributes N1,…,Nl. The transformations of characteristics can be formally written as
K1,…,Kn → L1,…,Lm →…→ N1,…,Nl. (3)
The initial attribute Ki has the scale Xi = {xi1,…,xihi}, i = 1,…,n; the intermediate attribute Lj has the scale Yj = {yj1,…,yjgj}, j = 1,…,m; the final attribute Nk has the scale Zk = {zk1,…,zkfk}, k = 1,…,l, l < m < n. Decreasing the dimension of attribute space is an informal multi-stage procedure based on the knowledge, experience and intuition of a DM/expert, who forms the rules for the conversion of attributes and establishes the structure, number, dimension and meaning of the new indicators.

When multi-attribute objects are represented by vectors/tuples, the task (3) of reducing the dimension of attribute space has the form
X1×…×Xn → Y1×…×Ym →…→ Z1×…×Zl. (4)
Then the dimension of the corresponding attribute space is defined as the cardinality of the direct product of the numerical or verbal gradations of the attribute scales that are components of vectors/tuples. In the book [10], problem (4) is considered as a multi-criteria classification problem, where the sets of estimates of the initial attributes are the classified objects, and the grades of the composite indicator scale are the classes of solutions [5, 7].

When multi-attribute objects are represented by multisets, the task (3) of reducing the dimension of attribute space has the form
X1∪…∪Xn → Y1∪…∪Ym →…→ Z1∪…∪Zl. (5)
Then the dimension of the corresponding attribute space is defined as the cardinality of the hyperscale, that is, of the union of the numerical or verbal gradations of the attribute scales that are elements of multisets.

The method SOCRATES (ShOrtening CRiteria and ATtributES) allows us to reduce the dimension of the description of multi-attribute objects that are presented in several different versions and are defined by multisets of numerical and/or verbal characteristics. The method uses two principal kinds of data transformation: the reduction of attribute scales and the aggregation of attributes.
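To make the two notions of dimension in (4) and (5) concrete, the short Python sketch below compares the size of the direct product of scales with the size of the hyperscale. It assumes that the scales of different attributes share no gradations, so that the cardinality of the union is the sum of the scale sizes. The function names are hypothetical, and the sizes (eight attributes with five or three gradations each) are those of the illustrative example in Section 5.

```python
from math import prod


def tuple_space_size(scale_sizes):
    """|X1 x ... x Xn|: number of distinct tuples of estimates, as in (4)."""
    return prod(scale_sizes)


def hyperscale_size(scale_sizes):
    """|X1 U ... U Xn| as in (5), assuming the scales are pairwise disjoint."""
    return sum(scale_sizes)


five_point = [5] * 8   # eight attributes, five gradations each
three_point = [3] * 8  # the same attributes on shortened three-point scales

print(tuple_space_size(five_point), hyperscale_size(five_point))    # 390625 40
print(tuple_space_size(three_point), hyperscale_size(three_point))  # 6561 24
```

The product grows exponentially with the number of attributes, while the hyperscale grows only linearly, which is one way to see why the multiset representation stays manageable.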
The reduction of an attribute scale is a relatively simple transformation of attribute space aimed at reducing the number of gradations on the attribute scale. For this, several values of some object characteristic are combined into one new gradation of the same characteristic. The transition from the original characteristic scales to scales with a reduced number of grades is the transformation (5) in the form
X1∪…∪Xn → Q1∪…∪Qn, (6)
where Xi = {xi1,…,xihi} is the original scale, Qi = {qi1,…,qidi} is the shortened scale of the attribute Ki, |Qi| = di < hi = |Xi|, i = 1,…,n. When forming the shortened scales (6) of attributes, it is desirable that they consist of a small number (2-4) of gradations that are meaningful to a DM/expert.

The representation of multi-attribute objects is transformed as follows. In the attribute space K1,…,Kn, let the object Op, p = 1,…,q be defined by the multiset Ap (2) over the set X1∪…∪Xn of gradations of the original scales. Considering the properties of operations on multisets [9, 10], we rewrite expression (2) as a sum of multisets:
Ap = Ap1 +…+ Apn = {kAp(x11)◦x11,…,kAp(x1h1)◦x1h1} +…+ {kAp(xn1)◦xn1,…,kAp(xnhn)◦xnhn} = ∑_{e1=1}^{h1} {kAp(x1e1)◦x1e1} +…+ ∑_{en=1}^{hn} {kAp(xnen)◦xnen}. (7)
When reducing the attribute scale, several gradations xiea, xieb,…, xiec of the original scale Xi = {xi1,…,xihi} of the attribute Ki are combined into a single gradation qioi of the shortened scale Qi = {qi1,…,qidi}. In the reduced space of attributes K1,…,Kn with rating scales Q1,…,Qn, the object Op will correspond to a multiset
Bp = {kBp(q11)◦q11,…,kBp(q1d1)◦q1d1;…; kBp(qn1)◦qn1,…,kBp(qndn)◦qndn} (8)
over the set Q1∪…∪Qn of gradations of the shortened scales. The multiset Bp (8) can also be written in the equivalent form:
Bp = Bp1 +…+ Bpn = {kBp(q11)◦q11,…,kBp(q1d1)◦q1d1} +…+ {kBp(qn1)◦qn1,…,kBp(qndn)◦qndn} = ∑_{o1=1}^{d1} {kBp(q1o1)◦q1o1} +…+ ∑_{on=1}^{dn} {kBp(qnon)◦qnon}. (9)
The multiplicity of the element qioi, oi = 1,…,di of the multiset Bp (8) or (9), which corresponds to the gradation qioi of the shortened scale Qi = {qi1,…,qidi}, is determined by the rule:
kBp(qioi) = kAp(xiea) + kAp(xieb) +…+ kAp(xiec), (10)
where the multiplicities of the elements xiea, xieb,…, xiec of the multiset Ap (2) or (7), which correspond to the combined gradations of the original scale Xi = {xi1,…,xihi} of the attribute Ki, are summed.

4. Aggregation of attributes
The aggregation of attributes is a more complex transformation of attribute space aimed at reducing the number of attributes. For this, several attributes La, Lb,…,Lc are combined into a single new attribute Nk, which we will call a composite indicator or a composite criterion. The aggregation of several attributes into a composite indicator is the transformation (5) that takes the form
Ya∪Yb∪…∪Yc → Zk, (11)
where Yj = {yj1,…,yjgj} is the scale of the original attribute Lj, j = a,b,…,c, Zk = {zk1,…,zkfk} is the scale of the composite indicator Nk, k = 1,…,l, |Zk| = fk < ga + gb +…+ gc = |Ya∪Yb∪…∪Yc|.

Sets of composite indicators and their scales can be formed by different methods (11) of granulation. This allows us to represent each gradation of the composite indicator scale as a combination of estimate gradations of the initial attributes. It is recommended to combine 2-4 original attributes into a composite indicator with a small scale of 2-4 gradations. In practical tasks, it is convenient to form the scales of the combined attributes and the composite indicator so that they have the same number of gradations, that is, ga = gb =…= gc = fk = d, and each gradation of the scale of the composite indicator consists of similar gradations of the combined attribute scales.

The representation of multi-attribute objects during the attributes’ aggregation is transformed as follows.
In the space of original attributes L1,…,Lm, let the object Op, p = 1,…,q be defined by a multiset
Ip = {kIp(y11)◦y11,…,kIp(y1d)◦y1d;…; kIp(ym1)◦ym1,…,kIp(ymd)◦ymd} (12)
over the set Y1∪…∪Ym of scale gradations, where all scales Yj = {yj1,…,yjd}, j = 1,…,m have the same number d of gradations. Taking into account that the order of elements in a multiset is inconsequential, we rewrite expression (12) as a sum of multisets:
Ip = Ip1 +…+ Ipd = {kIp(y11)◦y11,…,kIp(ym1)◦ym1} +…+ {kIp(y1d)◦y1d,…,kIp(ymd)◦ymd} = ∑_{j=1}^{m} {kIp(yj1)◦yj1} +…+ ∑_{j=1}^{m} {kIp(yjd)◦yjd}. (13)
In the reduced space of composite indicators N1,…,Nl, the object Op will correspond to a multiset
Jp = {kJp(z11)◦z11,…,kJp(z1d)◦z1d;…; kJp(zl1)◦zl1,…,kJp(zld)◦zld} (14)
over the set Z1∪…∪Zl of scale gradations, where all scales Zk = {zk1,…,zkd}, k = 1,…,l have the same number d of gradations. The multiset Jp (14) can also be rewritten in the equivalent form:
Jp = Jp1 +…+ Jpd = {kJp(z11)◦z11,…,kJp(zl1)◦zl1} +…+ {kJp(z1d)◦z1d,…,kJp(zld)◦zld} = ∑_{k=1}^{l} {kJp(zk1)◦zk1} +…+ ∑_{k=1}^{l} {kJp(zkd)◦zkd}. (15)
The multiplicity of the element zke, e = 1,…,d of the multiset Jp (14) or (15), which corresponds to the gradation zke of the scale Zk of the composite indicator Nk, is determined by the rule:
kJp(zke) = kIp(yae) + kIp(ybe) +…+ kIp(yce), (16)
where the multiplicities of the elements yae, ybe,…, yce of the multiset Ip (12) or (13), which correspond to the gradations yae, ybe,…, yce of the scales Ya, Yb,…, Yc of the combined attributes La, Lb,…, Lc, are summed.

The aggregation of attributes is carried out in stages, step by step. At each stage, it is determined which initial attributes should be combined into composite indicators, and which should be considered as independent final indicators. Verbal scales of composite indicators characterize desirable new properties of the compared objects and have a specific semantic content for a decision maker/expert.
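The aggregation rule (16) amounts to adding, gradation by gradation, the multiplicities of the attributes that are combined. A minimal Python sketch follows, assuming all scales have the same number d of gradations as in (12)-(15). The function name, the dictionary layout and the numbers are hypothetical illustrations.

```python
def aggregate(counts, groups, d):
    """Apply rule (16) to one object.

    counts: dict mapping an attribute name to its list of d multiplicities.
    groups: dict mapping a composite indicator name to the list of attribute
            names combined into it.
    Returns a dict mapping each composite indicator to its d multiplicities.
    """
    return {
        indicator: [sum(counts[attr][e] for attr in members) for e in range(d)]
        for indicator, members in groups.items()
    }


# Hypothetical object described by three attributes with three gradations each.
I_p = {"La": [2, 0, 0], "Lb": [1, 1, 0], "Lc": [0, 1, 1]}
J_p = aggregate(I_p, {"Nk": ["La", "Lb", "Lc"]}, d=3)
print(J_p)  # {'Nk': [3, 2, 1]}
```

Because the rule only adds multiplicities, the total number of estimates is the same before and after aggregation (six in this illustration).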
By successively combining the attributes, a DM/expert designs acceptable intermediate and final indicators. The tree of attribute aggregation is built from uniform blocks that are selected by a DM/expert and, in fact, is a form of semantic interpretation and granulation of the DM's preferences and/or expert knowledge. In practical situations, it is recommended to design several different schemes of combining indicators that include the procedures for reducing the attribute scales and aggregating attributes. Thus, the influence of each particular scheme is reduced and the validity of the results is increased. Depending on the specifics of the practical problem being solved, the last level of the attribute aggregation tree may consist of several final indicators that implement the idea of multi-criteria choice, or it may be a single integral index that implements the idea of holistic choice [7].

5. Illustrative example
Consider an illustrative example taken from [9, 10]. There are ten objects O1,...,O10, which are described by eight attributes K1,…,K8 with five-point rating scales Xi = {xi1, xi2, xi3, xi4, xi5}, i = 1,…,8. Let the objects be pupils, and the attributes be estimates in the studied subjects: K1, Mathematics; K2, Physics; K3, Chemistry; K4, Biology; K5, Geography; K6, History; K7, Literature; K8, Foreign language. Gradations of the rating scales can be numerical or verbal and mean: xi1 is 1/very bad; xi2 is 2/bad; xi3 is 3/satisfactory; xi4 is 4/good; xi5 is 5/excellent.

Suppose that estimates are given twice a year, once for each half-year (semester). Then each object is present in two versions (copies) that differ from each other. We represent the semi-annual estimates of the pupil Op, p = 1,…,10 (two versions Op<1>, Op<2> of the object Op) by the multisets Ap<1>, Ap<2> of the form (2) over the set X = X1∪...∪X8 of gradations of the scales of the attributes K1,…,K8.
Let us define the annual estimates of the pupil Op by the multiset Ap, which we form as the weighted sum of the multisets describing the versions of the object: Ap = c<1>Ap<1> + c<2>Ap<2>. Assuming that the semi-annual estimates are equally significant, c<1> = c<2> = 1, we obtain a multiset
Ap = {kAp(x11)◦x11,…,kAp(x15)◦x15;…; kAp(x81)◦x81,…,kAp(x85)◦x85}, (17)
the element multiplicities of which form the rows of the matrix H ‘Objects–Attributes’ (Table 1). For example, the annual estimates of the pupil O1 correspond to a multiset
A1 = {0◦x11, 0◦x12, 0◦x13, 1◦x14, 1◦x15; 0◦x21, 0◦x22, 0◦x23, 0◦x24, 2◦x25; 0◦x31, 0◦x32, 0◦x33, 1◦x34, 1◦x35; 0◦x41, 0◦x42, 0◦x43, 0◦x44, 2◦x45; 0◦x51, 0◦x52, 0◦x53, 2◦x54, 0◦x55; 0◦x61, 0◦x62, 0◦x63, 1◦x64, 1◦x65; 0◦x71, 0◦x72, 0◦x73, 2◦x74, 0◦x75; 0◦x81, 0◦x82, 0◦x83, 0◦x84, 2◦x85}.
This notation shows that, over the year, the pupil O1 received one estimate “good” and one estimate “excellent” in mathematics, chemistry, and history; two estimates “excellent” in physics, biology, and a foreign language; and two estimates “good” in geography and literature. The pupil O1 did not receive other estimates.
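The weighted sum of versions can be illustrated in a few lines of Python. The paper gives only the annual multiset A1, not the two semester multisets, so the semester marks below are a hypothetical split chosen to be consistent with A1; the function names are likewise assumptions made for this sketch.

```python
def marks_to_multiset(marks, n_grades=5):
    """Multiplicity vector over the hyperscale for one version of an object.

    marks[i] is the mark (1..n_grades) in subject K_(i+1); each subject
    contributes one block of n_grades multiplicities.
    """
    counts = []
    for mark in marks:
        block = [0] * n_grades
        block[mark - 1] = 1
        counts.extend(block)
    return counts


def weighted_sum(multisets, weights):
    """k_A(x) = sum over versions s of c<s> * k_A<s>(x), element by element."""
    return [sum(c * k for c, k in zip(weights, column)) for column in zip(*multisets)]


# Hypothetical semester marks of pupil O1 in K1..K8 (not given in the paper).
semester_1 = [4, 5, 4, 5, 4, 4, 4, 5]
semester_2 = [5, 5, 5, 5, 4, 5, 4, 5]

A1 = weighted_sum(
    [marks_to_multiset(semester_1), marks_to_multiset(semester_2)],
    weights=[1, 1],  # equally significant semesters, c<1> = c<2> = 1
)
print(A1[:5])   # multiplicities for K1, Mathematics: [0, 0, 0, 1, 1]
print(sum(A1))  # 16 estimates in total
```

With unequal weights, for example if one semester were considered twice as significant, only the `weights` argument would change.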
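Table 1
Matrix H ‘Objects–Attributes’ (initial scales of attributes)

| O\X | x11 | x12 | x13 | x14 | x15 | x21 | x22 | x23 | x24 | x25 | x31 | x32 | x33 | x34 | x35 | x41 | x42 | x43 | x44 | x45 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 |
| A2 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 |
| A3 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 | 0 |
| A4 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| A5 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 |
| A6 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 0 |
| A7 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| A8 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 |
| A9 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
| A10 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |

| O\X | x51 | x52 | x53 | x54 | x55 | x61 | x62 | x63 | x64 | x65 | x71 | x72 | x73 | x74 | x75 | x81 | x82 | x83 | x84 | x85 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 |
| A2 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
| A3 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| A4 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
| A5 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 |
| A6 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 |
| A7 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| A8 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 |
| A9 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
| A10 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 |

The dimension of the attribute space is equal to |X| = 5⋅8 = 40. The total number of object representations by elements of multisets (possible estimates in all studied subjects) is equal to the cardinality of the multiset Ap (17): card Ap = ∑_{x∈X} kAp(x) = 16. Objects and multisets are generally incomparable.

Replace the initial five-point rating scales Xi = {xi1, xi2, xi3, xi4, xi5} with the shortened three-point rating scales Qi = {qi0, qi1, qi2}. Here qi0 is 0/high mark, including the estimates xi5 (5/excellent) and xi4 (4/good); qi1 is 1/middle mark, corresponding to the estimate xi3 (3/satisfactory); qi2 is 2/low mark, including the estimates xi2 (2/bad) and xi1 (1/very bad). If the initial gradations were ordered by preference, for example, xi5 ≻ xi4 ≻ xi3 ≻ xi2 ≻ xi1, then the new gradations will also be ordered as qi0 ≻ qi1 ≻ qi2.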
In the transition from the five-point rating scales Xi to the three-point rating scales Qi, i = 1,…,8, the object Op will correspond to a multiset
Bp = {kBp(q10)◦q10, kBp(q11)◦q11, kBp(q12)◦q12;…; kBp(q80)◦q80, kBp(q81)◦q81, kBp(q82)◦q82} (18)
over the set Q = Q1∪...∪Q8 of shortened gradations of the scales of the attributes K1,…,K8. The multiplicities of the elements of the multiset Bp (18) form the rows of the matrix H0 ‘Objects–Attributes’ (Table 2), which is a reduced matrix H (Table 1), and are determined by the rules (10): kBp(qi0) = kAp(xi5) + kAp(xi4), kBp(qi1) = kAp(xi3), kBp(qi2) = kAp(xi2) + kAp(xi1). In particular, the object O1 is defined by a multiset
B1 = {2◦q10, 0◦q11, 0◦q12; 2◦q20, 0◦q21, 0◦q22; 2◦q30, 0◦q31, 0◦q32; 2◦q40, 0◦q41, 0◦q42; 2◦q50, 0◦q51, 0◦q52; 2◦q60, 0◦q61, 0◦q62; 2◦q70, 0◦q71, 0◦q72; 2◦q80, 0◦q81, 0◦q82}.
This shows that, over a year, the pupil O1 received two high marks (estimates “excellent” and “good”) in each of the studied subjects: mathematics, physics, chemistry, biology, geography, history, literature, and a foreign language.

Table 2
Matrix H0 ‘Objects–Attributes’ (shortened scales of attributes)

| O\Q | q10 | q11 | q12 | q20 | q21 | q22 | q30 | q31 | q32 | q40 | q41 | q42 | q50 | q51 | q52 | q60 | q61 | q62 | q70 | q71 | q72 | q80 | q81 | q82 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B1 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 |
| B2 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2 |
| B3 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 1 | 1 | 0 |
| B4 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 |
| B5 | 2 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 |
| B6 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 |
| B7 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 1 | 1 |
| B8 | 2 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 |
| B9 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 2 | 0 |
| B10 | 1 | 1 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 1 | 0 | 1 | 2 | 0 | 0 |

The dimension of the reduced attribute space is equal to |Q| = 3⋅8 = 24, and the total number of object representations is expressed by the cardinality of the multiset Bp (18): card Bp = ∑_{q∈Q} kBp(q) = 16.
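The scale reduction by rule (10) can be checked mechanically. The following Python sketch, in which the function name and the flat list layout are assumptions made for illustration, merges the five-point gradations of each attribute into three and applies the mapping to the row of pupil O4 taken from Table 1.

```python
# Where each original gradation x_i1..x_i5 goes on the shortened scale:
# x_i1, x_i2 -> q_i2 (low), x_i3 -> q_i1 (middle), x_i4, x_i5 -> q_i0 (high).
MERGE = [2, 2, 1, 0, 0]


def reduce_scales(counts, merge, n_attrs):
    """Rule (10): add the multiplicities of the gradations merged together.

    counts is a flat list with len(merge) multiplicities per attribute;
    the result has max(merge) + 1 multiplicities per attribute.
    """
    h, d = len(merge), max(merge) + 1
    reduced = [0] * (d * n_attrs)
    for i in range(n_attrs):
        for e in range(h):
            reduced[i * d + merge[e]] += counts[i * h + e]
    return reduced


# Row A4 of Table 1 (attributes K1..K8, gradations x_i1..x_i5).
A4 = [0, 0, 0, 1, 1,  0, 0, 1, 1, 0,  0, 1, 1, 0, 0,  0, 0, 0, 1, 1,
      0, 0, 0, 2, 0,  0, 0, 0, 0, 2,  0, 0, 1, 1, 0,  0, 0, 0, 1, 1]

B4 = reduce_scales(A4, MERGE, n_attrs=8)
print(B4)                 # row B4 of Table 2
print(len(A4), len(B4))   # 40 24
print(sum(A4), sum(B4))   # 16 16
```

The dimension drops from 40 to 24 while the cardinality stays at 16, in agreement with the statement about Table 2.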
When the attribute scales are reduced, the dimension of the converted space decreases, while the total number of estimates in the studied subjects does not change. Objects and multisets are still largely incomparable, but working with them becomes easier. We shall consider the transition from the original scales Xi to the shortened scales Qi as the zero scheme for aggregating attributes.

To represent objects in reduced attribute spaces, we shall build other collections of indicators with different schemes of aggregating characteristics (Fig. 1). For simplicity, we assume that the scale of any new attribute has three gradations of estimates, like the scale Qi. Each gradation of the scale of a composite indicator includes combinations of the same type of gradations on the scales of the initial attributes.

[Figure 1: Aggregation of initial characteristics into composite indicators: (a) the first scheme, (b) the second scheme, (c) the third scheme, (d) the fourth scheme.]

According to the first aggregation scheme (Fig. 1, a), all the initial attributes K1,…,K8 with scales Qi = {qi0, qi1, qi2} are combined into composite indicators, which are considered as final. The attributes K1, Mathematics, and K2, Physics form a composite indicator L1, Physical-Mathematical disciplines: L1 = (K1, K2). The attributes K3, Chemistry, and K4, Biology form a composite indicator L2, Chemical-Biological disciplines: L2 = (K3, K4). The attributes K5, Geography, and K6, History form a composite indicator L3, Socio-Historical disciplines: L3 = (K5, K6). The attributes K7, Literature, and K8, Foreign language form a composite indicator L4, Philological disciplines: L4 = (K7, K8).
The composite indicators L1,…,L4 have scales Yj = {yj0, yj1, yj2}, j = 1,2,3,4 with the following verbal gradations: yj0 is 0/high mark, including the estimates qa0, qc0; yj1 is 1/middle mark, including the estimates qa1, qc1; yj2 is 2/low mark, including the estimates qa2, qc2. Here a = 1, c = 2 for j = 1; a = 3, c = 4 for j = 2; a = 5, c = 6 for j = 3; a = 7, c = 8 for j = 4. Each object Op is represented by a multiset
Cp = {kCp(y10)◦y10, kCp(y11)◦y11, kCp(y12)◦y12;…; kCp(y40)◦y40, kCp(y41)◦y41, kCp(y42)◦y42} (19)
over the set Y = Y1∪...∪Y4 of gradations of the scales of the indicators L1,…,L4. The multiplicities of the elements of the multiset Cp (19) form the rows of the matrix H1 ‘Objects–Attributes’ (Table 3) and are determined by rule (16) for forming the scales of the composite indicators L1,…,L4 from the scales of the attributes K1,…,K8. In particular, the object O1 is defined by a multiset
C1 = {4◦y10, 0◦y11, 0◦y12; 4◦y20, 0◦y21, 0◦y22; 4◦y30, 0◦y31, 0◦y32; 4◦y40, 0◦y41, 0◦y42}.
This shows that, over a year, the pupil O1 received four high marks in each group of disciplines: physical-mathematical, chemical-biological, socio-historical and philological.

Table 3
Matrix H1 ‘Objects–Attributes’ (the first aggregation scheme)

| O\Y | y10 | y11 | y12 | y20 | y21 | y22 | y30 | y31 | y32 | y40 | y41 | y42 | l(Op) | s(Op) | p(Op) | b(Op) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C1 | 4 | 0 | 0 | 4 | 0 | 0 | 4 | 0 | 0 | 4 | 0 | 0 | 0.000 | 48 | 1-2 | 25.5 |
| C2 | 1 | 1 | 2 | 0 | 0 | 4 | 1 | 2 | 1 | 0 | 1 | 3 | 0.700 | 24 | 9 | 1 |
| C3 | 0 | 0 | 4 | 0 | 2 | 2 | 2 | 0 | 2 | 1 | 1 | 2 | 0.684 | 25 | 8 | 4 |
| C4 | 3 | 1 | 0 | 2 | 1 | 1 | 4 | 0 | 0 | 3 | 1 | 0 | 0.211 | 43 | 4-5 | 16.5 |
| C5 | 4 | 0 | 0 | 3 | 1 | 0 | 4 | 0 | 0 | 4 | 0 | 0 | 0.059 | 47 | 3 | 21 |
| C6 | 4 | 0 | 0 | 4 | 0 | 0 | 4 | 0 | 0 | 4 | 0 | 0 | 0.000 | 48 | 1-2 | 25.5 |
| C7 | 1 | 1 | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 0 | 1 | 3 | 0.565 | 27 | 7 | 7.5 |
| C8 | 4 | 0 | 0 | 2 | 1 | 1 | 3 | 1 | 0 | 3 | 1 | 0 | 0.211 | 43 | 4-5 | 16.5 |
| C9 | 1 | 2 | 1 | 0 | 1 | 3 | 0 | 3 | 1 | 0 | 3 | 1 | 0.556 | 27 | 10 | 5.5 |
| C10 | 3 | 1 | 0 | 3 | 1 | 0 | 2 | 1 | 1 | 3 | 0 | 1 | 0.253 | 41 | 6 | 12 |

According to the second aggregation scheme (Fig. 1, b), the first stage is the same as in the first scheme. In the next stage, the indicators L1, Physical-Mathematical disciplines and L2, Chemical-Biological disciplines form a composite indicator M1, Natural disciplines: M1 = (L1, L2).
The indicators L3, Socio-Historical disciplines and L4, Philological disciplines form a composite indicator M2, Humanitarian disciplines: M2 = (L3, L4). The composite indicators M1, M2 are considered as final and have scales Ur = {ur0, ur1, ur2}, r = 1,2 with the following verbal gradations: ur0 is 0/high mark, including the estimates yb0, yd0; ur1 is 1/middle mark, including the estimates yb1, yd1; ur2 is 2/low mark, including the estimates yb2, yd2. Here b = 1, d = 2 for r = 1; b = 3, d = 4 for r = 2. Each object Op is represented by a multiset
Dp = {kDp(u10)◦u10, kDp(u11)◦u11, kDp(u12)◦u12; kDp(u20)◦u20, kDp(u21)◦u21, kDp(u22)◦u22} (20)
over the set U = U1∪U2 of gradations of the scales of the indicators M1, M2. The multiplicities of the elements of the multiset Dp (20) form the rows of the matrix H2 ‘Objects–Attributes’ (Table 4) and are determined by rule (16) for forming the scales of the composite indicators M1, M2 from the scales of the indicators L1,…,L4. In particular, the object O1 is defined by a multiset D1 = {8◦u10, 0◦u11, 0◦u12; 8◦u20, 0◦u21, 0◦u22}. This shows that, over a year, the pupil O1 received eight high marks in natural disciplines and eight high marks in humanitarian disciplines.

According to the third aggregation scheme (Fig. 1, c), the first and second stages are the same as in the second scheme. In the next stage, the indicators M1, Natural disciplines and M2, Humanitarian disciplines form a final integral index N1, Academic score: N1 = (M1, M2), which has a scale Z1 = {z10, z11, z12} with the following verbal gradations: z10 is 0/high mark, including the estimates u10, u20; z11 is 1/middle mark, including the estimates u11, u21; z12 is 2/low mark, including the estimates u12, u22. Each object Op is represented by a multiset
Ep = {kEp(z10)◦z10, kEp(z11)◦z11, kEp(z12)◦z12} (21)
over the set Z1 of gradations of the scale of the indicator N1.
The multiplicities of the elements of the multiset Ep (21) form the rows of the matrix H3 ‘Objects–Attributes’ (Table 4) and are determined by rule (16) for forming the scale of the composite indicator N1 from the scales of the indicators M1, M2. In particular, the object O1 is defined by a multiset E1 = {16◦z10, 0◦z11, 0◦z12}. This shows that, over a year, the pupil O1 received sixteen high marks in all studied disciplines.

According to the fourth aggregation scheme (Fig. 1, d), the first stage is the same as in the first scheme. In the next stage, the indicators L1, Physical-Mathematical disciplines, L2, Chemical-Biological disciplines, L3, Socio-Historical disciplines, and L4, Philological disciplines form a final integral index N2, Academic score: N2 = (L1, L2, L3, L4), which has a scale Z2 = {z20, z21, z22} with the following verbal gradations: z20 is 0/high mark, including the estimates y10, y20, y30, y40; z21 is 1/middle mark, including the estimates y11, y21, y31, y41; z22 is 2/low mark, including the estimates y12, y22, y32, y42. Each object Op is represented by a multiset
Fp = {kFp(z20)◦z20, kFp(z21)◦z21, kFp(z22)◦z22} (22)
over the set Z2 of gradations of the scale of the indicator N2. The multiplicities of the elements of the multiset Fp (22) form the rows of the matrix H4 ‘Objects–Attributes’ (Table 4) and are determined by rule (16) for forming the scale of the composite indicator N2 from the scales of the indicators L1, L2, L3, L4. In particular, the object O1 is defined by a multiset F1 = {16◦z20, 0◦z21, 0◦z22}. This shows that, over a year, the pupil O1 received sixteen high marks in all studied disciplines.
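The four aggregation schemes differ only in which indicators are added together at each stage, so they can be traced for a single object in a few lines of Python. The sketch below starts from the row of pupil O2 in Table 2; the helper `add` and the dictionary layout are hypothetical conveniences, not part of the method.

```python
def add(*vectors):
    """Rule (16): add multiplicities gradation by gradation."""
    return [sum(column) for column in zip(*vectors)]


# Row B2 of Table 2: multiplicities (high, middle, low) per attribute K1..K8.
B2 = {"K1": [1, 1, 0], "K2": [0, 0, 2], "K3": [0, 0, 2], "K4": [0, 0, 2],
      "K5": [1, 1, 0], "K6": [0, 1, 1], "K7": [0, 1, 1], "K8": [0, 0, 2]}

# First scheme: pairs of subjects form L1..L4.
L = {"L1": add(B2["K1"], B2["K2"]), "L2": add(B2["K3"], B2["K4"]),
     "L3": add(B2["K5"], B2["K6"]), "L4": add(B2["K7"], B2["K8"])}

# Second scheme: L1, L2 form M1 and L3, L4 form M2.
M = {"M1": add(L["L1"], L["L2"]), "M2": add(L["L3"], L["L4"])}

# Third scheme: M1, M2 form the integral index N1.
N1 = add(M["M1"], M["M2"])

# Fourth scheme: L1..L4 form the integral index N2 directly.
N2 = add(L["L1"], L["L2"], L["L3"], L["L4"])

print(L)       # row C2: L1 [1, 1, 2], L2 [0, 0, 4], L3 [1, 2, 1], L4 [0, 1, 3]
print(M)       # row D2: M1 [1, 1, 6], M2 [1, 3, 4]
print(N1, N2)  # rows E2 and F2: [2, 4, 10] [2, 4, 10]
```

Since every stage only adds multiplicities, the third and fourth schemes necessarily give the same integral index, and the cardinality 16 is preserved at every level.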
Table 4
Matrices ‘Objects–Attributes’: H2 (the second aggregation scheme), H3 (the third aggregation scheme), H4 (the fourth aggregation scheme)

| O\U | u10 | u11 | u12 | u20 | u21 | u22 | O\Z | z10 | z11 | z12 | O\Z | z20 | z21 | z22 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| D1 | 8 | 0 | 0 | 8 | 0 | 0 | E1 | 16 | 0 | 0 | F1 | 16 | 0 | 0 |
| D2 | 1 | 1 | 6 | 1 | 3 | 4 | E2 | 2 | 4 | 10 | F2 | 2 | 4 | 10 |
| D3 | 0 | 2 | 6 | 3 | 1 | 4 | E3 | 3 | 3 | 10 | F3 | 3 | 3 | 10 |
| D4 | 5 | 2 | 1 | 7 | 1 | 0 | E4 | 12 | 3 | 1 | F4 | 12 | 3 | 1 |
| D5 | 7 | 1 | 0 | 8 | 0 | 0 | E5 | 15 | 1 | 0 | F5 | 15 | 1 | 0 |
| D6 | 8 | 0 | 0 | 8 | 0 | 0 | E6 | 16 | 0 | 0 | F6 | 16 | 0 | 0 |
| D7 | 2 | 2 | 4 | 1 | 3 | 4 | E7 | 3 | 5 | 8 | F7 | 3 | 5 | 8 |
| D8 | 6 | 1 | 1 | 6 | 2 | 0 | E8 | 12 | 3 | 1 | F8 | 12 | 3 | 1 |
| D9 | 1 | 3 | 4 | 0 | 6 | 2 | E9 | 1 | 9 | 6 | F9 | 1 | 9 | 6 |
| D10 | 6 | 2 | 0 | 5 | 1 | 2 | E10 | 11 | 3 | 2 | F10 | 11 | 3 | 2 |

The indicators can also be aggregated in another way. For example, the attributes K1, Mathematics, K2, Physics, K3, Chemistry, K4, Biology form a composite indicator M3, Natural disciplines: M3 = (K1, K2, K3, K4). The attributes K5, Geography, K6, History, K7, Literature, K8, Foreign language form a composite indicator M4, Humanitarian disciplines: M4 = (K5, K6, K7, K8). The composite indicators M3 and M4 can either be considered as final indicators, or combined further into an integral index N3, Academic score: N3 = (M3, M4). Other options for aggregating indicators are also possible. When forming aggregation schemes, it is advisable to combine the initial attributes into a composite indicator so that it is meaningful and the gradations of its scale consist of a small number of combinations of the initial gradations.

So, in the transition from the initial data to aggregated indicators, the dimension of the transformed spaces decreases sequentially from 40 to 24, 12, 6, 3, but the total number of estimates in all studied subjects, which is expressed by the cardinality of the multisets Ap (17), Bp (18), Cp (19), Dp (20), Ep (21), Fp (22), does not change. We can assume that the five constructed schemes for aggregating characteristics are the judgments of five independent experts.
In this case, any multi-criteria choice problem becomes a collective choice problem, which is solved in several reduced attribute spaces using several different methods in each space. This ensures greater validity of the final results.
Let us illustrate the suggested technique by considering the rankings of the objects O1,...,O10 obtained with the multi-method technology PAKS-M (Progressive Aggregation of the Classified Situations by many Methods) for multi-criteria choice in an attribute space of large dimension [10]. First, for each aggregation scheme, we built collective rankings of the objects using three methods of group choice: ARAMIS, weighted sum of estimates, and lexicographic ordering [7, 8].
The ARAMIS (Aggregation and Ranking of Alternatives close to Multi-attribute Ideal Situations) method ranks multi-attribute objects, evaluated by several experts on many quantitative and/or qualitative criteria K1,…,Kn, without building individual rankings of the objects. The objects are ordered in the Petrovsky metric space of multisets by the value of the indicator l(Op) of the relative proximity of the object Op to the best (possibly hypothetical) object O+, to which all experts gave the highest grades on all criteria.
The method of weighted sums of estimates ranks multi-attribute objects by the values of their value functions. The value of the object Op is given by the sum s(Op) of the products of the number of estimates at each scale gradation and the weight of that gradation. In the example considered, the high gradation was assigned the weight 3, the middle gradation the weight 2, and the low gradation the weight 1.
The method of lexicographic ordering ranks multi-attribute objects by the total number of corresponding estimate gradations.
The place p(Op) of the object Op in the ranking is determined first by the number of high marks; then, if several objects have the same number of high marks, by the number of middle marks; then, if several objects also have the same number of middle marks, by the number of low marks, and so on.
For all five schemes for aggregating characteristics, the results of data processing by each of the above methods turned out to be the same. They are shown in Table 3. In other words, the judgments of all five independent experts coincided, whichever method of choice was used. This is a consequence of the additivity of rules (10) and (16) for the transformation of attribute scales. The collective rankings of objects obtained under any of the five schemes by the ARAMIS method (RΑgr), the weighted sum of estimates (RΣgr), and lexicographic ordering (RΛgr) are as follows:
RΑgr ⇔ [O1, O6 ≻ O5] ≻ [O4, O8 ≻ O10] ≻ [(O9 ≻ O7) ≻ O3 ≻ O2],
RΣgr ⇔ [O1, O6 ≻ O5] ≻ [O4, O8 ≻ O10] ≻ [O7, O9 ≻ (O3 ≻ O2)],
RΛgr ⇔ [O1, O6 ≻ O5] ≻ [O4, O8 ≻ O10] ≻ [O7 ≻ O3 ≻ O2 ≻ O9].
The rankings of objects produced by ARAMIS, the weighted sum of estimates, and lexicographic ordering can also be considered as the judgments of three other experts. We combine the opinions of these new experts using the Borda voting procedure [7], according to which the order of objects in the final group ordering is given by the sum b(Op) of Borda points (Table 3). The generalized group ranking of objects, which combines the rankings RΑgr, RΣgr, RΛgr, has the form:
RΒgr ⇔ [O1, O6 ≻ O5] ≻ [O4, O8 ≻ O10] ≻ [O7 ≻ (O9 ≻ O3) ≻ O2].
Close objects are enclosed in round brackets; groups of distant objects are enclosed in square brackets.
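The weighted-sum and lexicographic rankings, and their combination by the Borda procedure, can be recomputed from the multiplicities in the matrix H3 (Table 4). The sketch below is illustrative only. The ARAMIS ranking is copied from the text, because the indicator l(Op) is not defined in this section. The Borda scoring rule used here (the number of objects minus the average position of a tier of tied objects) is an assumption, so the point values may differ from those in Table 3. The function names `tiers` and `borda_points` are hypothetical.

```python
# Multiplicities (high, middle, low marks) of each multiset E_p,
# copied from the matrix H3 in Table 4.
H3 = {
    "O1": (16, 0, 0), "O2": (2, 4, 10), "O3": (3, 3, 10), "O4": (12, 3, 1),
    "O5": (15, 1, 0), "O6": (16, 0, 0), "O7": (3, 5, 8), "O8": (12, 3, 1),
    "O9": (1, 9, 6), "O10": (11, 3, 2),
}
WEIGHTS = (3, 2, 1)  # weights of the high, middle, low gradations (from the text)


def tiers(scores):
    """Order objects by descending score; objects with equal scores share a tier."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    result = []
    for obj in ordered:
        if result and scores[result[-1][0]] == scores[obj]:
            result[-1].append(obj)
        else:
            result.append([obj])
    return result


def borda_points(ranking):
    """Assumed scoring rule: n minus the average position of the object's tier."""
    n = sum(len(tier) for tier in ranking)
    points, position = {}, 1
    for tier in ranking:
        average = position + (len(tier) - 1) / 2
        for obj in tier:
            points[obj] = n - average
        position += len(tier)
    return points


# Weighted sum s(Op): number of marks at each gradation times its weight.
r_sum = tiers({o: sum(w * k for w, k in zip(WEIGHTS, m)) for o, m in H3.items()})
# Lexicographic ordering: compare high marks first, then middle marks.
r_lex = tiers({o: m[:2] for o, m in H3.items()})
# ARAMIS ranking taken verbatim from the text (not recomputed here).
r_aramis = [["O1", "O6"], ["O5"], ["O4", "O8"], ["O10"],
            ["O9"], ["O7"], ["O3"], ["O2"]]

per_method = [borda_points(r) for r in (r_aramis, r_sum, r_lex)]
totals = {o: sum(points[o] for points in per_method) for o in H3}
r_borda = tiers(totals)

for name, ranking in (("sum", r_sum), ("lex", r_lex), ("borda", r_borda)):
    print(name, ranking)
```

Under these assumptions the sketch reproduces the order of objects in the weighted-sum, lexicographic, and Borda rankings given above, including the ties between O1 and O6 and between O4 and O8. It does not reproduce the grouping of close and distant objects into round and square brackets.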
Thus, the final orderings of objects obtained in different ways, or, equivalently, the collective preferences of many different groups of experts (several versions of objects, schemes for aggregating attributes, methods for choosing objects), almost completely coincide, except for small differences in the placement of the objects in the last group. All the rankings contain identical groups of “good objects” O1, O6, O5 with high marks and “medium objects” O4, O8, O10 with middle marks, and almost identical groups of “bad objects” O7, O9, O3 with low marks. According to the aggregated estimates of all experts on all attributes, the best objects are O1 and O6, which take the first places in all rankings. The worst object is O2, which takes the last place in three rankings and the penultimate place in one ranking. The gaps between the groups of “good objects”, “medium objects” and “bad objects” are clearly expressed. Therefore, group orderings of objects can also be considered as group ordinal classifications, where the classes of objects and the places of objects within the classes are given by the corresponding rankings. Exactly the same rankings of the objects O1,...,O10 were obtained by another of the author’s methods for reducing the dimension of attribute space [10].
In practical problems of multi-criteria choice, a DM/expert may encounter inconsistent and contradictory results. Such situations can have various causes, in particular, a purely formal combination of attributes, a poor choice of gradations on the scales of the composite criteria and the integral index, or weak semantic relationships between the attributes and indicators.

6. Conclusions

Solving multi-criteria choice problems in reduced attribute spaces significantly reduces the labor of a decision maker/expert and makes it possible to explain the choice made in substantive terms.
The new method SOCRATES for reducing the dimension of the attribute space is fairly universal, as it makes it possible to operate simultaneously with both verbal (qualitative) and numerical (quantitative) data. An attractive feature of the method is that it can be used in combination with various decision-making methods and information processing technologies. Above all, the initially available information is neither distorted nor lost.
The proposed method SOCRATES is easily integrated into the original multi-stage technology PAKS (Progressive Aggregation of the Classified Situations) and the multi-method technology PAKS-M (Progressive Aggregation of the Classified Situations by many Methods) [10] for solving problems of multi-criteria choice in large-dimensional spaces. These technologies provide greater validity in choosing the most preferable object and have the following important features. They form several schemes with different options for aggregating attributes, in which the gradations of the scale of a composite indicator are presented as combinations of grades of the original attributes. The problem considered is solved by several methods of multi-criteria choice. An understandable explanation of the obtained results helps a DM/expert to find the most suitable scheme for aggregating attributes or to apply several schemes together.
Technologies for solving multi-criteria choice problems in large-dimensional spaces have been used to evaluate the results of scientific research, rate organizations by the effectiveness of their activities, and select a prospective personal computing complex [10]. The use of the new SOCRATES method can substantially reduce the complexity and time of solving similar practical problems.

Acknowledgements

This work was supported by the Russian Foundation for Basic Research (projects 17-29-07021, 18-07-00132, 18-07-00280, 19-29-01047).

References

[1] S.A. Ayvazyan, V.M. Bukhshtaber, I.S. Enyukov, L.D. Meshalkin, Applied Statistics.
Classification and Reduction of Dimension, Finansy i statistika, Moscow, 1989, in Russian.
[2] V.A. Glotov, V.V. Pavel’yev, Vector Stratification, Nauka, Moscow, 1984, in Russian.
[3] J.A. Hartigan, Clustering Algorithms, Wiley, New York, NY, 1975.
[4] D. Kahneman, P. Slovic, A. Tversky, Decision-Making in Uncertainty: Heuristics and Biases, Cambridge University Press, Cambridge, 1982.
[5] O.I. Larichev, Verbal Decision Analysis, Nauka, Moscow, 2006, in Russian.
[6] V.D. Noghin, Reduction of the Pareto Set. An Axiomatic Approach, volume 126 of Studies in Systems, Decision and Control, Springer International Publishing, 2018. doi:10.1007/978-3-319-67873-3.
[7] A.B. Petrovsky, Decision Making Theory, Publishing Center “Academiya”, Moscow, 2009, in Russian.
[8] A.B. Petrovsky, Group verbal decision analysis, in: F. Adam, P. Humphreys (Eds.), Encyclopedia of Decision Making and Decision Support Technologies, IGI Global, Hershey, PA, 2008, pp. 418-425.
[9] A.B. Petrovsky, Indicators of similarity and differences of multi-attribute objects in the metric spaces of sets and multisets, Scientific and Technical Information Processing, 45(5) (2018) 331-345. doi:10.3103/S0147688218050052.
[10] A.B. Petrovsky, Group Verbal Decision Analysis, Nauka, Moscow, 2019, in Russian.
[11] H. Samet, Foundations of Multidimensional and Metric Data Structures, Elsevier, Boston, 2006.