Distances Parameterized by Size: Models and Adaptation Techniques Archil Maysuradze1[0000−0002−1757−556X] and Daria Petrenko1[0000−0003−4306−9035] Lomonosov Moscow State University, Moscow, Russia maysuradze@cs.msu.ru daria petrenko@bk.ru Abstract. In many domains, the concept of distance is used for initial formulation and subsequent formalization of problems and solution meth- ods. However, for an adequate representation of complex situations, the traditional concept of distance is insufficient, and more expressive fam- ilies of models are required. In this paper, we propose and investigate theoretically and empirically one of the families — distances parameter- ized by size. We also introduce the generalized metric axioms as a set of natural requirements in many domains. As examples of applied domains, we can consider transport systems, in which the transportation time de- pends on the mass of the cargo, or message passing networks, in which the transfer delay depends on the length of the message. The number of combinations of pairs of object and sizes is huge, so the complete de- scription of all the situations is data intensive. The problem of modelling and approximating the collected dissimilarity tensor is posed and solved in various ways. Several models of distances parameterized by size are proposed in the work. For each of the models, sufficient conditions are found on the parameters (theorems on sufficient conditions) that ensure the fulfillment of all the generalized metric axioms. To adapt each of the models, we propose a specific method of conditional optimization. The idea of methods is in iterative conditional minimization of the vari- ational upper bound for the stress function. All the proposed models and methods were implemented and tested on real data on message passing delays between processes in the Lomonosov supercomputer system. Ex- periments have shown a good quality of approximation for models with a small number of parameters (that is, a high degree of data compression), as well as comparability of losses with unconditional problem statements in which the generalized metric axioms are ignored. Keywords: Data models · Distance modeling · Data compression · Met- ric Extraction · MPI time delays 1 Introduction Nowadays there is a large number of subject areas in which a part of a modeled system by its properties resembles a transport system, and the researcher is Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). 174 interested in some conditional laboriousness of transition from object to object. For example, in case of tasks related to trucking, objects correspond to cities, and labour intensity to time or cost of cargo transportation between them. Another example is a computing cluster in which objects correspond to processes, and laboriousness to message passing time delays. One can see that for the provided examples the laboriousness of transition between objects is actually determined not only by the objects themselves. In the first example it can also depends on the mass of the transported cargo, in the second one — on the message length. Therefore, in order to formally describe such a laboriousness, the traditional definition of distance as a function of two objects is not enough. We suggest to use a function of three arguments, two of which take objects, and the third one — some conditional “cargo size”. Such a function we propose to call a distance parameterized by size. Note that the majority of normally functioning transport systems meet a natural set of requirements connecting various objects, sizes and distances. For example, speaking about the task of trucking, in the case of one particular cargo transportation, metric axioms become a mathematical formalization of such re- quirements. The transportation time of any cargo is non-negative, and if the cities of departure and arrival coincide, then the transfer time equals to zero regardless of the cargo size. For any pair of cities the time of cargo transporta- tion between them does not depend on which of the cities was a starting point and which was a final one. While trucking it is disadvantageous to leave the shortest route to visit an additional object that doesn’t require to be visited. The requirements get more complicated when the goods of different sized are transported. A heavy cargo takes more time to transfer, and transportation of any cargo as a whole takes no more time than sequentially in parts. Non-compliance of a transport system with any of the described require- ments may indicate the presence of malfunctions or potential opportunities of routing improvement. Therefore, it is important to be able to verify the fulfill- ment of the specified requirements and to estimate how much the system differs from the “correct” one. Accordingly, the formalization of concept of distance paramererized by size itself gives us a tool for quantitative measurement of sys- tem quality. While collecting initial information on transport systems almost always a large amount of data appears. In its original form, it is hard to store and analyze. Therefore, in this paper we propose and investigate approaches to modeling such systems, in which the initial information is significantly “compressed” by approximation by interpreted models of distances parameterized by size. The problems of modeling distances by various formal systems in the liter- ature are sometimes called distance realization problems [2]. In particular, the problems of approximating data on differences in continuous spaces belong to a class of multidimensional scaling problems [1, 3, 4]. The works on multidimen- sional scaling also describe situations when for the same pair of objects there exist several distance values(e.g. individual differences models from [3], three- way MDS models from [1]). An example is obtaining information on objects 175 similarity from several independent experts. However, the existing works do not assume the presence of any relations or operations on a set of quantities describ- ing various distances for a fixed pair of objects. This paper substantially uses the fact that the set of sizes has an order relation and an addition operation. Therefore, the proposed approach to distance modeling is fundamentally new. The rest of this paper is structured as follows. In section 2 we provide a formal definition of distance parameterized by size and some of its expansions and contractions. In sections 3 and 4 we describe specific models of distances parameterized by size and methods of their adaptation. In section 4 we describe our experimental setup and provide empirical results. 2 Basic Definitions and Notations Let us introduce a formal definition of distance parameterized by size. Let X be an arbitrary set, S — a partially ordered set with an addition operation. Definition 1. Distance parameterized by size is a function ρ(x1 , x2 , s) : X × X × S → R that satisfies the following system of axioms: – distance axioms ∀x1 , x2 ∈ X, ∀s ∈ S D1. ρ(x1 , x1 , s) = 0 (reflexivity) D2. ρ(x1 , x2 , s) = ρ(x2 , x1 , s) (symmetry) D3. ρ(x1 , x2 , s) ≥ 0 (non-negativity) – size axioms ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S S1. s1 ≤ s2 =⇒ ρ(x1 , x2 , s1 ) ≤ ρ(x1 , x2 , s2 ) (monotonicity) S2. ρ(x1 , x2 , s1 + s2 ) ≤ ρ(x1 , x2 , s1 ) + ρ(x1 , x2 , s2 ) (indivisibility) Similar to normal distances, the above definition of distances parameterized by size can be expanded by introduction of additional axioms. Definition 2. A function ρ : X × X × S → R is called a pseudometric param- eterized by size if it satisfies the definition 1 and ∀x1 , x2 , x3 ∈ X, ∀s ∈ S D4. ρ(x1 , x2 , s) ≤ ρ(x1 , x3 , s) + ρ(x2 , x3 , s) (triangle inequality) Definition 3. A function ρ : X × X × S → R is called a metric parameterized by size if it satisfies the definition 2 and ∀x1 , x2 ∈ X, ∀s ∈ S D5. ρ(x1 , x2 , s) = 0 ⇔ x1 = x2 (identity of indistinguishable) Due to measurement errors or peculiarities of systems functioning, the real collected data often do not satisfy some of the conditions described. The func- tions with a relaxed set of requirements in the case of normal distances are often referred to as dissimilarities. Therefore, in some cases in this paper we will talk about dissimilarities parameterized by size. Definition 4. A function ρ : X × X × S → R is called a dissimilarity parame- terized by size it satisfies the axioms of reflexitivity and non-negativity. 176 Let us call a function f : S → R subadditive if it satisfies ∀s1 , s2 ∈ S f (s1 + s2 ) ≤ f (s1 ) + f (s2 ) The set of natural numbers 1, . . . , N is denoted qP by [1, N ]. For the euclidean K 2 metric we will use the notation eucl(x, y) = k=1 (xk − yk ) . Let a finite set of objects of cardinality N and a finite set of sizes of cardinality K be given. Further, for simplicity, we will assume that objects are numbered from 1 to N and sizes are proportional to their indices. Let ∆ ∈ RN ×N ×K be a tensor such that δijk contains a measured initial difference between the i-th and the j-th objects parameterized by size with index k. 3 Proportional Distances Model Let us define a proportional distances model and prove that it satisfies all the axioms of distance parameterized by size. Then a method of its adaptation is described. 3.1 Model Definition Let S be a partially ordered set of sizes with an addition operation. Let X be a set with the defined distance dist. Let r be a function from sizes to real numbers. For x1 , x2 ∈ X, s ∈ S consider the function ρ(x1 , x2 , s) = r(s) dist(x1 , x2 ) (1) Theorem 1. If dist is a distance, and function r(s) is non-negative, monotonous and subadditive, i.e the following is satisfied ∀s ∈ S r(s) ≥ 0, (2) ∀s1 , s2 ∈ S s1 ≤ s2 ⇒ r(s1 ) ≤ r(s2 ), (3) ∀s1 , s2 ∈ S r(s1 + s2 ) ≤ r(s1 ) + r(s2 ), (4) then the function (1) satisfies all the axioms of distance parameterized by size. If dist is a pseudometric then ρ is a pseudometric parameterized by size. If dist is a metric and additionally r is positive then ρ is a metric parameterized by size. Proof. In [7] it is shown that multiplication of a distance (pseudometric) by a non-negative value guarantees the preservation of distance (pseudometric) ax- ioms. Multiplication of a metric by a positive value guarantees the preservation of metric axioms. Let us show the fulfillment of the monotonicity axiom (S1). Due to (3) and non-negativity of dist, for any pair of objects x1 , x2 ∈ X provided s1 , s2 ∈ S, s1 ≤ s2 the following is fulfilled ρ(x1 , x2 , s2 ) − ρ(x1 , x2 , s1 ) = r(s2 ) dist(x1 , x2 ) − r(s1 ) dist(x1 , x2 ) = = (r(s2 ) − r(s1 )) dist(x1 , x2 ) ≥ 0 177 Let us show the fulfillment of the indivisibility axiom (S2). Due to (4) and non-negativity of dist ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S ρ(x1 , x2 , s1 ) + ρ(x1 , x2 , s2 ) = r(s1 ) dist(x1 , x2 ) + r(s2 ) dist(x1 , x2 ) = (r(s1 ) + r(s2 )) dist(x1 , x2 ) ≥ r(s1 + s2 ) dist(x1 , x2 ) = ρ(x, y, s1 + s2 ) 3.2 Adaptation Method The proportional distances model with dist = eucl is considered. Let us look for the solution in the following form: let us build a uniform “group” configuration X = [x1 , . . . , xN ]T in a space of a given dimension L and a weight vector r ∈ RK . In order to ensure the fulfillment of the axioms of distance parameterized by size, the vector r can be required to satisfy the sufficient conditions of the theorem 1. That is, for the case of sizes being proportional to indices of the input tensor, it is suffice to require rk ≥ 0 ∀k ∈ [1, K] (5) rk2 − rk1 ≥ 0 ∀k1 ≤ k2 ∈ [1, K] (6) rk1 + rk2 − rk1 +k2 ≥ 0 ∀k1 , k2 ∈ [1, K], k1 + k2 ≤ K (7) Let W be the predefined objects weights. Let the optimal solution be the one that minimizes the stress function X n X s X n S(r, X) = wijk (rk eucl(xi , xj ) − δijk )2 (8) k=1 i=1 j=1 and satisfies the conditions (5) — (7). We will optimize using a coordinate descent method with respect to X and r, minimizing with respect to one group of variables while fixing the values of another one. Let r[t] , X [t] be the values of parameters at the t-th step of iterative process. Let us fix the value of X. Taking into account the requirements (5) — (7), we get a problem of minimization of a quadratic function with respect to r with linear constraints. The optimal values of parameters can be found using the appropriate methods of constrained optimization. E.g. the active set method can be used [8]. It should be noted that with the change in X only the target function changes in the problem with respect to r, and the constraints remain unchanged. Therefore, the previous approximation r[t] remains in the feasible region after the iteration over X and can be used as a new initial approximation during the next step. For a fixed r we will optimize with respect to X using the SMACOF algorithm [3]. The stress function can be expressed as S(r, X) = const(X, r) + tr(X T V X) − 2 tr(X T B(X)X) (9) 178 n X X n s X rk2 wijk Aij assuming Aij = (ei −ej )(ei −ej )T (10)  where V = i=1 j=1 k=1  Ps N X N  k=1 rk wijk δijk , eucl(x , x ) ̸= 0 i j X B(X) = sij Aij , sij (X) = eucl(xi , xj ) 0, eucl(xi , xj ) = 0  i=1 j=1 (11) The variational upper bound for it is the function T (r, X, Y ) = const(X, r) + tr(X T V X) − 2 tr(X T B(Y )Y ), (12) The variational upper bound can be minimized as follows: step 1: Y [t+1] : T (r, Y [t+1] , X [t] ) = S(r, X [t] ), that is, Y [t+1] = X [t] step 2: X [t+1] = arg minX T (r, X, Y [t+1] ) T(r, X, Y) is quadratic with respect to X, therefore, it reaches a minimum with respect to X at a single point where the derivative equals to 0. ∂T (r, X, Y ) = 2V X − 2B(Y )Y = 0 ∂X The matrix V is degenerate as the sum of its elements in each row and in each column equals to 0. We will use the Moore-Penrose inversion. Let V + be a pseudoinverse of V . Then X̂ = V + B(Y )Y (13) Combining the two steps, the next approximation of X can be obtained by the equation: X [t+1] = V + B(X [t] )X [t] (14) 4 Individual Differences Model Let us, similarly, give a definition of individual differences model and prove that it satisfies the axioms of distance parameterized by size. We also provide a method of its adaptation. 179 4.1 Model Definition Now let X be a set with defined distance dist such that its elements are vectors of length L. Let S be a partially ordered set of sizes with an addition operation. Let r : S → RL be a function from sizes to vectors of real numbers of length L, and for all elements of X an operation of multiplication by real matrix ∈ RL×L is defined. Then ∀x1 , x2 ∈ X, ∀s ∈ S consider a function   ρ(x1 , x2 , s) = dist x1 × diag(r(s)), x2 × diag(r(s)) (15) One may state that if dist is a distance (a pseudometric), then for all r function ρ satisfies the axioms of distance (pseudometric). Multiplication of a vector by a diagonal matrix is equivalent to multiplication of its i-th component by the i-th diagonal element ∀i ∈ [1, L]. Such a transformation converts different elements of X to equal (with zero components corresponding to ri (s) = 0) or different elements. If dist is a metric, then if ri (s) ̸= 0 ∀s ∈ S, ∀i ∈ [1, L], then ρ satisfies the metric axioms. Multiplication of each component of a vector by a non-zero scalar converts different vectors to different ones. The constraints that should be imposed on a function r to ensure the fulfill- ment of size axioms depend on a function dist. Let us consider a special case. Theorem 2. If in individual differences model dist = eucl, and r is non-negative, monotonous and subadditive, i.e it satisfies ∀s1 , s2 ∈ S r(s1 ) ≥ 0 (16) s1 ≤ s2 ⇒ r(s1 ) ≤ r(s2 ) (17) r(s1 + s2 ) ≤ r(s1 ) + r(s2 ) (18) then ρ satisfies the size axioms (S1) − (S2). Proof. Let us show the fulfillment of the monotonicity axiom (S1). Let x1 , x2 ∈ X, s1 , s2 ∈ S be arbitrary elements. It is true that s1 ≤ s2 ⇒ {(17)} ⇒ r(s1 ) ≤ r(s2 ) ⇒ {(16)} ⇒ r2 (s1 ) ≤ r2 (s2 ) (19) Since the function eucl is non-negative, it is also true that   ρ(x1 , x2 , s1 ) ≤ ρ(x1 , x2 , s2 ) ⇔ eucl x1 × r(s1 ), x2 × r(s1 ) ≤   ≤ eucl x1 × r(s2 ), x2 × r(s2 ) ⇔ ρ2 (x1 , x2 , s1 ) ≤ ρ2 (x1 , x2 , s2 ) (20) 180 Then ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S, s1 ≤ s2 L L X 2 X 2 ρ2 (x1 , x2 , s1 )−ρ2 (x1 , x2 , s2 ) = rl2 (s1 ) x1l −x2l − rl2 (s2 ) x1l −x2l = l=1 l=1 L X 2 rl2 (s1 ) − rl2 (s2 ) x1l − x2l ≤ {(19)} ≤ 0  = l=1 And according to (20) it is true that ρ2 (x1 , x2 , s1 ) − ρ2 (x1 , x2 , s2 ) ≤ 0 ⇔ ρ(x1 , x2 , s1 ) − ρ(x1 , x2 , s2 ) ≤ 0 Now let us show the fulfillment of the axiom (S2). ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S L 2 X 2 ρ (x1 , x2 , s1 + s2 ) = rl2 (s1 + s2 ) x1l − x2l ≤ {(16), (18)} ≤ l=1 L L X 2 2 X 2 rl2 (s1 ) + rl2 (s2 ) x1l − x2l +  ≤ rl (s1 ) + rl (s2 ) x1l − x2l = l=1 l=1 L L X 2 X 2 rl2 (s1 ) + rl2 (s2 ) x1l − x2l  +2 rl (s1 )rl (s2 ) x1l − x2l ≤ {(16)} ≤ l=1 l=1 Since both expressions are non-negative, it is true that L X 2 ρ2 (x1 , x2 , s1 + s2 ) ≤ rl2 (s1 ) + rl2 (s2 ) x1l − x2l ⇔  l=1 v u L uX  2 ⇔ ρ(x1 , x2 , s1 + s2 ) ≤ t rl2 (s1 ) + rl2 (s2 ) x1l − x2l l=1 Since the square root of a sum of non-negative elements does not exceed the sum of roots of these elements, then v u L uX  2 ρ(x1 , x2 , s1 + s2 ) ≤ t rl2 (s1 ) + rl2 (s2 ) x1l − x2l ≤ l=1 v v u L u L 2 2 uX  uX ≤ t 2 rl (s1 ) x1l − x2l + t rl2 (s2 ) x1l − x2l = l=1 l=1 = ρ(x1 , x2 , s1 ) + ρ(x1 , x2 , s2 ) 4.2 Adaptation Method The individual differences model with dist = eucl is considered. Let us look for the solution in the following form: let us build a uniform “group” configuration 181 X = [x1 , . . . , xN ]T in a space of a given dimension L, and a set of weight vectors r = [r1 , . . . , rK ]T ∈ RK×L . In order to ensure the fulfillment of the axioms of distance parameterized by size, the vectors r can be required to satisfy the suffi- cient conditions of the theorem 2. That is, for the case of sizes being proportional to indices of the input tensor, it is suffice to require rk ≥ 0 ∀k ∈ [1, K] (21) rk2 − rk1 ≥ 0 ∀k1 ≤ k2 ∈ [1, K] (22) rk1 + rk2 − rk1 +k2 ≥ 0 ∀k1 , k2 ∈ [1, K], k1 + k2 ≤ K (23) Arithmetic operations and comparisons are element-wise. We will consider the optimal solution the one that minimizes the stress func- tion K X X N X N  2  S(X, r) = wijk eucl xi × diag(rk ), xj × diag(rk ) − δijk (24) k=1 i=1 j=1 and satisfies the conditions (21) — (23). We will optimize using a modified generalization of SMACOF method for weighted euclidean model [3]. Let us denote Xk = X × diag(rk ). The stress function can be rewritten as follows: K X const(X, r) + tr(XkT Vk Xk ) − 2 tr(XkT B(Xk )Xk )  S(X, r) = k=1 Let Y = {Y1 , . . . , YK }. A variational upper bound for the stress function is K X const(X, r) + tr(XkT Vk Xk ) − 2 tr(XkT B(Yk )Yk )  T (X, Y, r) = (25) k=1 Let X k = Vk+ B(Yk )Yk . Then K X T const(X, r)+tr (Xk −X k )T Vk (Xk −X k ) −tr X k Vk X k   T (X, Y, r) = k=1 We will minimize the variational lower bound, similarly to the method of proportional distances model adaptation, in two steps. Since T (X, {X × diag(r1 ), . . . , X × diag(rK )}, r) = S(X, r), [t+1] [t] then Yk = Xk . For any fixed Y only the second term depends on X, r, and the equation is quadratic with respect to these variables. The function can be minimized with respect to one group of parameters for fixed values of others. Minimization with 182 respect to X can be performed by the least squares method, and with respect to r by one of the methods of optimization of quadratic equation with linear con- straints (e.g. by the active set method). It should be noted that the optimization with respect to r can be performed independently for each dimension. Combining the two steps, we get [t] [t] X k = Vk+ B(Xk )Xk (26) K X X [t+1] , r[t+1] = arg min tr (X×diag(rk )−X k )T Vk (X×diag(rk )−X k ) (27)  X,r k=1 5 Experiments The experiments were carried out on data on message passing delays in the Lomonosov supercomputer system. Note that the methods of collecting this in- formation themselves require significant scientific and technological efforts. One can read about the methods of collecting and primary processing of the men- tioned data in the papers [5, 6, 9]. In parallel programming using the MPI standard, the program is divided into processes that can interact with each other by exchanging messages. The information about message passing delays is useful to collect and model, as it can help to improve the efficiency of the computing system, in particular, to solve the problems of dynamic scheduling of programs execution, as well as to diagnose the communication environment. In our experiment the data was collected for 78 processes and message lengths in range from 0 to 10000 in increments of 100 bytes. For every ordered pair of processes and every message length several measurements of delays were taken, after which the median of the obtained empirical distribution was calculated. As a result, a tensor ∆ ∈ R78×78×100 of dissimilarities parameterized by size was obtained. To assess how real data correspond to the proposed axiomatic model, for each axiom, all pairs (triples) of values were selected from the input tensor, for which it is correct to check its fulfillment. Then the fraction of pairs (triplets), for which this axiom is incorrect, was calculated. The obtained values are shown in table 1. One can see that the axioms are violated only for an insignificant part of the data, which indicates that the proposed concept quite well describes real systems. Table 1. Fraction of data that does not satisfy the axioms axiom D1 D2 D3 S1 S2 fraction 0 0.181 0 0.008 0.0005 183 104 PD model constrained PD model unconstrained ID model constrained 103 ID model unconstrained normalized stress S* 10 1 102 time, s 101 100 PD model constrained PD model unconstrained ID model constrained 10 2 10 1 ID model unconstrained 1 2 3 4 5 6 7 8 9 10 15 20 1 2 3 4 5 6 7 8 9 10 15 20 parameter L parameter L Fig. 1. Normalized stress S ∗ value de- Fig. 2. Models tuning time depending on pending on parameter L parameter L value During further experiments, the proportional distances (PD) model and in- dividual differences (ID) model were adapted to data for different values of L — the solution space dimension. For comparison, these models were also tuned without regard to parameters restrictions. Under this condition they equal to the identity model and the weighted euclidean model from [1], and their adaptation methods coincide with SMACOF generalizations for these models. Figure 1 shows the final values of normalized stress function S(X, r) S ∗ (X, r) = PK PN PN (28) 2 k=1 i=1 j=1 δijk for PD and ID models. The stress values for the conditional and unconditional optimization of the models are indistinguishable, thus they look like 2 lines. Figure 2 contains the plots of the dependence of models tuning time on the values of L. It can be noticed that for the adapted models the normalized stress func- tion takes small (about 10−2 ) values, which means that the real system is close to the “correct” one. This is also evidenced by the similarity of stress values for conditional and unconditional models optimization: the requirements for the parameters do not impose significant restrictions on the quality of model adap- tation. Small values of stress are achieved for rather low values of L, that is, the input tensor allows efficient compression by the proposed models. The values of stress function for two models substantially differ only for large values of L, and the use of individual differences model can be redundant for our data. While the number of parameters and tuning time for this model is significantly larger. 184 PD model 0.05 PD model fraction of non-fulfillment of monotonicity fraction of non-fulfillment of S1 axiom ID model ID model 0.40 0.04 0.35 property for r 0.30 0.03 0.25 0.02 0.20 0.15 0.01 1 2 3 4 5 6 7 8 9 10 15 20 1 2 3 4 5 6 7 8 9 10 15 20 parameter L parameter L Fig. 3. Fraction of non-fulfillment of Fig. 4. Fraction on non-fulfillment of S1 monotonicity property for parameter r axiom depending on the value of parame- depending on parameter L value for un- ter L value for unconstrained models constrained models The results of constrained models adaptation satisfy all the axioms of dis- tance parameterized by size. For the results of unconstrained adaptation, the fulfillment of the distance axioms is guaranteed for any parameter values. For the axioms of size, we measured the fraction of the resulting tensor for which these axioms are not fulfilled. The fraction of the vector (matrix) of parame- ters r that does not satisfy the conditions of non-negativity, monotonicity and subadditivity was also measured. Non-negative values were obtained only for monotonicity constraint and S1 axiom, the plots of dependence on L are shown in figures 3 and 4. 6 Conclusion We have formally introduced the concept of ’distances parameterized by size’ and proposed several specific models of such distances. For each model we have found the sufficient conditions on the parameters that guarantee the fulfillment of all the axioms from the definition, and have provided a specific method of model adaptation. We have demonstrated empirically that the proposed models allow to approximate real data with a high degree of compression. Acknowledgements. The study was partially supported by RFBR (project no. 20-01-00664-a) and state-financed research work no. 5.1.21 of Lomonosov Moscow State University. References 1. Borg, I., Groenen, P.J.F.: Modern multidimensional scaling: Theory and applica- tions. Springer Science & Business Media (2005) 2. Chung, F., Garrett, M., Graham, R., Shallcross, D.: Distance realization problems with applications to internet tomography. Journal of Computer and System Sciences 63(3), 432–448 (2001) 185 3. Cox, T.F., Cox, M.A.: Multidimensional Scaling. New York: Chapman and Hall/CRC (2000) 4. Davison, M.L.: Multidimensional scaling: methods of data visualization. (in Rus- sian). Finance and Statistics (1988) 5. Kozlov, V., Maysuradze, A.: Separation of a mixture of three-parameter lognor- mal distributions in the analysis of communication environments. Computational Mathematics and Modeling 30(3), 311–319 (2019) 6. Maysuradze, A., Kozlov, V.: Modeling message passing delays in a computer cluster to monitor its network. In: CEUR Workshop Proceedings. pp. 93–99 (2017) 7. Maysuradze, A.I.: Homogeneous and rank bases in spaces of metric configurations (in russian). Journal of Computational Mathematics and Mathematical Physics 46(2), 344–360 (2006) 8. Nocedal, J., Wright, S.: Numerical optimization. Springer Science & Business Media (2006) 9. Salnikov, A., Begaev, A., Maysuradze, A.: Similarity mining of message passing delays in supercomputer networks based on peak and step detection. In: Russian Supercomputing Days. pp. 486–499. Springer (2020) 186