Distances Parameterized by Size: Models and
               Adaptation Techniques

               Archil Maysuradze1[0000−0002−1757−556X] and Daria
                         Petrenko1[0000−0003−4306−9035]

                Lomonosov Moscow State University, Moscow, Russia
                            maysuradze@cs.msu.ru
                            daria petrenko@bk.ru


      Abstract. In many domains, the concept of distance is used for initial
      formulation and subsequent formalization of problems and solution meth-
      ods. However, for an adequate representation of complex situations, the
      traditional concept of distance is insufficient, and more expressive fam-
      ilies of models are required. In this paper, we propose and investigate
      theoretically and empirically one of the families — distances parameter-
      ized by size. We also introduce the generalized metric axioms as a set of
      natural requirements in many domains. As examples of applied domains,
      we can consider transport systems, in which the transportation time de-
      pends on the mass of the cargo, or message passing networks, in which
      the transfer delay depends on the length of the message. The number
      of combinations of pairs of object and sizes is huge, so the complete de-
      scription of all the situations is data intensive. The problem of modelling
      and approximating the collected dissimilarity tensor is posed and solved
      in various ways. Several models of distances parameterized by size are
      proposed in the work. For each of the models, sufficient conditions are
      found on the parameters (theorems on sufficient conditions) that ensure
      the fulfillment of all the generalized metric axioms. To adapt each of
      the models, we propose a specific method of conditional optimization.
      The idea of methods is in iterative conditional minimization of the vari-
      ational upper bound for the stress function. All the proposed models and
      methods were implemented and tested on real data on message passing
      delays between processes in the Lomonosov supercomputer system. Ex-
      periments have shown a good quality of approximation for models with a
      small number of parameters (that is, a high degree of data compression),
      as well as comparability of losses with unconditional problem statements
      in which the generalized metric axioms are ignored.

      Keywords: Data models · Distance modeling · Data compression · Met-
      ric Extraction · MPI time delays


1    Introduction

Nowadays there is a large number of subject areas in which a part of a modeled
system by its properties resembles a transport system, and the researcher is


Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).


                                           174
interested in some conditional laboriousness of transition from object to object.
For example, in case of tasks related to trucking, objects correspond to cities, and
labour intensity to time or cost of cargo transportation between them. Another
example is a computing cluster in which objects correspond to processes, and
laboriousness to message passing time delays.
    One can see that for the provided examples the laboriousness of transition
between objects is actually determined not only by the objects themselves. In
the first example it can also depends on the mass of the transported cargo, in the
second one — on the message length. Therefore, in order to formally describe
such a laboriousness, the traditional definition of distance as a function of two
objects is not enough. We suggest to use a function of three arguments, two of
which take objects, and the third one — some conditional “cargo size”. Such a
function we propose to call a distance parameterized by size.
    Note that the majority of normally functioning transport systems meet a
natural set of requirements connecting various objects, sizes and distances. For
example, speaking about the task of trucking, in the case of one particular cargo
transportation, metric axioms become a mathematical formalization of such re-
quirements. The transportation time of any cargo is non-negative, and if the
cities of departure and arrival coincide, then the transfer time equals to zero
regardless of the cargo size. For any pair of cities the time of cargo transporta-
tion between them does not depend on which of the cities was a starting point
and which was a final one. While trucking it is disadvantageous to leave the
shortest route to visit an additional object that doesn’t require to be visited.
The requirements get more complicated when the goods of different sized are
transported. A heavy cargo takes more time to transfer, and transportation of
any cargo as a whole takes no more time than sequentially in parts.
    Non-compliance of a transport system with any of the described require-
ments may indicate the presence of malfunctions or potential opportunities of
routing improvement. Therefore, it is important to be able to verify the fulfill-
ment of the specified requirements and to estimate how much the system differs
from the “correct” one. Accordingly, the formalization of concept of distance
paramererized by size itself gives us a tool for quantitative measurement of sys-
tem quality.
    While collecting initial information on transport systems almost always a
large amount of data appears. In its original form, it is hard to store and analyze.
Therefore, in this paper we propose and investigate approaches to modeling
such systems, in which the initial information is significantly “compressed” by
approximation by interpreted models of distances parameterized by size.
    The problems of modeling distances by various formal systems in the liter-
ature are sometimes called distance realization problems [2]. In particular, the
problems of approximating data on differences in continuous spaces belong to
a class of multidimensional scaling problems [1, 3, 4]. The works on multidimen-
sional scaling also describe situations when for the same pair of objects there
exist several distance values(e.g. individual differences models from [3], three-
way MDS models from [1]). An example is obtaining information on objects


                                       175
 similarity from several independent experts. However, the existing works do not
 assume the presence of any relations or operations on a set of quantities describ-
 ing various distances for a fixed pair of objects. This paper substantially uses
 the fact that the set of sizes has an order relation and an addition operation.
 Therefore, the proposed approach to distance modeling is fundamentally new.
    The rest of this paper is structured as follows. In section 2 we provide a
 formal definition of distance parameterized by size and some of its expansions
 and contractions. In sections 3 and 4 we describe specific models of distances
 parameterized by size and methods of their adaptation. In section 4 we describe
 our experimental setup and provide empirical results.


 2    Basic Definitions and Notations

 Let us introduce a formal definition of distance parameterized by size. Let X be
 an arbitrary set, S — a partially ordered set with an addition operation.
 Definition 1. Distance parameterized by size is a function ρ(x1 , x2 , s) : X ×
 X × S → R that satisfies the following system of axioms:

 – distance axioms ∀x1 , x2 ∈ X, ∀s ∈ S
D1. ρ(x1 , x1 , s) = 0 (reflexivity)
D2. ρ(x1 , x2 , s) = ρ(x2 , x1 , s) (symmetry)
D3. ρ(x1 , x2 , s) ≥ 0 (non-negativity)
 – size axioms ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S
S1. s1 ≤ s2 =⇒ ρ(x1 , x2 , s1 ) ≤ ρ(x1 , x2 , s2 ) (monotonicity)
S2. ρ(x1 , x2 , s1 + s2 ) ≤ ρ(x1 , x2 , s1 ) + ρ(x1 , x2 , s2 ) (indivisibility)

    Similar to normal distances, the above definition of distances parameterized
 by size can be expanded by introduction of additional axioms.

 Definition 2. A function ρ : X × X × S → R is called a pseudometric param-
 eterized by size if it satisfies the definition 1 and ∀x1 , x2 , x3 ∈ X, ∀s ∈ S

D4. ρ(x1 , x2 , s) ≤ ρ(x1 , x3 , s) + ρ(x2 , x3 , s)   (triangle inequality)

 Definition 3. A function ρ : X × X × S → R is called a metric parameterized
 by size if it satisfies the definition 2 and ∀x1 , x2 ∈ X, ∀s ∈ S

D5. ρ(x1 , x2 , s) = 0 ⇔ x1 = x2         (identity of indistinguishable)

     Due to measurement errors or peculiarities of systems functioning, the real
 collected data often do not satisfy some of the conditions described. The func-
 tions with a relaxed set of requirements in the case of normal distances are often
 referred to as dissimilarities. Therefore, in some cases in this paper we will talk
 about dissimilarities parameterized by size.

 Definition 4. A function ρ : X × X × S → R is called a dissimilarity parame-
 terized by size it satisfies the axioms of reflexitivity and non-negativity.


                                              176
      Let us call a function f : S → R subadditive if it satisfies ∀s1 , s2 ∈ S
                                 f (s1 + s2 ) ≤ f (s1 ) + f (s2 )
   The set of natural numbers 1, . . . , N is denoted
                                                qP by [1, N ]. For the euclidean
                                                     K              2
metric we will use the notation eucl(x, y) =         k=1 (xk − yk ) .
   Let a finite set of objects of cardinality N and a finite set of sizes of cardinality
K be given. Further, for simplicity, we will assume that objects are numbered
from 1 to N and sizes are proportional to their indices. Let ∆ ∈ RN ×N ×K be a
tensor such that δijk contains a measured initial difference between the i-th and
the j-th objects parameterized by size with index k.

3      Proportional Distances Model
Let us define a proportional distances model and prove that it satisfies all the
axioms of distance parameterized by size. Then a method of its adaptation is
described.

3.1     Model Definition
Let S be a partially ordered set of sizes with an addition operation. Let X be a
set with the defined distance dist. Let r be a function from sizes to real numbers.
For x1 , x2 ∈ X, s ∈ S consider the function
                                ρ(x1 , x2 , s) = r(s) dist(x1 , x2 )                           (1)
Theorem 1. If dist is a distance, and function r(s) is non-negative, monotonous
and subadditive, i.e the following is satisfied
                                        ∀s ∈ S    r(s) ≥ 0,                                    (2)
                          ∀s1 , s2 ∈ S     s1 ≤ s2 ⇒ r(s1 ) ≤ r(s2 ),                          (3)
                         ∀s1 , s2 ∈ S     r(s1 + s2 ) ≤ r(s1 ) + r(s2 ),                       (4)
then the function (1) satisfies all the axioms of distance parameterized by size. If
dist is a pseudometric then ρ is a pseudometric parameterized by size. If dist is
a metric and additionally r is positive then ρ is a metric parameterized by size.
Proof. In [7] it is shown that multiplication of a distance (pseudometric) by a
non-negative value guarantees the preservation of distance (pseudometric) ax-
ioms. Multiplication of a metric by a positive value guarantees the preservation
of metric axioms.
   Let us show the fulfillment of the monotonicity axiom (S1). Due to (3) and
non-negativity of dist, for any pair of objects x1 , x2 ∈ X provided s1 , s2 ∈
S, s1 ≤ s2 the following is fulfilled

    ρ(x1 , x2 , s2 ) − ρ(x1 , x2 , s1 ) = r(s2 ) dist(x1 , x2 ) − r(s1 ) dist(x1 , x2 ) =
                                                          = (r(s2 ) − r(s1 )) dist(x1 , x2 ) ≥ 0


                                                 177
   Let us show the fulfillment of the indivisibility axiom (S2). Due to (4) and
non-negativity of dist ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S

  ρ(x1 , x2 , s1 ) + ρ(x1 , x2 , s2 ) = r(s1 ) dist(x1 , x2 ) + r(s2 ) dist(x1 , x2 ) =
          (r(s1 ) + r(s2 )) dist(x1 , x2 ) ≥ r(s1 + s2 ) dist(x1 , x2 ) = ρ(x, y, s1 + s2 )


3.2   Adaptation Method

The proportional distances model with dist = eucl is considered. Let us look for
the solution in the following form: let us build a uniform “group” configuration
X = [x1 , . . . , xN ]T in a space of a given dimension L and a weight vector r ∈ RK .
In order to ensure the fulfillment of the axioms of distance parameterized by size,
the vector r can be required to satisfy the sufficient conditions of the theorem
1. That is, for the case of sizes being proportional to indices of the input tensor,
it is suffice to require
                                    rk ≥ 0 ∀k ∈ [1, K]                             (5)

                           rk2 − rk1 ≥ 0 ∀k1 ≤ k2 ∈ [1, K]                                (6)

              rk1 + rk2 − rk1 +k2 ≥ 0 ∀k1 , k2 ∈ [1, K], k1 + k2 ≤ K                      (7)

Let W be the predefined objects weights. Let the optimal solution be the one
that minimizes the stress function
                             X n X
                             s X n
                 S(r, X) =                 wijk (rk eucl(xi , xj ) − δijk )2              (8)
                             k=1 i=1 j=1


and satisfies the conditions (5) — (7).
     We will optimize using a coordinate descent method with respect to X and
r, minimizing with respect to one group of variables while fixing the values of
another one. Let r[t] , X [t] be the values of parameters at the t-th step of iterative
process.
     Let us fix the value of X. Taking into account the requirements (5) — (7),
we get a problem of minimization of a quadratic function with respect to r with
linear constraints. The optimal values of parameters can be found using the
appropriate methods of constrained optimization. E.g. the active set method
can be used [8]. It should be noted that with the change in X only the target
function changes in the problem with respect to r, and the constraints remain
unchanged. Therefore, the previous approximation r[t] remains in the feasible
region after the iteration over X and can be used as a new initial approximation
during the next step.
     For a fixed r we will optimize with respect to X using the SMACOF algorithm
[3]. The stress function can be expressed as

              S(r, X) = const(X, r) + tr(X T V X) − 2 tr(X T B(X)X)                       (9)


                                            178
                 n X
                 X n s
                     X
                                  rk2 wijk Aij    assuming Aij = (ei −ej )(ei −ej )T (10)
                                          
 where V =
                 i=1 j=1    k=1


                                         Ps
             N X
               N                         k=1 rk wijk δijk , eucl(x , x ) ̸= 0
                                                                   i    j
             X
  B(X) =           sij Aij , sij (X) =       eucl(xi , xj )
                                          0,                 eucl(xi , xj ) = 0
                                        
           i=1 j=1
                                                                              (11)
The variational upper bound for it is the function

             T (r, X, Y ) = const(X, r) + tr(X T V X) − 2 tr(X T B(Y )Y ),           (12)

The variational upper bound can be minimized as follows:

step 1: Y [t+1] : T (r, Y [t+1] , X [t] ) = S(r, X [t] ), that is, Y [t+1] = X [t]

step 2: X [t+1] = arg minX T (r, X, Y [t+1] )

   T(r, X, Y) is quadratic with respect to X, therefore, it reaches a minimum
with respect to X at a single point where the derivative equals to 0.

                           ∂T (r, X, Y )
                                         = 2V X − 2B(Y )Y = 0
                               ∂X
   The matrix V is degenerate as the sum of its elements in each row and in
each column equals to 0. We will use the Moore-Penrose inversion. Let V + be a
pseudoinverse of V . Then

                                       X̂ = V + B(Y )Y                               (13)

Combining the two steps, the next approximation of X can be obtained by the
equation:

                                   X [t+1] = V + B(X [t] )X [t]                      (14)


4    Individual Differences Model

Let us, similarly, give a definition of individual differences model and prove
that it satisfies the axioms of distance parameterized by size. We also provide a
method of its adaptation.


                                                 179
4.1   Model Definition

Now let X be a set with defined distance dist such that its elements are vectors
of length L. Let S be a partially ordered set of sizes with an addition operation.
Let r : S → RL be a function from sizes to vectors of real numbers of length L,
and for all elements of X an operation of multiplication by real matrix ∈ RL×L
is defined. Then ∀x1 , x2 ∈ X, ∀s ∈ S consider a function
                                                                    
               ρ(x1 , x2 , s) = dist x1 × diag(r(s)), x2 × diag(r(s))         (15)

    One may state that if dist is a distance (a pseudometric), then for all r
function ρ satisfies the axioms of distance (pseudometric). Multiplication of a
vector by a diagonal matrix is equivalent to multiplication of its i-th component
by the i-th diagonal element ∀i ∈ [1, L]. Such a transformation converts different
elements of X to equal (with zero components corresponding to ri (s) = 0) or
different elements.
    If dist is a metric, then if

                            ri (s) ̸= 0 ∀s ∈ S, ∀i ∈ [1, L],

then ρ satisfies the metric axioms. Multiplication of each component of a vector
by a non-zero scalar converts different vectors to different ones.
   The constraints that should be imposed on a function r to ensure the fulfill-
ment of size axioms depend on a function dist. Let us consider a special case.

Theorem 2. If in individual differences model dist = eucl, and r is non-negative,
monotonous and subadditive, i.e it satisfies ∀s1 , s2 ∈ S

                                        r(s1 ) ≥ 0                                      (16)

                               s1 ≤ s2 ⇒ r(s1 ) ≤ r(s2 )                                (17)

                              r(s1 + s2 ) ≤ r(s1 ) + r(s2 )                             (18)
then ρ satisfies the size axioms (S1) − (S2).

Proof. Let us show the fulfillment of the monotonicity axiom (S1). Let x1 , x2 ∈
X, s1 , s2 ∈ S be arbitrary elements. It is true that

         s1 ≤ s2 ⇒ {(17)} ⇒ r(s1 ) ≤ r(s2 ) ⇒ {(16)} ⇒ r2 (s1 ) ≤ r2 (s2 )              (19)

Since the function eucl is non-negative, it is also true that
                                                                    
  ρ(x1 , x2 , s1 ) ≤ ρ(x1 , x2 , s2 ) ⇔ eucl x1 × r(s1 ), x2 × r(s1 ) ≤
                                             
            ≤ eucl x1 × r(s2 ), x2 × r(s2 ) ⇔ ρ2 (x1 , x2 , s1 ) ≤ ρ2 (x1 , x2 , s2 )   (20)


                                           180
Then ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S, s1 ≤ s2
                                                  L                                   L
                                                  X                          2       X                          2
  ρ2 (x1 , x2 , s1 )−ρ2 (x1 , x2 , s2 ) =               rl2 (s1 ) x1l −x2l        −         rl2 (s2 ) x1l −x2l        =
                                                  l=1                                 l=1
                                              L
                                              X                                    2
                                                    rl2 (s1 ) − rl2 (s2 ) x1l − x2l ≤ {(19)} ≤ 0
                                                                         
                                          =
                                              l=1

And according to (20) it is true that
          ρ2 (x1 , x2 , s1 ) − ρ2 (x1 , x2 , s2 ) ≤ 0 ⇔ ρ(x1 , x2 , s1 ) − ρ(x1 , x2 , s2 ) ≤ 0
Now let us show the fulfillment of the axiom (S2). ∀x1 , x2 ∈ X, ∀s1 , s2 ∈ S
                                 L
      2
                                 X                                    2
  ρ (x1 , x2 , s1 + s2 ) =             rl2 (s1 + s2 ) x1l − x2l            ≤ {(16), (18)} ≤
                                 l=1
          L                                       L
          X                          2        2 X                                2
                                                    rl2 (s1 ) + rl2 (s2 ) x1l − x2l +
                                                                         
      ≤           rl (s1 ) + rl (s2 ) x1l − x2l =
          l=1                                                  l=1
          L                                                      L
          X                                  2                  X                                         2
                                                                           rl2 (s1 ) + rl2 (s2 ) x1l − x2l
                                                                                                
 +2             rl (s1 )rl (s2 ) x1l − x2l        ≤ {(16)} ≤
          l=1                                                    l=1

Since both expressions are non-negative, it is true that
                                 L
                                 X                                      2
  ρ2 (x1 , x2 , s1 + s2 ) ≤              rl2 (s1 ) + rl2 (s2 ) x1l − x2l ⇔
                                                              

                                 l=1
                                                       v
                                                       u L
                                                       uX                                 2
                             ⇔ ρ(x1 , x2 , s1 + s2 ) ≤ t   rl2 (s1 ) + rl2 (s2 ) x1l − x2l
                                                                 l=1

Since the square root of a sum of non-negative elements does not exceed the sum
of roots of these elements, then
                           v
                           u L
                           uX                                2
   ρ(x1 , x2 , s1 + s2 ) ≤ t   rl2 (s1 ) + rl2 (s2 ) x1l − x2l ≤
                                   l=1
                     v                        v
                     u L                      u L
                                            2                        2
                     uX                      uX
                   ≤ t    2
                         rl (s1 ) x1l − x2l + t   rl2 (s2 ) x1l − x2l =
                          l=1                                   l=1

                                                                       = ρ(x1 , x2 , s1 ) + ρ(x1 , x2 , s2 )

4.2       Adaptation Method
The individual differences model with dist = eucl is considered. Let us look for
the solution in the following form: let us build a uniform “group” configuration


                                                         181
X = [x1 , . . . , xN ]T in a space of a given dimension L, and a set of weight vectors
r = [r1 , . . . , rK ]T ∈ RK×L . In order to ensure the fulfillment of the axioms of
distance parameterized by size, the vectors r can be required to satisfy the suffi-
cient conditions of the theorem 2. That is, for the case of sizes being proportional
to indices of the input tensor, it is suffice to require

                                     rk ≥ 0 ∀k ∈ [1, K]                             (21)

                              rk2 − rk1 ≥ 0 ∀k1 ≤ k2 ∈ [1, K]                       (22)
               rk1 + rk2 − rk1 +k2 ≥ 0 ∀k1 , k2 ∈ [1, K], k1 + k2 ≤ K               (23)
Arithmetic operations and comparisons are element-wise.
    We will consider the optimal solution the one that minimizes the stress func-
tion
               K X
               X N X
                   N                                                          2
                                                                       
  S(X, r) =                    wijk eucl xi × diag(rk ), xj × diag(rk ) − δijk      (24)
               k=1 i=1 j=1

and satisfies the conditions (21) — (23).
    We will optimize using a modified generalization of SMACOF method for
weighted euclidean model [3].
    Let us denote Xk = X × diag(rk ). The stress function can be rewritten as
follows:
                        K
                        X
                              const(X, r) + tr(XkT Vk Xk ) − 2 tr(XkT B(Xk )Xk )
                                                                                
        S(X, r) =
                        k=1

   Let Y = {Y1 , . . . , YK }. A variational upper bound for the stress function is
                     K
                     X
                              const(X, r) + tr(XkT Vk Xk ) − 2 tr(XkT B(Yk )Yk )
                                                                                
     T (X, Y, r) =                                                                  (25)
                     k=1

Let X k = Vk+ B(Yk )Yk . Then

                  K
                  X                                                      T
                          const(X, r)+tr (Xk −X k )T Vk (Xk −X k ) −tr X k Vk X k
                                                                                 
  T (X, Y, r) =
                  k=1

   We will minimize the variational lower bound, similarly to the method of
proportional distances model adaptation, in two steps. Since

               T (X, {X × diag(r1 ), . . . , X × diag(rK )}, r) = S(X, r),
       [t+1]      [t]
then Yk     = Xk .
   For any fixed Y only the second term depends on X, r, and the equation is
quadratic with respect to these variables. The function can be minimized with
respect to one group of parameters for fixed values of others. Minimization with


                                             182
respect to X can be performed by the least squares method, and with respect to
r by one of the methods of optimization of quadratic equation with linear con-
straints (e.g. by the active set method). It should be noted that the optimization
with respect to r can be performed independently for each dimension.
    Combining the two steps, we get
                                                    [t]   [t]
                                    X k = Vk+ B(Xk )Xk                               (26)

                              K
                              X
 X [t+1] , r[t+1] = arg min         tr (X×diag(rk )−X k )T Vk (X×diag(rk )−X k ) (27)
                                                                                
                     X,r
                              k=1


5   Experiments

The experiments were carried out on data on message passing delays in the
Lomonosov supercomputer system. Note that the methods of collecting this in-
formation themselves require significant scientific and technological efforts. One
can read about the methods of collecting and primary processing of the men-
tioned data in the papers [5, 6, 9].
    In parallel programming using the MPI standard, the program is divided
into processes that can interact with each other by exchanging messages. The
information about message passing delays is useful to collect and model, as it
can help to improve the efficiency of the computing system, in particular, to
solve the problems of dynamic scheduling of programs execution, as well as to
diagnose the communication environment.
    In our experiment the data was collected for 78 processes and message lengths
in range from 0 to 10000 in increments of 100 bytes. For every ordered pair of
processes and every message length several measurements of delays were taken,
after which the median of the obtained empirical distribution was calculated. As
a result, a tensor ∆ ∈ R78×78×100 of dissimilarities parameterized by size was
obtained.
    To assess how real data correspond to the proposed axiomatic model, for
each axiom, all pairs (triples) of values were selected from the input tensor, for
which it is correct to check its fulfillment. Then the fraction of pairs (triplets),
for which this axiom is incorrect, was calculated. The obtained values are shown
in table 1. One can see that the axioms are violated only for an insignificant part
of the data, which indicates that the proposed concept quite well describes real
systems.


               Table 1. Fraction of data that does not satisfy the axioms

     axiom            D1               D2            D3          S1           S2
    fraction          0               0.181          0          0.008       0.0005


                                              183
                                                                                                      104
                                                            PD model constrained
                                                            PD model unconstrained
                                                            ID model constrained                      103
                                                            ID model unconstrained
normalized stress S*

                       10 1                                                                           102


                                                                                           time, s
                                                                                                      101

                                                                                                      100                                 PD model constrained
                                                                                                                                          PD model unconstrained
                                                                                                                                          ID model constrained
                       10 2                                                                          10 1                                 ID model unconstrained
                              1 2 3 4 5 6 7 8 9 10               15            20                           1 2 3 4 5 6 7 8 9 10               15            20
                                              parameter L                                                                   parameter L


Fig. 1. Normalized stress S ∗ value de-                                                    Fig. 2. Models tuning time depending on
pending on parameter L                                                                     parameter L value


   During further experiments, the proportional distances (PD) model and in-
dividual differences (ID) model were adapted to data for different values of L
— the solution space dimension. For comparison, these models were also tuned
without regard to parameters restrictions. Under this condition they equal to the
identity model and the weighted euclidean model from [1], and their adaptation
methods coincide with SMACOF generalizations for these models.
                         Figure 1 shows the final values of normalized stress function


                                                                      S(X, r)
                                                      S ∗ (X, r) = PK PN PN                                                                                 (28)
                                                                                                                   2
                                                                                     k=1             i=1      j=1 δijk


for PD and ID models. The stress values for the conditional and unconditional
optimization of the models are indistinguishable, thus they look like 2 lines.
Figure 2 contains the plots of the dependence of models tuning time on the
values of L.
    It can be noticed that for the adapted models the normalized stress func-
tion takes small (about 10−2 ) values, which means that the real system is close
to the “correct” one. This is also evidenced by the similarity of stress values
for conditional and unconditional models optimization: the requirements for the
parameters do not impose significant restrictions on the quality of model adap-
tation. Small values of stress are achieved for rather low values of L, that is, the
input tensor allows efficient compression by the proposed models.
    The values of stress function for two models substantially differ only for large
values of L, and the use of individual differences model can be redundant for
our data. While the number of parameters and tuning time for this model is
significantly larger.


                                                                                     184
                                                        PD model                                                                             0.05      PD model

fraction of non-fulfillment of monotonicity


                                                                                                   fraction of non-fulfillment of S1 axiom
                                                        ID model                                                                                       ID model
                                              0.40
                                                                                                                                             0.04
                                              0.35
               property for r

                                              0.30                                                                                           0.03
                                              0.25
                                                                                                                                             0.02
                                              0.20

                                              0.15                                                                                           0.01

                                                     1 2 3 4 5 6 7 8 9 10          15   20                                                          1 2 3 4 5 6 7 8 9 10          15   20
                                                                     parameter L                                                                                    parameter L


Fig. 3. Fraction of non-fulfillment of                                                             Fig. 4. Fraction on non-fulfillment of S1
monotonicity property for parameter r                                                              axiom depending on the value of parame-
depending on parameter L value for un-                                                             ter L value for unconstrained models
constrained models


    The results of constrained models adaptation satisfy all the axioms of dis-
tance parameterized by size. For the results of unconstrained adaptation, the
fulfillment of the distance axioms is guaranteed for any parameter values. For
the axioms of size, we measured the fraction of the resulting tensor for which
these axioms are not fulfilled. The fraction of the vector (matrix) of parame-
ters r that does not satisfy the conditions of non-negativity, monotonicity and
subadditivity was also measured. Non-negative values were obtained only for
monotonicity constraint and S1 axiom, the plots of dependence on L are shown
in figures 3 and 4.

6                                                Conclusion
We have formally introduced the concept of ’distances parameterized by size’
and proposed several specific models of such distances. For each model we have
found the sufficient conditions on the parameters that guarantee the fulfillment
of all the axioms from the definition, and have provided a specific method of
model adaptation. We have demonstrated empirically that the proposed models
allow to approximate real data with a high degree of compression.

Acknowledgements. The study was partially supported by RFBR (project
no. 20-01-00664-a) and state-financed research work no. 5.1.21 of Lomonosov
Moscow State University.

References
1. Borg, I., Groenen, P.J.F.: Modern multidimensional scaling: Theory and applica-
   tions. Springer Science & Business Media (2005)
2. Chung, F., Garrett, M., Graham, R., Shallcross, D.: Distance realization problems
   with applications to internet tomography. Journal of Computer and System Sciences
   63(3), 432–448 (2001)


                                                                                             185
3. Cox, T.F., Cox, M.A.: Multidimensional Scaling. New York: Chapman and
   Hall/CRC (2000)
4. Davison, M.L.: Multidimensional scaling: methods of data visualization. (in Rus-
   sian). Finance and Statistics (1988)
5. Kozlov, V., Maysuradze, A.: Separation of a mixture of three-parameter lognor-
   mal distributions in the analysis of communication environments. Computational
   Mathematics and Modeling 30(3), 311–319 (2019)
6. Maysuradze, A., Kozlov, V.: Modeling message passing delays in a computer cluster
   to monitor its network. In: CEUR Workshop Proceedings. pp. 93–99 (2017)
7. Maysuradze, A.I.: Homogeneous and rank bases in spaces of metric configurations
   (in russian). Journal of Computational Mathematics and Mathematical Physics
   46(2), 344–360 (2006)
8. Nocedal, J., Wright, S.: Numerical optimization. Springer Science & Business Media
   (2006)
9. Salnikov, A., Begaev, A., Maysuradze, A.: Similarity mining of message passing
   delays in supercomputer networks based on peak and step detection. In: Russian
   Supercomputing Days. pp. 486–499. Springer (2020)


                                        186