=Paper=
{{Paper
|id=Vol-1990/paper-01
|storemode=property
|title=Parallel Multi-Objective Optimization Method for Finding Complete Set of Weakly Efficient Solutions
|pdfUrl=https://ceur-ws.org/Vol-1990/paper-01.pdf
|volume=Vol-1990
|authors=Vladislav Sovrasov
}}
==Parallel Multi-Objective Optimization Method for Finding Complete Set of Weakly Efficient Solutions==
Vladislav Sovrasov
Lobachevsky State University of Nizhni Novgorod, Nizhni Novgorod, Russia
sovrasov.vlad@gmail.com
Abstract. This work considers a parallel algorithm for multi-objective optimization. The algorithm is based on the application of the information-statistical approach to solving a reduced scalar problem whose set of global optima coincides with the set of weakly efficient solutions of the initial multi-objective problem. The parallelization scheme common to all information-statistical global search algorithms is applied to the sequential multi-objective optimization algorithm. Also, a convergence speedup technique based on local properties of the objective functions is applied to multi-objective problems.
Keywords: deterministic global optimization · multi-objective optimization · parallel numerical methods · derivative-free algorithms
1 Introduction
Multi-objective optimization problems have attracted increasing interest in recent years, since they most completely reflect the contradictory requirements arising for modeled real-world objects. In such problems, the solution is a compromise among a set of criteria. As a rule, these compromise solutions are chosen either from the set whose elements cannot be improved with respect to any criterion without worsening the other criteria (Pareto-optimal or efficient solutions), or from the set in which no element can be improved with respect to all criteria at once (Slater-optimal or weakly efficient solutions). The latter requirement is weaker, and all Pareto-optimal solutions are Slater-optimal as well.
Finding the complete set of efficient or weakly efficient points is the most computationally complex formulation; however, it provides the widest choice of possible compromise solutions. Plenty of approaches have been developed to find these sets: direct application of genetic algorithms to the multi-objective problem [4], parametric convolutions reducing the problem to a series of scalar global optimization problems [5], and various approaches to solving problems with convex [11] or linear [3] objectives. The main problem with most of these methods is the absence of a theoretical guarantee of uniform convergence to the whole sought set of Pareto-optimal solutions on a wide class of problems, including ones with non-convex multiextremal criteria. A method having this property in the case when all criteria satisfy the Lipschitz condition has been developed on the basis of the information global search algorithm [8]. This paper considers a parallel modification of this method, accelerated by the local refinement technique [10] (Chapter 3).
2 Problem Statement and Dimensionality Reduction
The multi-objective optimization problems are stated in the following way:

$$\min\{f(y) : y \in D\}, \quad D = \{y \in \mathbb{R}^n : a_i \leq y_i \leq b_i,\ 1 \leq i \leq n\}. \quad (1)$$

Assume that the components of the vector function (the partial criteria) $f_i(y)$, $1 \leq i \leq m$, satisfy the Lipschitz condition with the constants $L_i$ in $D$:

$$|f_i(y_1) - f_i(y_2)| \leq L_i \|y_1 - y_2\|, \quad y_1, y_2 \in D,\ 0 < L_i < \infty,\ 1 \leq i \leq m.$$
The set $S(D) \subseteq D$ of strictly non-dominated points of the search domain is usually accepted as the solution of problem (1), i.e.

$$S(D) = \{y \in D : \nexists z \in D : f_i(z) < f_i(y),\ 1 \leq i \leq m\}, \quad (2)$$

which is called the set of semi-efficient (or weakly efficient) solutions. The conditions in the right-hand side of definition (2) are known as the principle of weak Pareto optimality (or Slater's optimality principle).
The use of the space-filling curve (evolvent) $y(x)$,

$$\{y \in \mathbb{R}^N : -2^{-1} \leq y_i \leq 2^{-1},\ 1 \leq i \leq N\} = \{y(x) : 0 \leq x \leq 1\},$$

is a classic dimensionality reduction scheme for global optimization algorithms [9]. Such a mapping allows reducing problem (1), stated in a multidimensional space, to a one-dimensional problem at the expense of worsening its properties. In particular, the one-dimensional functions $f_i(y(x))$ are not Lipschitzian but Hölderian:

$$|f_i(y(x_1)) - f_i(y(x_2))| \leq H_i |x_1 - x_2|^{1/N}, \quad x_1, x_2 \in [0, 1], \quad (3)$$

where the Hölder constants $H_i$ are related to the Lipschitz constants $L_i$ by the relation

$$H_i = 4 L_i d \sqrt{N}, \quad d = \max\{b_i - a_i : 1 \leq i \leq n\}.$$
Without loss of generality, one can consider solving the one-dimensional problem $\min\{f(y(x)) : x \in [0, 1]\}$ whose objectives satisfy the Hölder condition. The issues of the numerical building of a Peano-type mapping and the corresponding theory have been considered in detail in [9]. Here we note that a numerically built evolvent approximates the theoretical Peano curve with a precision of the order $2^{-m}$, where $m$ is the building parameter of the evolvent.
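For illustration only, the sketch below uses the classic Hilbert-curve cell ordering as a stand-in for the Peano-type evolvent of [9] in the case $N = 2$: a point $x \in [0, 1]$ is mapped to a cell of a $2^m \times 2^m$ grid covering $D$. The actual evolvent construction differs in its details, so this is only a sketch of the interface $y(x)$ under that substitution; the function names are illustrative, not from the paper.

```python
import numpy as np

def hilbert_d2xy(m, d):
    """Map a distance d along the Hilbert curve to a cell (i, j)
    of a 2^m x 2^m grid (classic bit-manipulation algorithm)."""
    n = 2 ** m
    i = j = 0
    s, t = 1, d
    while s < n:
        ri = 1 & (t // 2)
        rj = 1 & (t ^ ri)
        if rj == 0:                      # rotate/flip the quadrant if needed
            if ri == 1:
                i, j = s - 1 - i, s - 1 - j
            i, j = j, i
        i += s * ri
        j += s * rj
        t //= 4
        s *= 2
    return i, j

def evolvent(x, m, a, b):
    """Approximate y(x): map x in [0, 1] to a point of D = [a1,b1] x [a2,b2]."""
    n_cells = 4 ** m
    d = min(int(x * n_cells), n_cells - 1)   # curve position for parameter x
    i, j = hilbert_d2xy(m, d)
    side = 2 ** m
    # return the center of the selected cell, scaled to the search domain
    return np.array([a[0] + (i + 0.5) / side * (b[0] - a[0]),
                     a[1] + (j + 0.5) / side * (b[1] - a[1])])

y = evolvent(0.3, m=10, a=[-1.0, -2.0], b=[2.0, 1.0])
```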
3 Description of the Parallel Algorithm with Local Refinement
Let us consider a scalarization scheme for the reduced problem (1), presented in [8]. Let

$$\varphi(x) = \max\{h(x, y) : y \in [0, 1]\}, \quad x \in [0, 1],$$
$$h(x, y) = \min\{f_i(x) - f_i(y) : 1 \leq i \leq m\}.$$
Let us consider the scalar problem

$$\varphi^* = \min\{\varphi(x) : x \in [0, 1]\}. \quad (4)$$

As has been shown in [8], the set of weakly efficient solutions of the reduced problem (1) coincides with the set of globally optimal solutions of problem (4), i.e.

$$S([0, 1]) = \{x \in [0, 1] : \varphi(x) = \varphi^*\}. \quad (5)$$
Also, it has been shown in [8] that the function $\varphi(x)$ satisfies the Hölder condition when requirements (3) are satisfied. Thus, the information-statistical global search algorithm can be applied to $\varphi(x)$ in order to solve problem (4). However, $\varphi(x)$ is defined through the operator $\max\{\ldots\}$ and is therefore difficult to compute directly. In [8], a modification of the original information global search algorithm [7] is given in which the values of $\varphi(x)$ are computed approximately. Let us consider a new version of this algorithm. The modification consists in the use of the local refinement technique described in [10] (Chapter 3) as well as in the parallelization based on the characteristics [10] (Chapter 5).
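To make the inner maximization concrete, here is a minimal sketch that evaluates $\varphi(x)$ by brute force over a uniform grid, for a made-up pair of one-dimensional criteria (not one of the paper's test problems). The algorithm described below replaces this grid with the maximum over the accumulated trial points.

```python
import numpy as np

# A made-up two-criteria example on the already-reduced interval [0, 1]:
# f(x) plays the role of the vector (f_1(y(x)), ..., f_m(y(x))).
def f(x):
    return np.array([np.sin(5 * x) + x, np.cos(5 * x)])

def h(x, y):
    return np.min(f(x) - f(y))          # min over criteria, as in the definition of h

def phi_bruteforce(x, grid):
    # phi(x) = max over y of h(x, y); here the inner max runs over a uniform
    # grid, which is exactly what the method approximates with trial points.
    return max(h(x, y) for y in grid)

grid = np.linspace(0.0, 1.0, 200)
phi_vals = [phi_bruteforce(x, grid) for x in grid]
x_star = grid[int(np.argmin(phi_vals))]  # rough solution of problem (4)
```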
The algorithm for solving problem (1) involves constructing a sequence of points $x^i$ at which the values of the objective functions are calculated. Let us call the process of calculating the function values (including the construction of the image $y^i = y(x^i)$) a trial, and the tuple $\{x^i, y^i, f_1(y^i), \ldots, f_m(y^i)\}$ the trial result. At each iteration, $p$ trials are carried out in parallel, and the set of tuples

$$\{x^i, y^i, f_1(y^i), \ldots, f_m(y^i)\}, \quad 1 \leq i \leq k = np,$$

makes up the search data collected by the method after carrying out $n$ steps. The rules that define the work of the parallel multi-objective algorithm are as follows.
The first two trials are performed at the boundary points $x^0 = 0$ and $x^1 = 1$ of the interval $[0, 1]$. The choice of the points $x^{k+j}$, $1 \leq j \leq p$, is performed according to the following rules:
Step 1. Renumber the points of the set $X_k = \{x^1, \ldots, x^k\} \cup \{0\} \cup \{1\}$, which includes the boundary points of the interval $[0, 1]$ as well as the points of the preceding trials, by lower indices in increasing order of coordinate, i.e. $0 = x_0 < x_1 < \ldots < x_{k+1} = 1$.
Step 2. Compute the lower bound $\mu_\nu$ for the Lipschitz constant of each objective function $f_\nu(x)$, $1 \leq \nu \leq m$:

$$\mu_\nu = \max_{1 \leq i \leq k+1} \frac{|f_\nu(x_i) - f_\nu(x_{i-1})|}{\Delta_i}, \quad 1 \leq \nu \leq m, \quad (6)$$

$$\Delta_i = (x_i - x_{i-1})^{1/N}. \quad (7)$$
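A sketch of this step under stated assumptions (NumPy; the trial points and criteria values are synthetic stand-ins, with F[ν, i] playing the role of $f_\nu(x_i)$):

```python
import numpy as np

N = 2                                     # dimension of the original problem
x = np.concatenate(([0.0], np.sort(np.random.rand(7)), [1.0]))  # renumbered points (Step 1)
F = np.random.rand(3, x.size)             # F[nu, i] = f_nu(x_i) for m = 3 criteria

delta = (x[1:] - x[:-1]) ** (1.0 / N)     # interval lengths, formula (7)
mu = np.max(np.abs(F[:, 1:] - F[:, :-1]) / delta, axis=1)   # formula (6), per criterion
```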
Step 3. To each point $x_i$, $0 \leq i \leq k+1$, juxtapose the value

$$z_i = \max\{h(x_i, x_j) : 0 \leq j \leq k+1\}, \quad (8)$$

where

$$h(x_i, x_j) = \min\left\{\frac{f_\nu(x_i) - f_\nu(x_j)}{\mu_\nu} : 1 \leq \nu \leq m\right\}, \quad 0 \leq i, j \leq k+1. \quad (9)$$
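The pairwise values (8)-(9) admit a compact vectorized form; a sketch with the same synthetic naming as above:

```python
import numpy as np

# F[nu, i] = f_nu(x_i); mu[nu] = lower bound of the nu-th constant from (6)
F = np.random.rand(3, 9)
mu = np.random.rand(3) + 0.5

Fn = F / mu[:, None]                      # criteria scaled by mu_nu, as in (9)
# H[i, j] = h(x_i, x_j) = min over nu of (f_nu(x_i) - f_nu(x_j)) / mu_nu
H = np.min(Fn[:, :, None] - Fn[:, None, :], axis=0)
z = np.max(H, axis=1)                     # z_i = max over j of h(x_i, x_j), formula (8)
```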
Step 4. For each interval $(x_{i-1}, x_i)$, $1 \leq i \leq k+1$, compute two characteristics:

$$R(i) = \Delta_i + \frac{(z_i - z_{i-1})^2}{r^2 \Delta_i} - \frac{z_i + z_{i-1}}{2r}, \quad (10)$$

$$R^*(i) = \frac{R(i)}{\sqrt{(z_i - z^*)(z_{i-1} - z^*) + 1.5^{-\alpha}}}, \quad (11)$$

where $\Delta_i$ is from (7), $z^* = \min\{z_i : 0 \leq i \leq k+1\}$; $r > 1$ and $\alpha \in [10, 30]$ are the input parameters of the algorithm.
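A sketch of both characteristics, following formulas (10)-(11) as reconstructed above (NumPy, synthetic data; z[i] plays the role of $z_i$):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 9)             # renumbered trial points
z = np.random.rand(9)                    # values z_i from Step 3
r, alpha, N = 4.0, 15, 2
delta = (x[1:] - x[:-1]) ** (1.0 / N)    # formula (7)

zl, zr = z[:-1], z[1:]                   # z_{i-1} and z_i for each interval
R = delta + (zr - zl) ** 2 / (r ** 2 * delta) - (zr + zl) / (2 * r)   # (10)
z_star = z.min()
# locally refined characteristic (11); the 1.5^(-alpha) term keeps the
# denominator positive when an interval endpoint attains z*
R_loc = R / np.sqrt((zr - z_star) * (zl - z_star) + 1.5 ** (-alpha))
```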
Step 5. If $q \neq 0$ and $s \bmod q \neq 0$, arrange the characteristics $R(i)$, $1 \leq i \leq k+1$, in decreasing order

$$R(t_1) \geq R(t_2) \geq \cdots \geq R(t_k) \geq R(t_{k+1})$$

and select the $p$ maximum characteristics with the interval indices $t_j$, $1 \leq j \leq p$. Otherwise, do the same with the characteristics $R^*(i)$, $1 \leq i \leq k+1$. Here $s$ is the index of the current iteration, and the integer $q$ is the parameter of the method responsible for the intensity of the local refinement. The smaller $q$ is, the more frequently the characteristics $R^*$ are used, making the method choose the next points near the current minimum found.
Step 6. Perform the new trials at the points $x^{k+j}$, $1 \leq j \leq p$:

$$x^{k+j} = \frac{x_{t_j} + x_{t_j - 1}}{2} - \operatorname{sign}(z_{t_j} - z_{t_j - 1}) \frac{|z_{t_j} - z_{t_j - 1}|^N}{2r}. \quad (12)$$

All $p$ trials within this step can be performed in parallel on $p$ computing devices.
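Steps 5 and 6 together reduce to a sort of the characteristics and a vectorized point formula; a minimal sketch with synthetic characteristics in place of those from Step 4:

```python
import numpy as np

p, r, q, N, s = 2, 4.0, 4, 2, 7          # method parameters and iteration index
x = np.linspace(0.0, 1.0, 9)
z = np.random.rand(9)
R = np.random.rand(x.size - 1)           # R(i) from Step 4 (synthetic here)
R_loc = np.random.rand(x.size - 1)       # R*(i) from Step 4 (synthetic here)

# Step 5: pick which characteristics drive the selection on this iteration
chars = R if (q != 0 and s % q != 0) else R_loc
t = np.argsort(chars)[::-1][:p] + 1      # 1-based indices of p best intervals (x_{t-1}, x_t)

# Step 6: one new trial point per selected interval, formula (12)
x_new = (x[t] + x[t - 1]) / 2 \
        - np.sign(z[t] - z[t - 1]) * np.abs(z[t] - z[t - 1]) ** N / (2 * r)
# the p trials at x_new can now be evaluated in parallel
```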
The algorithm is terminated when the condition $\Delta_{t_j} \leq \varepsilon$ is fulfilled for at least one of the indices $t_j$, $1 \leq j \leq p$; here $\varepsilon > 0$ is the predefined accuracy. After the search is terminated, the set $S(\{x_0, \ldots, x_{k+1}\})$ of all non-dominated points of the truncated sequence $\{x_0, \ldots, x_{k+1}\}$ is accepted as an estimate of $S$ from (5).
The theoretical substantiation of this method for $p = 1$ and $q = 0$ is presented in [10] (Chapter 3).
4 Test Problems
To evaluate the convergence speedup of the modified algorithm from Section 3, the following problems were used:
1. Markin-Strongin problem [8]:

$$f_1(y) = \min\left\{\sqrt{y_1^2 + y_2^2},\ \sqrt{(y_1 - 1.5)^2 + (y_2 + 1.5)^2}\right\},$$
$$f_2(y) = \sqrt{(y_1 + 0.5)^2 + (y_2 - 0.5)^2}, \quad y_1 \in [-1, 2],\ y_2 \in [-2, 1]. \quad (13)$$
2. Fonseca and Fleming problem [6]:

$$f_1(y) = 1 - \exp\left(-\sum_{i=1}^{n}\left(y_i - \tfrac{1}{\sqrt{n}}\right)^2\right),$$
$$f_2(y) = 1 - \exp\left(-\sum_{i=1}^{n}\left(y_i + \tfrac{1}{\sqrt{n}}\right)^2\right), \quad y \in [-4, 4]^n. \quad (14)$$
3. Viennet function [6]:

$$f_1(y) = 0.5(y_1^2 + y_2^2) + \sin(y_1^2 + y_2^2),$$
$$f_2(y) = \frac{(3y_1 - 2y_2 + 4)^2}{8} + \frac{(y_1 - y_2 + 1)^2}{27} + 15,$$
$$f_3(y) = \frac{1}{y_1^2 + y_2^2 + 1} - 1.1\exp\{-(y_1^2 + y_2^2)\}, \quad y \in [-3, 3]^2. \quad (15)$$
4. Poloni's two objective function [6]:

$$f_1(y) = 1 + (A_1 - B_1(y))^2 + (A_2 - B_2(y))^2,$$
$$f_2(y) = (y_1 + 3)^2 + (y_2 + 1)^2, \quad y \in [-\pi, \pi]^2, \quad (16)$$

where

$$A_1 = 0.5\sin(1) - 2\cos(1) + \sin(2) - 1.5\cos(2),$$
$$A_2 = 1.5\sin(1) - \cos(1) + 2\sin(2) - 0.5\cos(2),$$
$$B_1(y) = 0.5\sin(y_1) - 2\cos(y_1) + \sin(y_2) - 1.5\cos(y_2),$$
$$B_2(y) = 1.5\sin(y_1) - \cos(y_1) + 2\sin(y_2) - 0.5\cos(y_2).$$
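For reference, problems (13) and (14) translate directly into code; a minimal sketch (NumPy):

```python
import numpy as np

def markin_strongin(y):
    # Problem (13); y1 in [-1, 2], y2 in [-2, 1]
    f1 = min(np.hypot(y[0], y[1]), np.hypot(y[0] - 1.5, y[1] + 1.5))
    f2 = np.hypot(y[0] + 0.5, y[1] - 0.5)
    return np.array([f1, f2])

def fonseca_fleming(y):
    # Problem (14); y is an n-dimensional point from [-4, 4]^n
    n = y.size
    f1 = 1.0 - np.exp(-np.sum((y - 1.0 / np.sqrt(n)) ** 2))
    f2 = 1.0 - np.exp(-np.sum((y + 1.0 / np.sqrt(n)) ** 2))
    return np.array([f1, f2])

print(markin_strongin(np.array([0.0, 0.0])), fonseca_fleming(np.zeros(2)))
```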
5 Experimental Results
The computational experiments were carried out on the Lobachevsky supercomputer at Lobachevsky State University of Nizhni Novgorod. A computational node includes two Intel Sandy Bridge E5-2660 2.2 GHz processors and 64 GB of RAM. Each CPU has 8 cores (i.e., a total of 16 cores were available per node). The considered algorithm was implemented in C++ with the OpenMP parallel framework.
In this section, we understand the speedup of the method as the speedup in the number of executed iterations rather than in execution time. If the criteria of problem (1) are time-consuming, the overhead of executing the decision rules of the optimization method is low compared to the time of computing the criteria $f_i(y)$, $1 \leq i \leq m$.
Local refinement advantages. In [2], numerical experiments are presented which demonstrate the speedup of convergence of the scalar global optimization algorithm when using the local refinement technique described above. One could expect similar results from the multi-objective algorithm as well. The multi-objective algorithm with local refinement (MOAR) has been applied to the Fonseca and Fleming problem (14) with n = 2. The parameters of the method were as follows: ε = 0.01, r = 4, q = 4, α = 15, p = 1. Before the method stopped, 1176 iterations had been executed, and the number of found weakly optimal points was 90. With q = 0 (without the local refinement), the method performed 1484 iterations and found 93 weakly optimal points. Fig. 1 presents the set S of problem (14) and its numerical estimates obtained at q = 0 (Fig. 1a) and q = 4 (Fig. 1b). It is also evident from Fig. 1 that the approximate solution covers the whole set of weakly optimal solutions.
[Figure] Fig. 1: Numerical estimates of S obtained by MOAR in the (f1, f2) objective space: (a) MOAR with q = 0, (b) MOAR with q = 4. Each panel compares the numerical solution with the Slater set.
Below, MOAR with q = 4 was used for all experiments.
Parallel method results. In order to demonstrate the speedup in iterations which MOAR provides when p > 1, all problems from Section 4 have been solved for the values p = 1, 2, 4, 8, 16. The parameters of the method were the following: r = 4.5, ε = 0.01, α = 15. The number of iterations is presented in Table 1 (the cardinality of the set S is given in parentheses), and the speedup in iterations is presented in Table 2. As is evident from Table 1, the cardinality of the weakly optimal solutions set changed insignificantly when varying p, i.e., the quality of the obtained estimates remained the same. At the same time, the number of iterations decreased linearly with increasing p (Table 2). Thus, the parallel algorithm reduces the number of iterations by almost a factor of p compared to the sequential one. In Fig. 2, examples of the numerical solutions of the considered problems are given.
Table 1: Results of numerical experiments: number of iterations (cardinality of the set S in parentheses)

| Problem                | p = 1      | p = 2      | p = 4      | p = 8     | p = 16    |
|------------------------|------------|------------|------------|-----------|-----------|
| Markin-Strongin        | 1041 (198) | 516 (198)  | 256 (185)  | 131 (197) | 68 (191)  |
| Fonseca and Fleming 2d | 1181 (93)  | 636 (99)   | 386 (111)  | 176 (95)  | 106 (97)  |
| Fonseca and Fleming 3d | 5346 (160) | 3551 (183) | 1186 (143) | 606 (153) | 351 (142) |
| Viennet problem        | 4896 (276) | 2156 (273) | 1226 (270) | 631 (287) | 286 (274) |
| Poloni's function      | 3351 (102) | 1706 (90)  | 856 (88)   | 426 (96)  | 201 (99)  |
Table 2: Results of numerical experiments: speedup in iterations

| Problem                | p = 2 | p = 4 | p = 8 | p = 16 |
|------------------------|-------|-------|-------|--------|
| Markin-Strongin        | 2.02  | 4.07  | 7.95  | 15.31  |
| Fonseca and Fleming 2d | 1.86  | 3.06  | 6.71  | 11.14  |
| Fonseca and Fleming 3d | 1.51  | 4.51  | 8.82  | 15.23  |
| Viennet problem        | 2.27  | 3.99  | 7.76  | 17.12  |
| Poloni's problem       | 1.96  | 3.91  | 7.87  | 16.67  |
Parallel method time speedup. The objective functions of all problems listed in this section have low computational complexity. In order to demonstrate the possibility of obtaining a speedup in time for problems with time-consuming criteria, additional intensive floating-point computations not affecting the resulting values were introduced into the criteria. The speedups obtained are given in Table 3. In all cases, the speedup in time was less than the speedup in iterations because the operation of the optimization method itself takes a part of the execution time. However, in most cases the parallelization was efficient enough even in the case of 16 CPU threads. For more computationally costly criteria, one can expect the speedup in time to increase further for a large number of threads.
Table 3: Results of numerical experiments: speedup in time

| Problem                | p = 1 (time, s) | p = 2 | p = 4 | p = 8 | p = 16 |
|------------------------|-----------------|-------|-------|-------|--------|
| Markin-Strongin        | 104.47          | 1.97  | 3.65  | 6.79  | 9.90   |
| Fonseca and Fleming 2d | 118.95          | 1.85  | 2.81  | 5.79  | 6.40   |
| Fonseca and Fleming 3d | 554.45          | 1.51  | 4.14  | 8.05  | 10.69  |
| Viennet problem        | 1488.6          | 2.22  | 3.64  | 6.98  | 13.49  |
| Poloni's problem       | 336.74          | 1.82  | 3.64  | 6.98  | 10.60  |
[Figure] Fig. 2: Numerical estimates of S obtained by MOAR in the (f1, f2) objective space: (a) Markin-Strongin problem, (b) Fonseca and Fleming 3d problem, (c) Viennet problem, (d) Poloni's problem.
6 Conclusion
In this work, a method for solving multi-objective optimization problems has been considered which allows finding a uniform estimate of the set of weakly optimal points. The parallelization and convergence acceleration techniques common to all algorithms of similar structure have been applied to it. The numerical experiments performed demonstrated a speedup in convergence when the local refinement is used. The parallelization based on the characteristics was also found to be efficient for this method: the speedup in iterations appeared to be linear in most cases, as for the information methods of scalar optimization (see, for example, the results of the parallelization of the scalar optimization method in [1]). From the results of the experiments, one can conclude that it is expedient to apply the parallel MOAR to problems of low dimensionality (1 to 3) with computationally costly criteria. Moreover, the property of uniform convergence of the considered algorithm to the whole set of weakly optimal solutions has been demonstrated on the numerical examples.
Acknowledgments. The study was supported by the Russian Science Foun-
dation, project № 16-11-10150.
References
1. Barkalov, K.A., Lebedev, I.G., Sovrasov, V.V., Sysoyev, A.V.: Implementation of
a parallel algorithm for searching the global extremum of a function on Intel Xeon
Phi. In: CEUR Workshop Proceedings. vol. 1576, pp. 68–80 (2016)
2. Barkalov, K., Lebedev, I.: Local tuning in multilevel scheme of parallel global
optimization. AIP Conference Proceedings 1776(1), 060006 (2016)
3. Benson, H.P.: An outer approximation algorithm for generating all efficient extreme
points in the outcome set of a multiple objective linear programming problem.
Journal of Global Optimization 13(1), 1–24 (1998)
4. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjec-
tive genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation
6(2), 182–197 (2002)
5. Gergel, V., Kozinov, E.: Accelerating parallel multicriterial optimization methods based on intensive using of search information. Procedia Computer Science 108, 1463–1472 (2017), International Conference on Computational Science, ICCS 2017, 12–14 June 2017, Zurich, Switzerland
6. Huband, S., Hingston, P., Barone, L., While, L.: A Review of Multiobjective Test
Problems and a Scalable Test Problem Toolkit. Trans. Evol. Comp 10(5), 477–506
(Oct 2006)
7. Markin, D.L., Strongin, R.G.: A method for solving multi-extremal problems with
non-convex constraints, that uses a priori information about estimates of the opti-
mum. Computational Mathematics and Mathematical Physics 27(1), 33–39 (1987)
8. Markin, D.L., Strongin, R.G.: Uniform estimates for the set of weakly effective
points in multi-extremum multicriterion optimization problems. Computational
Mathematics and Mathematical Physics 33(2), 171–179 (1993)
9. Sergeyev, Y.D., Strongin, R.G., Lera, D.: Introduction to Global Optimization
Exploiting Space-Filling Curves. Springer Briefs in Optimization, Springer, New
York (2013)
10. Strongin, R.G., Sergeyev, Y.D.: Global optimization with non-convex constraints. Sequential and parallel algorithms. Kluwer Academic Publishers, Dordrecht (2000)
11. Zhang, G.L., Zuo, H.: Solution analysis of multi-objective programming problem.
In: 2013 International Conference on Machine Learning and Cybernetics. vol. 03,
pp. 1039–1044 (2013)