Algorithms for Multidimensional Frontier Visualization
      Based on Optimization Methods Using Distributed
                       Computations

      Alexander P. Afanasiev1,2,3,4             Vladimir E. Krivonozhko2,3,5               Andrey V. Lychev2
                                               Oleg V. Sukhoroslov1,4
              1
                  Centre for Distributed Computing, Institute for Information Transmission Problems,
                  Russian Academy of Sciences, Nakhimovsky Prospekt 36-1, Moscow 117218, Russia
       2
         National University of Science and Technology MISiS, Leninskiy Prospekt 4, Moscow 119049, Russia
               3
                 Lomonosov Moscow State University, GSP-1, Vorobievy Gory, Moscow 119991, Russia
    4
      National Research University Higher School of Economics, Myasnitskaya str., 20, Moscow 101000, Russia
             5
               Federal Research Center “Computer Science and Control”, Russian Academy of Sciences,
                                       Vavilov st. 44-2, Moscow 119333, Russia
                  E-mail: apa@iitp.ru, krivonozhkove@mail.ru, lychev@misis.ru, sukhoroslov@iitp.ru


                                                        Abstract
                       In data envelopment analysis, methods for constructing sections of the
                       frontier have been recently proposed to visualize the production pos-
                       sibility set. The aim of this paper is to develop, prove and test the
                       methods for the visualization of production possibility sets using dis-
                       tributed computations. In this article a general scheme of the algo-
                       rithm for constructing sections (visualization) of production possibility
                       set is proposed. An algorithm for constructing a generalized production
                       function is described in detail. Also, the possibilities of implementing
                       certain schemes in a distributed computing environment are considered.


1    Introduction
Despite the rapid development of information technologies in the last decade, relatively simple methods (based
on ratio analysis) are used for performance evaluation of large complex systems (regions, ﬁnancial organizations,
municipalities, universities, etc.). In this approach, for each production unit a set of ratios is calculated. Then,
some additive function, which is usually called the evaluation of complex objects, or simply rating function, is
constructed from these ratios. However, in the papers [Krivonozhko et al., 2011, Krivonozhko et al., 2008] it
was proved, that this approach signiﬁcantly distorts the evaluation of a complex object. This happens because
a multidimensional space is projected many times onto the subspaces during calculation of rating function. While
signiﬁcant distortions take place in every projection.
   Data envelopment analysis (DEA) originated as a generalization of simple eﬃciency ratios over multidimen-
sional case, i. e. when an activity of production units is described by a number of input and output indicators.
A set of homogeneous production units is considered in order to make the formulation of this problem correct
and intentional. As a result, this approach is reduced to the solution of a large family of optimization problems.

Copyright ⃝
          c by the paper’s authors. Copying permitted for private and academic purposes.
In: Yu. G. Evtushenko, M. Yu. Khachay, O. V. Khamisov, Yu. A. Kochetov, V.U. Malkova, M.A. Posypkin (eds.): Proceedings of
the OPTIMA-2017 Conference, Petrovac, Montenegro, 02-Oct-2017, published at http://ceur-ws.org


                                                             13
   The founders of the DEA approach were famous scientists A. Charnes, W. Cooper, E. Rhodes, R. Banker and
some others [Charnes et al., 1978, Banker et al., 1984]. At present the number of publications on this approach
makes up several thousand units in the international scientiﬁc journals [Emrouznejad & Yang, 2017].
   The ideas contained in the DEA approach have turned out to be much more seminal and far-reaching than
the simple computation of eﬃciency scores of complex systems. The DEA approach has close links with the
neoclassical theoretical economics, systems analysis, and multicriteria optimization. This approach allows one
to analyze the behavior of complex systems in the multidimensional space, to ﬁnd optimal paths of development
in it, to model various scenarios.
   However, the whole process of calculation in the DEA approach is hidden from the user. Every mathematical
model is just an approximation of the real-life processes and phenomena. For this reason some inadequacies may
arise in models. In the DEA scientiﬁc literature, some reports appeared that DEA results do not always coincide
with experts’ opinion. In our previous papers an approach for visualization of multidimensional production
possibility sets and investigation of the behavior of complex units is proposed. The visualization of production
possibility sets allows us to correct the DEA models using the frontier improving methods, reliably calculate
important indicators of complex units (scale elasticity, marginal rates, etc.). Moreover, the visualization can
reveal previously unexplored relationships between the variables in the model.
   Visualization methods have been currently applied to the models that have a convex production possibility set:
BCC (Banker, Charnes, Cooper) model with variable returns to scale, IRS (increasing returns to scale) model,
DRS (decreasing returns to scale) model, etc. For a wide class of production models with nonconvex production
possibility set, such methods and visualization algorithms are not developed. This paper aims to develop, prove
and test the methods for the visualization of production possibility sets using distributed computations.

2    Background
Consider a set of n observations of actual production units (Xj , Yj ), j = 1, . . . , n, where the vector of outputs
Yj = (y1j , . . . , yrj ) ≥ 0 is produced from the vector of inputs Xj = (x1j , . . . , xmj ) ≥ 0. The input-oriented
model [Banker et al., 1984] is written as follows

                                          min θ
                                          ∑n
                                             Xj λj + S − = θX0 ,
                                          j=1
                                          ∑n
                                                Yj λj − S + = Y0 ,                                                                 (1)
                                          j=1
                                          ∑n
                                              λj = 1, λj ≥ 0, j = 1, . . . , n,
                                          j=1
                                          s−
                                           k ≥ 0, k = 1, . . . , m, si ≥ 0, i = 1, . . . , r,
                                                                     +


where S − = (s−                    −         +      +         +                                                             ∗
                    1 , , . . . , sm ) and S = (s1 , . . . , sr ) are slack variables. In model (1) the optimal value θ describes
the eﬃciency score of unit (Xo , Yo ), where (Xo , Yo ) is a unit from the set of production units (Xj , Yj ), j = 1, . . . , n.
   Notice that we do not use an inﬁnitesimal constant explicitly in the DEA models, since we suppose that each
model is solved in two stages in order to separate eﬃcient and weakly eﬃcient units [Cooper et al., 2007].
   Definition 1. [Cooper et al., 2007] Unit (Xo , Yo ) ∈ T is called BCC-eﬃcient with respect to the input-oriented
BCC model if and only if any optimal solution of (1) satisﬁes: a) θ∗ = 1, b) all slacks s−                                          +
                                                                                                             k , k = 1, . . . , m, si ,
i = 1, . . . , r are zero.
   If condition (a) in Deﬁnition 1 is satisﬁed, then unit (Xo , Yo ) is called input weakly eﬃcient with respect to
the BCC model.
   Definition 2. [Cooper et al., 2007] Unit (X ′ , Y ′ ) ∈ T is Pareto eﬃcient if and only if there is no (X, Y ) ∈ T
and (X, Y ) ̸= (X ′ , Y ′ ) such that X ≤ X ′ and Y ≥ Y ′ .
   The production possibility set T for BCC model is formulated as follows
                            {            ∑n                     ∑n                  ∑n                                    }
               T BCC = (X, Y )              j=1 Xj , λj ≤ X,       j=1 Yj , λj ≥ Y,   j=1 λj = 1, λj ≥ 0, j = 1, . . . , n .       (2)

It was proved in [Krivonozhko et al., 2009] that the BCC model generalize a wide class of DEA models. Therefore,
in this paper, we dwell mainly on this model.


                                                                  14
3    Main Results
Deﬁne the intersection of the frontier with two-dimensional plane [Krivonozhko et al., 2004]
                                             {                                                  }
                    Sec(Xo , Yo , d1 , d2 ) = (X, Y ) (X, Y ) ∈ Pl(Xo , Yo , d1 , d2 ) ∩ WEﬀ P T ,                   (3)

where Pl(Xo , Yo , d1 , d2 ) is two-dimensional plane going through point (Xo , Yo ) and spanned by vectors d1 , d2 ∈
E m+r , WEﬀ P T is a set of weakly Pareto eﬃcient points of set T . It is proved in [Krivonozhko et al., 2005;
JORS] that WEﬀ P T coincides with a set of boundary points of T .
   Deﬁne three types of two-dimensional sections that we will use in this paper.
   1. Input isoquant, section S1 . In this case, we take the following direction vectors d1 = (ep , 0) ∈ E m+r ,
d2 = (es , 0) ∈ E m+r , where ep and es are m-identity vectors with a one in positions p and s, respectively.
   2. Output isoquant, section S2 . For this section the direction vectors are chosen as d1 = (0, ep ) ∈ E m+r ,
d2 = (0, es ) ∈ E m+r , ep and es are r-identity vectors with a one in positions p and s, respectively.
   3. Section S3 is a generalized production function for unit (Xo , Yo ). For this case we use the following directions:
d1 = (Xo , 0) ∈ E m+r , d2 = (0, Yo ) ∈ E m+r .
   Next, we describe the algorithm for constructing a generalized production function.
   Algorithm
   Step 1. Find a leftmost point on a curve.
   Let d1 = (Xo , 0) ∈ E m+r , d2 = (0, Yo ) ∈ E m+r ,
   a) solve the following optimization problem

                                                min β1
                                                                                                                     (4)
                                                (Zo + β1 d1 + τ d2 ) ∈ T,

where τ is a free variable.
  Set Z1 = Zo + β1∗ d1 + τ ∗ d2 , where β1∗ and τ ∗ are optimal variables in problem (4).
  b) Find a leftmost vertex on a curve. Solve the following optimization problem

                                                   max β2
                                                                                                                     (5)
                                                   (Z1 + β2 d2 ) ∈ T

Set Z11 = Z1 + β2∗ d2 , where β2∗ is the optimal value of the objective function in (5).
   Step 2. Find a topmost point on a curve. Solve two following optimization problems.
   a) Solve
                                               max β2
                                                                                                                     (6)
                                               (Zo + τ d1 + β2 d2 ) ∈ T
Set Z2 = Zo + τ ∗ d1 + β2∗ d2 , where β2∗ and τ ∗ are optimal variables in (6).
   b) Solve
                                                    min β1
                                                                                                                     (7)
                                                    (Z2 + β1 d1 ) ∈ T
Set Z21 = Z2 + β1∗ d1 , where β1∗ is the optimal value of the objective function in (7).
   Step 3. Set l = 1, k = 1, i1 = 1, i2 = 2. Create ﬂow Fkl , containing points Zil1 = Z11 , Zil2 = Z21 of production
possibility set T .
   Deﬁne set M = {Z11 , Z21 }.
   Step 4. While exist unprocessed ﬂows Fkl , perform the following computations.
   For each ﬂow solve optimization problem of the following type

                                                max β1
                                                                                                                     (8)
                                                (G + β1 d1 + τ d2 ) ∈ T,

where τ is a free variable,
                                                       1( l        )
                                                  G=     Zi1 + Zil2 ,
                                                       2
vector d1 is perpendicular to the vector d2 , it lies in the plane of the section, and is directed to the upper left
corner of a two-dimensional section, vector d2 = Zil2 − Zil1 .


                                                           15
  If optimal objective value of problem (8) β1∗ > 0, then start new ﬂows Fkl+1
                                                                            1
                                                                               and Fkl+1
                                                                                      2
                                                                                         to solve optimization
sub-problems.
  Flow Fkl+1
          1
              contains points

                                                           1 l
                                Zil+1 = Zil1 ,   Zil+1 =    (Z + Zil2 ) + β1∗ d1 + τ ∗ d2 ,                         (9)
                                   1                2
                                                           2 i1
where β1∗ and τ ∗ are optimal values of variables in problem (8).
  Flow Fkl+1
           2
              contains points
                                             Zil+1
                                                3
                                                   = Zil+1
                                                        2
                                                           ,     Zil+1
                                                                    4
                                                                       = Zil2 .                                    (10)
   Set l = l + 1.
   If optimal objective value in problem (8) β1∗ ≤ 0, then points Zil1 and Zil2 are angular adjacent points of
the segment of generalized production function. Add these points to the set of corner points of the production
function M . Flow Fkl is deleted from the list of ﬂow tasks.
   Step 5. Points of set M are angular points of generalized production function. Connect each adjacent pair
of points by a straight line segment. Finally, add a horizontal line after the last point and a vertical line starting
from the ﬁrst point downwards. This completes construction of the generalized production function.


        Figure 1: Graphical illustration of algorithm for construction of generalized production function

    Figure 1 illustrates the steps of the algorithm for constructing a section of the generalized production function.
At the ﬁrst steps, the extreme points Z11 and Z21 are determined, then the ﬂow F11 containing these two points
is started. At the next steps, as a result of solving problem (8), point Z22 will be found; and the ﬂow F11 will be
split into two ﬂows F12 and F22 that will have vertices Z11 and Z22 for ﬂow F12 , and vertices Z22 and Z21 for ﬂow F22 .
    Then the calculations are repeated until all segments of the generalized production function are found.
    For the algorithm described above the following proposition holds.
    Proposition. Algorithm constructs section S3 for the BCC model in a finite number of steps.
    In our computational experiments we used the software FrontierVision. This program allows us to visualize
the multidimensional production possibility set by means of constructing two- and three-dimensional sections
of the frontier. This software was speciﬁcally developed by authors for DEA models. The point is that the
present-day optimization software are not suitable for the DEA models due to some speciﬁc features of these
models. The reason of this is explained in detail in paper [Krivonozhko et al., 2004]. The FrontierVision is based
on the algorithms developed by authors [Krivonozhko et al., 2004, Volodin et al., 2004, Krivonozhko et al., 2005;
ITCS].

4   Implementation of the Algorithm in a Distributed Computing Environment
The described algorithm for constructing sections reduces the solution of the original problem to the solution
of recursively generated sub-problems, similarly to the known “divide and conquer” scheme [Dasgupta, 2006].
Since the sub-problems available at each moment can be solved independently, it is possible to speed up the
algorithm by parallel execution of these problems. In the case of a large amount of computations required, it is
promising to use the resources of many computers with the help of distributed computing technologies.


                                                            16
   This divide and conquer technique is the basis of many eﬃcient algorithms in optimization theory. For
example, Branch-and-Bound (B&B) algorithm use this scheme. The implementation of B&B method on desktop
grid systems is considered in paper [Posypkin et al., 2017]. This distributed B&B implementation relies on
BNB-Solver [Evtushenko et al., 2009]. A tool for simulating parallel B&B method and diﬀerent load balancing
strategies are considered in [Golubeva et al., 2016].
   Consider the general scheme of algorithm implementation in the distributed computing environment. The
implementation of the algorithm consists of two parts: the control part and the working part. The ﬁrst part
controls the calculation process, implementing the entire internal logic of the algorithm, except the solution of
ﬂow problems. The problems generated at the iterations of the algorithm are sent to solve the working part.
After this the results are returned back to the algorithm and then combined to give a solution to the original
problem. The working part implements the solution of ﬂow problems. While the control part is a single process,
the working part can consist of a many workﬂows running on various distributed computational resources.
   An important part of the implementation of the control process is the strategy of distributing ﬂow tasks to
workers. This strategy largely determines the eﬃciency of the entire algorithm in a distributed environment. The
strategy should minimize overhead, for example, not send workers small tasks, which can be solved locally more
quickly. It is also necessary to ensure a balanced load of work processes, for example, based on an assessment of
the complexity of tasks or the dynamic distribution of tasks.
   To implement the algorithm for constructing sections in a distributed computing environment, it is planned to
use the Everest platform [10]. Everest is a cloud-based software platform that supports the publication, execution,
and composition of computing applications in a distributed environment. The advantage of using the Everest
platform to implement the described algorithm is the availability of tools for creating distributed computing
applications and integration with computing resources that do not require additional platform installation.
In particular, the platform supports the implementation of multi-task applications that dynamically generate
computational tasks that are processed on remote resources using the platform.

5   Conclusions
In this article one algorithm for constructing section, a generalized production function, is described in detail.
The algorithms for constructing other sections diﬀer mainly in the way the direction vectors d1 and d2 are
selected at each iteration of the algorithm. Computational experiments using real-life datasets conﬁrmed that
the algorithm works reliably and construct sections correctly.
   The general scheme of the algorithm for constructing sections (visualization) of a set T is described. Also,
the possibilities of implementing certain schemes in a distributed computing environment are considered.

Acknowledgements
This work was supported by the Russian Science Foundation (project No. 17-11-01353).

References
[Krivonozhko et al., 2011] Krivonozhko, V. E., Piskunov, A. A., & Lychev, A. V. (2011). On comparison of ratio
         analysis and data envelopment analysis as performance assessment tools. IMA Journal of Management
         Mathematics 22(4), 357-370.

[Krivonozhko et al., 2008] Krivonozhko, V. E., Utkin, O. B., Saﬁn, M. M., & Lychev, A. V. (2008). Transforma-
         tion of the frontier in the eﬃciency analysis of complex systems. In S. V. Emel’yanov, & S. K. Korovin
         (Eds.). Nonlinear dynamics and control. Vol. 6. (pp. 279-304). Moscow: Fizmatlit. (in Russian)

[Charnes et al., 1978] Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring of eﬃciency of decision
         making units. European Journal of Operational Research 2(6), 429-444.

[Banker et al., 1984] Banker, R. D., Charnes, A., & Cooper, W. W. (1984). Some models for estimating technical
         and scale ineﬃciency in data envelopment analysis. Management Science 30(9), 1078-1092.

[Emrouznejad & Yang, 2017] Emrouznejad, A., & Yang, G. (2017). A survey and analysis of the ﬁrst 40
        years of scholarly literature in DEA: 1978-2016. Socio-Economic Planning Sciences. (in press) DOI:
        10.1016/j.seps.2017.01.008


                                                        17
[Cooper et al., 2007] Cooper, W. W., Seiford, L. M., & Tone, K. (2007). Data Envelopment Analysis. A Com-
         prehensive Text with Models, Applications, References and DEA-Solver Software. Second edition. New
         York: Springer Science and Business Media.

[Krivonozhko et al., 2009] Krivonozhko, V. E., Utkin, O. B., Saﬁn, M. M., & Lychev, A. V. (2009). On some
         generalization of the DEA models. Journal of the Operational Research Society 60(11), 1518-1527.

[Krivonozhko et al., 2004] Krivonozhko, V. E., Utkin, O. B., Volodin, A. V., Sablin, I. A., & Patrin, M. (2004).
         Constructions of economic functions and calculations of marginal rates in DEA using parametric opti-
         mization methods. Journal of the Operational Research Society, 55(10), 1049-1058.
[Krivonozhko et al., 2005; JORS] Krivonozhko, V. E., Utkin, O. B., Volodin, A. V., & Sablin, I. A. (2005).
         About the structure of boundary points in DEA. Journal of the Operational Research Society 56(12),
         1373-1378.

[Volodin et al., 2004] Volodin, A. V., Krivonozhko, V. E., Ryzhikh, D. A., & Utkin, O. B. (2004). Construc-
         tion of Three-Dimensional Sections in Data Envelopment Analysis by Using Parametric Optimization
         Algorithms. Computational Mathematics and Mathematical Physics 44(4), 589-603.
[Krivonozhko et al., 2005; ITCS] Krivonozhko, V. E., Utkin, O. B., Saﬁn, M. M., & Lychev, A. V. (2005).
         Software EﬃVision for analysis of complex systems. Information technologies and computational systems
         3, 85-95. (In Russian)

[Dasgupta, 2006] Dasgupta, S., Papadimitriou, C. H., & Vazirani, U. V. (2006). Algorithms. New York: McGraw-
        Hill Science/Engineering/Math.

[Evtushenko et al., 2009] Evtushenko, Y., Posypkin, M., & Sigal, I. (2009). A framework for parallel large-scale
         global optimization. Computer Science-Research and Development 23(3-4), 211-215.

[Golubeva et al., 2016] Golubeva, Y., Orlov, Y., & Posypkin, M. (2016). A tool for simulating parallel branch-
        and-bound methods. Open Engineering 6(1), 219-224.

[Posypkin et al., 2017] Posypkin, M., & Khrapov, N. (2017). Branch and bound method on desktop grid sys-
         tems. 2017 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering
         (EIConRus), 526-528.

[Sukhoroslov et al., 2015] Sukhoroslov, O., Volkov, S., & Afanasiev, A. (2015). A Web-Based Platform for Publi-
         cation and Distributed Execution of Computing Applications. 14th International Symposium on Parallel
         and Distributed Computing (ISPDC), 175-184.


                                                      18