=Paper= {{Paper |id=Vol-1726/paper-03 |storemode=property |title=Empirical Analysis of Index Tracking Error Minimization Algorithms Based on Stochastic Dominance Principle |pdfUrl=https://ceur-ws.org/Vol-1726/paper-03.pdf |volume=Vol-1726 |authors=Alexey R. Faizliev,Sergei P. Sidorov,Sergei V. Mironov,Andrey A. Khomchenko }} ==Empirical Analysis of Index Tracking Error Minimization Algorithms Based on Stochastic Dominance Principle== https://ceur-ws.org/Vol-1726/paper-03.pdf
     Empirical Analysis of Index Tracking Error
    Minimization Algorithms Based on Stochastic
               Dominance Principle?

          Alexey R. Faizliev, Sergei P. Sidorov, Sergei V. Mironov, and
                             Andrey A. Khomchenko

                             Saratov State University, Russia
                                faizlievar1983@mail.ru



        Abstract. Index tracking strategy is a passive financial strategy aiming
        at replication of a given index or portfolio return. In this study, a solution
        for the problem of index tracking is regarded with account of cardinality
        constraint, i. e. with a restriction on the maximum amount of assets held
        in the portfolio. The article discusses different algorithms to solve this
        problem in l2 -norm, specifically, greedy algorithm, differential evolution
        algorithm, and LASSO-type algorithm. For the empirical analysis we
        used public data relating to the three major market indices: Hang Seng
        (Hong Kong), S&P 100 (USA) and Nikkei 225 (Japan). For comparative
        analysis of greedy algorithm with LASSO-type algorithm and differential
        evolution algorithm stochastic dominance principle was used. At that,
        the comparison of the approaches included in-sample data as well as
        out-of-sample data.

        Keywords: index tracking; decision making, portfolio optimization,
        greedy algorithms, differential evolution algorithms, stochastic domi-
        nance


1     Introduction
                                                               Pn               1/q
For any q > 0 and x = (x1 , . . . , xn )T ∈ Rn , let kxkq := ( i=1 |xi |q )         and
kxk0 = limq→0+ kxkq = (the number of non-zero elements of x). If q ≥ 1 then
kxkq denotes lq -norm of x ∈ Rn . Let n be the number of investable assets. Denote
rti the return of asset i at time t, 1 ≤ i ≤ n, 1 ≤ t ≤ m, R = (rti ) is the m × n
matrix. A portfolio is defined to be a vector of weights, x = (x1 , . . . , xn )T ∈ Rn .
We do not allow the portfolio changes over time and do not take into account
transaction costs. We will assume that

 1. one unit of capital is available, i. e. xT 1n = 1, where 1n denotes the vector
    from Rn in which every component is equal to 1;
 2. short selling is allowed, i. e. weights xi can be negative.
?
    The results were obtained within the framework of the state task of RFBR (grant
    14-01-00140).
    Let It be the index return at time t, 1 ≤ t ≤ m, and I = (I1 , . . . , It )T ∈ Rm .
In the traditional index tracking optimization, the objective is to find a port-
folio which has minimal tracking error variance, the sum of squared deviations
between portfolio returns and market index returns (see e. g. [20]):

                                   1
                    x∗ = arg min     kI − Rxk22 s. t. xT 1n = 1.                   (1)
                                   m

    It should be noted that the standard Markovitz model is a special case of
index tracking portfolio model (1) (see, for example [4, 25]). Since the problem
(1) is the problem of convex optimization, it can be easily solved by Lagrange
method.
    Instead of squared deviations, the absolute error is also presented in [9, 13,
14, 18, 19, 21, 23, 28] and is used in practice.
    In this paper we will examine three algorithms for solving the problem (1)
with the cardinality constraint:

                          1 T
           x∗ = arg min     kI 1n − Rxk22 s. t. xT 1n = 1, kxk0 ≤ K,               (2)
                          m

where K is the limit on the number of assets in the portfolio with non-zero
weights. It is supposed that K is substantially smaller than n, K  n.
    Index tracking problem with cardinality constraint is NP-hard problem and
it usually requires the development of heuristic algorithms such as genetic algo-
rithms and differential evolution algorithm [1,8–12,16]. These algorithms, though
providing a sufficiently accurate solution of the problem, are associated with sig-
nificant time costs connected with algorithm operation. Good reviews can be
found in [1, 5, 17]. Greedy algorithms also proved their effectiveness [7, 25]. On
the other hand, greedy algorithms does not necessarily yield an optimal solution.
    In section 2 we describe three algorithms for solving the problem (2):

 – the greedy algorithm (GA),
 – the LASSO-type algorithm (LASSO),
 – the differential evolution algorithms (DE).

    Therefore, our comparative analysis will be based on algorithms from three
different classes of algorithms: GA is a trajectory method, DE-algorithm is a
population-based method and LASSO-type algorithm is method with the cardi-
nality constraint relaxation.
    The greedy algorithm presented in this paper uses the adaptation of ideas
of [25]. In section 3, using a technique for comparative analysis based on both first
order stochastic dominance and second order stochastic dominance principles,
we compare the performance of three different portfolios obtained by the three
algorithms for index tracking problem (2).
2      Algorithms for Minimization of Index Tracking Error
       in l2 -norm with the Cardinality Constraint

2.1     Greedy algorithm for l2 -norm minimization with regularization

Let N = {1, . . . , n} be the index set of investable assets. Problem (2) is the
special case (with τ = 0) of the following problem

             x∗ = arg min kI − Rxk22 + τ kxk22 s. t. xT 1n = 1, kxk0 ≤ K,                  (3)
                          x


where τ is a positive parameter. The regularization term τ kxk22 allows to use the
least squares estimator even in the case of multicollinearity in the matrix R [25].
The model with τ = 0 was examined in [6, 27]. The greedy algorithm for solving
the problem (3) in l2 -norm was studied in [25]. The algorithm at each step adds
to our portfolio the asset that is the closest to the index. The process continues
until we reach the cardinality K.
    Let Mk ⊂ N be the set of indices corresponding to k non-zero elements of x.
Let ReM be a submatrix of R with dimension (m × |Mk |). Then the problem (3)
        k
with xi = 0 for i ∈ N \ Mk can be rewritten as

          e∗ = arg min kI − R
          x                 eM x
                              k
                                ek22 + τ ke
                                          xk22 s. t. x              e ∈ R|Mk | .
                                                     eT 1|Mk | = 1, x                      (4)
                     x
                     e


Denote f τ (Mk ) := kI − R
                         eM x
                           k
                             e∗ k22 + τ ke
                                         x∗ k22 . The optimal solution of the prob-
lem (4) can be obtained by Lagrange method:

                                                −1 e T
                         eτMk = (R
                         x       eT R
                                  Mk Mk + τ Ek )
                                    e             (RMk I − λek ),                          (5)
                                                                            −1 eT
                                                         1T   eT e
                                                          k (RMk RMk +τ Ek )  RM I−1
where Ek is the (k × k)-identity matrix and λ =               T
                                                             1 (R
                                                                e T
                                                                                  k
                                                                    eM +τ Ek )−1 1k
                                                                    R
                                                                                     .
                                                             k   Mk    k




    Algorithm 1: Greedy Algorithm in l2
      begin
         · Let M0 = ∅ and k = 1. Set f τ (M0 ) to be sufficiently large;
         · while k ≤ K do
                                      eτMk−1 ∪{s} using (5);
             · ∀s ∈ N \Mk−1 calculate x
                       ∗
             · Select s = arg min f τ (Mk−1 ∪ {s}) and set x      eτMk = x
                                                                         eτMk−1 ∪{s∗ } ;
                               s∈N \Mk−1
             · Set Mk = Mk−1 ∪ {s∗ } and k = k + 1;
         · Set xτG = x
                     eτMK and MG
                               τ
                                 = MK ;
                    τ        τ
         · return xG and MG ;
      end
2.2   LASSO-type Algorithm
Minimization of l1 penalized objective functions can have a sparsifying effect
that has long been observed in research and practice. Minimizing l1 norm is now
a widely used technique for obtaining sparse solutions [4]. The paper [4] uses
LASSO-type approach when the problem (1) is reformulated as a constrained
least-squares regression problem. Let us consider the problem
                xδ = arg min kxk1 s. t. kI − Rxk2 ≤ δ, xT 1n = 1,                       (6)
where δ is a scalar, which is assumed to be selected so that the true vector is
feasible with high probability. The problem of the type (6) without the constraint
xT 1n = 1 is called LASSO regression [26]. LASSO-type estimator is generally
able to accurately estimate nearly sparse vectors. Effective algorithms for sparse
recovery applications problem were developed in the paper [3]. For numerical
solution of the problem (6) we used the set of Matlab templates TFOCS, that
accompany the paper [3].
    The number of assets in the portfolio with non-zero weights (i. e. cardinality
K) of the optimal solution of the problem (6) depends on parameter δ. Bigger
(smaller) values of δ correspond to smaller (bigger) values of K.

2.3   Differential evolution algorithm for l2 -norm minimization
A recent addition to the class of heuristics is the evolutionary method of differ-
ential evolution (DE) proposed by [24].
    In our work to solve the problem (2), we use the algorithm of differential
evolution. DE algorithm is one of the possible “continuous” modifications of
standard genetic algorithms. At the same time, this algorithm has one essential
feature that largely determines its properties. The “inner” random number gen-
erator is used as a source of external noise, and implemented as the difference
between random vectors of the selected population.
    Let N be the number of portfolios in each population. The initial population
P of portfolios xi = (xi0 , . . . , xin )T ∈ Rn , i = 1, . . . , N , is obtained as follows.
First, we randomly generate N vectors y i from Rn . For each y i we set to zero
n − K elements yji of vector y i that are closest to zero (the cardinality of y i
                                             Pn
becomes K) and then we set xij = yji /( s=1 ysi ) to gain restriction xT 1n = 1.
    Portfolios xi refer to the point of n-dimensional space which defines the
                     1
objective function m   kI − Rxk2 that we want to minimize. At each iteration,
the algorithm produces a new generation of portfolios (population) randomly
combined from portfolios of the previous generation. Portfolios of new generation
is generated in a few steps. First, for each i = 1, . . . , N we randomly select three
different portfolios xa , xb , xc among the portfolios of the previous generation.
Then, we calculate
                          x̃ij = xaj + (F + z1 )(xbj − xcj + z2 )

where x̃ij , xaj , xbj , xcj are j-th components of x̃i , xa , xb , xc vectors respectively.
Parameter F is a positive real constant from the interval [0, 2], which manages
the increasing influence of the difference xbj − xcj + z2 in the result vector, z1
and z2 are either equal to zero with small probabilities (for example, 0.001 and
0.002, respectively), or they are normally distributed random variables with
mean zero, and a small standard deviation (i. e. 0.002). Then we set to zero
                       i                        i
n − K elements  Pnof x̃ i (the cardinality of x̃ must be equal to K), and after it we
      i     i
set x̃j = x̂j /( s=0 x̃s ) to fulfil the budget constraint.
    Parameters z1 and z2 are optional parameters of differential evolution algo-
rithm; they make “noise” in the calculation of the resulting vector which helps
to avoid falling into local extremes.
    Component x̂ij of vector x̂i replaces xij with probability π and the portfolio
  i
x̂ goes into the next generation if the following conditions are satisfied:

                            kI − Rx̂i k2 < kI − Rxi k2 .                         (7)


    Evolution of the population corresponds to the dynamics of a “swarm of
midges” (i. e. random point clouds). The cloud is moving along the relief of
optimized function, repeating landscape features. In the case of falling, it takes
the shape of the ravine and the points distribution is such that the expectation
of the difference between two random vectors is directed along the long side of
the ravine. This provides rapid movement along the narrow ravines. In similar
conditions the gradient methods have vibrational dynamics “from wall to wall”.
                          1
    The pseudo-code for m   kI − Rxk2 -minimization using differential evolution
algorithm is shown below.


3     Empirical Results
3.1   Data description
In our empirical analysis we use publicly available data relating to three major
market indices, that can be obtained from the OR-Library of [1, 2]. The three
market indices are the Hang Seng (Hong Kong, n = 31), DAX 100 (Germany,
n = 85) and the Nikkei 225 (Japan, n = 225) for m = 290 time periods each
(weekly data), taken from [1]. The summary statistics of the daily log-returns of
the indices are presented in Table 1. Table 1 shows that the return time series
exhibit the typical patterns of financial times series: mean values around zero,
light asymmetry and fat tails.
    The data used in this paper is given in the form of matrices of asset prices.
We transformed the original data sets into matrices of asset returns. It is widely
accepted to use of the price ratio in order to derive the rate of returns, instead
of using absolute asset price relations.
    The index tracking problem with cardinality constraint were implemented
using Matlab software, as well as built-in and specially developed functions. All
simulations were run in Matlab. The system runs under MS Windows 10 64-bit
and in our computational work we used an AMD FX-8350 pc with a 4.00 GHz
processor and 8.0 GB RAM.
 Algorithm 2: Differential evolution algorithm in l2
      begin
         · Generate N randomly distributed y i ∈ Rn , i = 1, . . . , N ;
         · ∀i, set yji = 0 for n − K closest to 0 values of yji ;
         · ∀i, set initial population P as xij = yji /( n           i
                                                             P
                                                               s=1 ys ), j = 1, . . . , n;
         · set L to the number of iterations;
         · while t ≤ L do
              for each xi , i = 1, . . . , N , from P do
                  · select 3 random vectors xa , xb , xc ;
                  for each j of xij do
                       · with probability π1 : z1,j ← N (0, σ1 ), else z1,j = 0;
                       · with probability π2 : z2,j ← N (0, σ2 ), else z2,j = 0;
                       · uj ← U (0, 1);
                       · if uj ¡ 1 − π then x̃ij = xij ;
                       · else x̃ij = xaj + (F + z1,j )(xbj − xcj + z2,j );
                       · ∀x̃i , set x̃ij = 0 for n − K closest to 0 elements;
                       · ∀x̃i ∈ P , set x̂i = x̃i / s x̃is ;
                                                   P

                       · if conditions (7) are satisfied then x̂i replaces xi in P ;
                      ∗
                                1
         · search xi = arg mini m kI − Rxi k2 ;
                     ∗
         · return xi ;
      end


                 Table 1. Summary statistics of the indices’ weekly returns

                  Data set        n    m mean, % std, % skew kurt min max
                  1 Hang Seng 31 290          0.42    3.32 -0.04 3.85 -0.12 0.11
                  2 DAX 100 85 290            0.25    2.03 -0.21 3.72 -0.07 0.07
                  5 Nikkei 225 225 290       -0.01    2.85 0.44 4.85 -0.11 0.12



3.2      Comparison using stochastic dominance approach

Description of stochastic dominance principle. To compare the approaches
examined in this study we used stochastic dominance principle enabling a choice
to be made in favour of one or another method (portfolio) [15]. Its peculiarity
lies in the fact that it does not require precise knowledge of the investor’s utility
function, it only has to be monotonous and not decreasing [22].
     Before proceeding to the definition of stochastic dominance, let us consider
the concept of a dominant portfolio. The portfolio is considered to be dominant,
i. e. preferred over other portfolio, if it has a higher level of return at the same
level of risk or lower risk for the same expected return than another portfolio.
     In this study, we will use stochastic dominance of first and second order for
the analysis of the resulting portfolios. A more detailed description can be found
in the book [19].
    Let us denote by F1 and F2 distribution functions of random variables (port-
folio returns) X1 and X2 accordingly. If on the interval [a, b] (random variable
support) the inequality F1 ≤ F2 is satisfied, i. e. return distribution function of
one portfolio does not exceed return distribution function of the other portfolio,
we can say that there is stochastic dominance of the first order of one portfolio
over the other (or random variable X1 over X2 ), and denote X1 I X2 .
    However, the situation may not be unambiguous, i. e. return distribution
functions of the first and second portfolios may overlap. Therefore we can not
say with certainty which of the portfolios is preferable for the investor. In this
case the evaluation can be performed on the basis of stochastic dominance of
the second order.
    We say that this is the case of stochastic dominance of the second order of
random variable X1 over X2 and use the notation X1 II X2 , if
                       Z x            Z x
                           F1 (y)dy ≤     F2 (y)dy, x ∈ [a, b].
                        a               a
    Thus, the criterion of stochastic dominance of the second order is based not
on the comparison of portfolio return distribution functions but on the integrals
of these functions, i. e. areas under distribution functions. It can also be said that
the first portfolio is preferable to the second if the cumulative distribution func-
tion of its return never exceeds, and at least in one case is less than cumulative
distribution function of the second portfolio.
    As follows from the second-order dominance, dominance of the first order
of one portfolio over the other automatically assumes its stochastic dominance
of the second order as well. Thus, the condition of the second-order stochastic
dominance is a weaker condition.
    In summary, we can note that in comparison with other methods of assess-
ment stochastic dominance gives the investor a more general approach to the
assessment of risky portfolios.
    We determine the optimal model using a window of 100 observations (weeks)
and leave it intact for further 10 out-of-sample trading weeks for testing purposes.
Then, this (in-sample) window is shifted forward by 10 weeks, and a new portfolio
of solution for index tracking problem is determined using a window of the new
100 observations, and then again it is left unchanged for further 10 out-of-sample
weeks, and so on. Thus, portfolios are recalculated once every 10 weeks. It should
be noted that the comparison of models for in-sample data was held on the last
10 observations (in-sample) of the 100-observations window. It was done so to
ensure that in-sample and out-of-sample samples were of the same dimension.
    To compare these approaches in terms of stochastic dominance, all in-sample
and out-of-sample data were joined. As a result, two time series of returns had
been received with respect to the index of 190 weeks in length, respectively for
in-sample and out-of-sample data. Stochastic dominance principle was used for
such time series.

Comparative analysis of greedy algorithm and LASSO-type algorithm.
Table 2 shows comparative results of stochastic dominance for greedy algorithm
and LASSO-type algorithm. It should be noted that stochastic dominance of the
first order was not observed for all samples. While stochastic dominance (of the
second order) was observed only for 4 out of 6 out-of-sample data (S&P 100 and
Nikkei 225 for both indicators δ) in favour of greedy algorithm.


Table 2. Comparison results of LASSO-type (abbrev. L) and the greedy algorithm
(abbrev. G) for three data sets

                              δ = 0.9                    δ = 0.25
                      In-Sample Out-of-Sample    In-Sample Out-of-Sample
         Hang Seng         5 to 10 stocks             12 to 17 stocks
                         —              —           —               —
          S&P 100          8 to 16 stocks             19 to 34 stocks
                         —         G II L          —          G II L
         Nikkei 225       10 to 20 stocks            31 to 40 stocks
                         —         G II L          —         G II L



    For clarity, Fig. 1(a) demonstrates the comparison of return distribution
functions for portfolios built on greedy algorithm and LASSO-type algorithm for
out-of-sample data Nikkei 225 (δ = 0.25). The graph shows that the probability
distribution functions overlap, i. e. stochastic dominance of the first order is not
possible. Fig. 1(b) represents the comparison of cumulative distribution functions
for the same data. The graph shows that accumulated distribution function
of greedy algorithm is always lower than accumulated distribution function of
LASSO-type algorithm, i. e. we have stochastic dominance of the second order,
XGreedy II XLASSO .
    The following two figures 1(c) and 1(d) show an example for out-of-sample
data Hang Seng (δ = 0.25), when we can not claim the presence of stochastic
dominance. Namely, both probability distribution functions and cumulative re-
turn distribution functions of the portfolios built using greedy algorithm and
LASSO-type algorithm overlap.


Comparative analysis of greedy algorithm and differential evolution
algorithm. Table 3 represents comparative results of stochastic dominance
for greedy algorithm and differential evolution algorithm. It can be seen that
stochastic dominance of the first order was not observed in all samples. If we
compare the algorithms in terms of stochastic dominance of the second order,
we will see that preference is given to differential evolution algorithm (in 1 out
of 6 in-sample data, and in 3 out of 6 out-of-sample data the portfolio built on
differential evolution algorithm stochastically dominates over (the second order)
portfolio built on greedy algorithm). While inverse stochastic dominance is ob-
served only in 1 case for in-sample data, and in 1 case for out-of-sample data
   1                                                           1.2 · 10−2
                  LASSO                                                                LASSO
                  Greedy                                        1 · 10−2               Greedy
 0.8

                                                                8 · 10−3
 0.6
                                                                6 · 10−3
 0.4
                                                                4 · 10−3

 0.2
                                                                2 · 10−3


   0                                                                   0
            -2        -1    0          1       2        3                            -1.5   -1    -0.5          0   0.5   1      0.15
                                 (a)                   ·10−2                                             (b)                  ·10−2

   1
                  LASSO                                        1.5 · 10−2              LASSO
                  Greedy                                                               Greedy
 0.8


                                                                1 · 10−2
 0.6


 0.4
                                                                5 · 10−3

 0.2


   0                                                                   0
            -2        -1    0          1       2        3                            -1.5   -1    -0.5          0   0.5   1      0.15
                                 (c)                   ·10−2                                             (d)                  ·10−2

  1
                                                                2.5 · 10−2
                  DE                                                                    DE
                  Greedy                                                                Greedy
 0.8
                                                                  2 · 10−2


 0.6                                                            1.5 · 10−2


 0.4                                                              1 · 10−2


 0.2                                                              5 · 10−3


  0                                                                         0
       -4        -3    -2   -1         0   1       2       3                    -4     -3    -2     -1          0    1    2       3
                                 (e)                   ·10−2                                              (f)                 ·10−2



 Fig. 1. Return distribution functions and cumulative return distribution functions



(Nikkei 225 for K = 20). It is possible that differential evolution algorithm in
this case was worse due to insufficient number of operations, or populations,
which would allow it to produce a more accurate solution.
    By way of illustration, Fig. 1(e) demonstrates the comparison of cumula-
tive return distribution functions of the portfolios built on greedy algorithm and
differential evolution algorithm for out-of-sample data Nikkei 225 (K = 5). The
Table 3. Comparison results of the differential evolution (abbrev. DE) and the greedy
algorithm (abbrev. G) for three data sets

                                K=5                         K = 20
                      In-Sample Out-of-Sample      In-Sample Out-of-Sample
         Hang Seng DE II G         DE II G          —             —
          S&P 100    —                —               —          DE II G
         Nikkei 225  —              DE II G       G II DE      G II DE



graph ahows that probability distribution functions overlap, i. e. there is no first-
order stochastic dominance. The following figure 1(f) shows comparison of accu-
mulated distribution functions for the same data. The graph demonstrates that
accumulated distribution function of greedy algorithm is always higher than ac-
cumulated distribution function of differential evolution algorithm, i. e. we have
stochastic dominance of the second order (XDE II XGreedy ).


4   Conclusion

Summing up the results of comparison of greedy algorithm and LASSO-type
algorithm, we should note that in most cases we can not give preference to one
or the other portfolio. However, for 4 out of 12 data sets yet there was the second-
order stochastic dominance in favour of greedy algorithm. Also it is important
to note that greedy algorithm is significantly easier to implement, its time costs
are small and it easily copes with cardinality as compared to the LASSO-type
algorithm.
    As for greedy algorithm and differential evolution algorithm comparison, it
should be said that the portfolios built on greedy algorithm and differential evo-
lution algorithm are not significantly different in the selection of assets, and, as
a rule, without using short sales. Moreover, though the portfolios built on differ-
ential evolution algorithm as a whole stochastically dominate over (the second-
order) portfolios built on greedy algorithm, greedy algorithms significantly sur-
pass differential evolution algorithms in terms of ease of implementation and
algorithm execution time.


References

 1. Beasley, J., Meade, N., Chang, T.J.: An evolutionary heuristic for the index track-
    ing problem. European Journal of Operational Research 148(3), 621–643 (2003)
 2. Beasley, J.E.: Or-library: distributing test problems by electronic mail. Journal of
    the Operational Research Society pp. 1069–1072 (1990)
 3. Becker, S., Candès, E.J., Grant, M.C.: Templates for convex cone problems with
    applications to sparse signal recovery. Math. Program. Comput. 3(3), 165–218
    (2011)
 4. Brodie, J., Daubechies, I., De Mol, C., Giannone, D., Loris, I.: Sparse and stable
    Markowitz portfolios. Proceedings of the National Academy of Sciences of the USA
    106(30), 12267–12272 (2009)
 5. Canagkoz, N., Beasley, J.: Mixed-integer programming approaches for index track-
    ing and enhanced indexation. European Journal of Operational Research 196(1),
    384–399 (2008)
 6. Chang, T.J., Meade, N., Beasley, J., Sharaiha, Y.: Heuristics for cardinality con-
    strained portfolio optimisation. Computers and Operations Research 27(13), 1271–
    1302 (2000)
 7. Das, A., Kempe, D.: Submodular meets spectral: Greedy algorithms for subset
    selection, sparse approximation and dictionary selection. In: Getoor, L., Scheffer,
    T. (eds.) Proceedings of the 28th International Conference on Machine Learning
    (ICML-11). pp. 1057–1064. ACM, New York, NY, USA (June 2011)
 8. Derigs, U., Nickel, N.H.: Meta-heuristic based decision support for portfolio op-
    timization with a case study on tracking error minimization in passive portfolio
    management. OR Spectrum 25, 345–378 (2003)
 9. Gilli, M., Kellezi, E.: Financial Engineering, E-Commerce, and Supply Chain, chap.
    The threshold accepting heuristic for index tracking, pp. 1–18. Kluwer Academic,
    Dordrecht (2002)
10. Gilli, M., Winker, P.: Handbook of Computational Econometircs, chap. Heuristic
    optimization methods in econometrics, pp. 81–120. Wiley: Chichester (2009)
11. Jeurissen, R., van den Berg, J.: Index tracking using a hybrid genetic algorithm.
    In: Computational Intelligence Methods and Applications, 2005 ICSC Congress
    on. pp. 1–6 (2005)
12. Jeurissen, R., van den Berg, J.: Optimized index tracking using a hybrid genetic
    algorithm. In: Evolutionary Computation, 2008. CEC 2008. (IEEE World Congress
    on Computational Intelligence). IEEE Congress on. pp. 2327–2334 (June 2008)
13. Le Thi, H.A., Moeini, M.: Long-short portfolio optimization under cardinality con-
    straints by difference of convex functions algorithm. J. Optim. Theory Appl. 161(1),
    199–224 (Apr 2014)
14. Li, Y., Yang, X., Zhu, S.S., Li, D.H.: A hybrid approach for index tracking with
    practical constraints. Journal of Industrial and Management Optimization 10(3),
    905–927 (2014)
15. Linton, O., Maasoumi, E., Whang, Y.: Consistent testing for stochastic dominance
    under general sampling schemes. Review of Economic Studies 72(3), 735–765 (2005)
16. Maringer, D., Oyewumi, O.: Index tracking with constrained portfolios. Intell. Syst.
    Account., Finance Mgmt 15, 57–71 (2007)
17. Maringer, D.: Portfolio Management with Heuristic Optimization (Advances in
    Computational Management Science). Springer-Verlag New York, Inc., Secaucus,
    NJ, USA (2006)
18. Peng, Z.: The optimisation on the multi-period mean-average absolute deviation
    portfolio selection in friction market. International Journal of Intercultural Infor-
    mation Management 2(4), 343–352 (2011)
19. Prigent, J.L.: Portfolio Optimization and Performance Analysis. Chapman &
    Hall/CRC, Boca Raton (2007)
20. Roll, R.: A Mean/Variance Analysis of Tracking Error. The Journal of Portfolio
    Management 18(4), 13–22 (1992)
21. Rudolf, M., Wolter, H.J., Zimmermann, H.: A linear model for tracking error min-
    imization. Journal of Banking & Finance 23(1), 85–103 (1999)
22. Scaillet, O., Topaloglou, N.: Testing for stochastic dominance efficiency. Journal of
    Business & Economic Statistics 28(1), 169–180 (2010)
23. Sidorov, S., Faizliev, A., Khomchenko, A.: Algorithms for l1-norm minimisation
    of index tracking error and their performance. Int. J. Mathematics in Operational
    Research X(Y), XX–XX (2016)
24. Storn, R., Price, K.: Differential evolution – a simple and efficient heuristic for
    global optimization over continuous spaces. Journal of Global Optimization 11(4),
    341–359 (1997)
25. Takeda, A., Niranjan, M., ya Gotoh, J., Kawahara, Y.: Simultaneous pursuit of out-
    of-sample performance and sparsity in index tracking portfolios. Computational
    Management Science 10(1), 21–49 (2013)
26. Tibshirani, R.: Regression shrinkage and selection via the lasso. Journal of the
    Royal Statistical Society (Series B) 58, 267–288 (1996)
27. Woodside-Oriakhi, M., Lucas, C., Beasley, J.E.: Heuristic algorithms for the car-
    dinality constrained efficient frontier. European Journal of Operational Research
    213(3), 538–550 (2011)
28. Zhang, P., Zhang, W.G.: Multiperiod mean absolute deviation fuzzy portfolio se-
    lection model with risk control and cardinality constraints. Fuzzy Sets and Systems
    255, 74–91 (2014)