Clustering underlying stock trends
           via non-negative matrix factorization

                Andrea Pazienza1 , Sabrina Francesca Pellegrino2 ,
                    Stefano Ferilli1 , and Floriana Esposito1
                 1
                     Dipartimento di Informatica – Università di Bari
                 2
                     Dipartimento di Matematica – Università di Bari
                              {name.surname }@uniba.it


       Abstract. Building a diversified portfolio is an appealing strategy in
       the analysis of stock market dynamics. It aims at reducing risk in market
       capital investments. Grouping stocks by similar latent trend can be cast
       into a clustering problem. The classical K-Means clustering algorithm
       does not fit the task of financial data analysis. Hence, we investigate
       Non-negative Matrix Factorization (NMF) techniques which, contrary
       to K-Means, turn out to be very effective when applied to stock data.
       In particular, recently developed NMF techniques, which incorporate
       convexity constraints, generate more disjoint latent trend groupings than
       the traditional sector-based groupings. In this paper, the NMF technique
       and its variants are applied to NASDAQ stock data (i.e., daily closing
       prices). Experimental results confirm that (convex ) NMF techniques are
       highly recommended to produce trend based assets and build a good
       diversified portfolio.


1    Introduction
A trader’s purpose is to beat the market and, then, to make money. To achieve
this objective, the trader should be able to predict future stock prices. In this way,
he could determine a self-financing trading strategy that maximizes his portfolio
return [5]. However, because of randomness in the market, creating and manag-
ing successful portfolios of financial assets is a difficult practice. Diversification
theory is the most widely used practice by individuals to develop portfolios. It
is based on the principle of attempting to maximize expected return for a given
amount of risk or, equivalently, to minimize the risk for a given amount of re-
turn [9]. It is one of the most effective ways to get low risk-reward ratios. This
problem can be seen as a clustering process, in which the aim is grouping data
(e.g., stocks) into subgroups of similar behavior (e.g., market trend).
    Clustering is arguably one of the most common steps in unsupervised behav-
ior analysis. With K-Means [8], a classical clustering algorithm, it is not possible
to establish the effectiveness and coherence of the clusters when dealing with
stock data. Therefore, more powerful analysis techniques are required. Matrix
decomposition strategies can overcome this problem, in fact, they can provide
ways to produce cleaner data which may lead to a better interpretation of the
results. Since the closing prices of stocks are definitely non-negative signals, it
makes sense to apply Non-negative Matrix Factorization (NMF) on them.
    In this paper, we use NMF and its variants to learn the components which
drive the stock market, and to construct a diversification method using cluster
analysis of financial assets. We compare NMF with its two variants, Convex NMF
(C-NMF) and Convex-Hull NMF (CH-NMF), obtained by imposing orthogonal
and convex constraints. We investigate the impact of these constraints in real
stock return data.
    Finally, we conclude that CH-NMF yields a more accurate and disjoint repre-
sentation of the data, and allows a better interpretation of clustering. Moreover,
CH-NMF is very useful because it converges in only one iteration, with very low
runtime and, most important, achieving a very small error in Frobenius norm.
    This paper is organized as follows. The next section recalls useful background
information, including related works. Then, Section 3 introduces the mathemat-
ical aspects of NMF, its variants C-NMF and CH-NMF. Section 4 describes
experimental results and evaluates the difference between the methods in terms
of number of iterations, error and clustering representation. Finally, Section 5
concludes the paper.


2   Background and Related Work

Portfolio Diversification is the process of choosing investments in order to reduce
exposure to a particular asset. This is typically done by investing in a variety
of assets, because, if the stock prices do not move together, then a diversified
portfolio of assets will have lower variance than the weighted average variance of
the assets. A straightforward diversification breaks the stock market into several
classic sectors according to the primary activities of a company (such as Basic
Materials, Technology, Financial, Health Care, etc.).
    As the stock evolution is comparable to a stochastic process, stock prices
are determined by fluctuations in underlying or latent trends, which can be
modeled by a Brownian motion [5]. Therefore, stocks in the same sector may
not show similar behavior in the market. So, albeit the sector-based strategy
is simple enough to apply, the portfolio will not ensure the maximum return.
Hence, if one were able to identify and predict the underlying trends from the
stock market data, one would have the opportunity to leverage this knowledge to
obtain genuine portfolio diversification opportunities. In other words, investors
should diversify their money into not only different sectors, but also different
trends (e.g. clusters).
    Unfortunately, the K-Means clustering algorithm, has still some limitations in
the exploitation of financial data. Indeed, in [1] is stated that K-Means clustering
tends to find spherical clusters so that centroid-based clustering does not handle
the noise. Hence, the authors aimed to discover other centroid-based clustering
approaches for financial datasets and to introduce weighted Euclidean distance
instead of standard Euclidean distance to re-evaluate centroid-based clusters, as
to overcome the limitations of K-Means.
    Matrix decompositions, especially NMF, are used in the literature for the
analysis of financial data. In particular, [4] applied a constrained NMF to real
stock data and found that there is a tradeoff between smoothness of trend and
sparsity of the weight matrix. [2] provides a contribution to the Diversification
Theory by comparing Semi-NMF with Sparse-semiNMF applied to a diffusive
model based on the Black&Scholes equation for option pricing. It is deduced
that Sparse-semiNMF outperforms semi-NMF because it better reduces the risk
related to the portfolio selection. Using the multiplicative update rules algo-
rithm, [7] analyzes the behavior of latent trends for different value of the number
of the latent forces and shows that the increase in the underlying force does not
affect the trends of the original forces. In [11] a variant of Sparse-semiNMF with
sum-to-one and smoothness constraints is applied to the portfolio diversification
problem, and results show a disjoint data representation that allows a better
understanding of stock properties.
    In our setting, assume the market is made up of m stocks S1 , S2 , . . . , Sm ;
each stock Si is stored as a row vector whose entries are n daily closing prices.
Suppose there are k latent bases, W1 , W2 , . . . , Wk ; each Wj is a n-dimensional
row vector, which can be thought as a Brownian motion. So it is possible to
express each stock as linear combination of these bases:
                                        k
                                        X
(1)                              Si =         hij Wj ,
                                        j=1

where hij is a non negative real number and indicates the association degree of
the i-th stock with the basis Wj . With matrix notation, (1) can be expressed as

(2)                               S+ = H+ W± ,

where S ∈ Rm×n
             +    , H ∈ R+ m×k
                                (weight matrix) and W ∈ Rk×n ±  (trend matrix).
This strongly recalls the non-negative matrix factorization formulation.


3     Non-Negative Matrix Factorization
The standard definition for non-negative matrix factorization of a matrix S is

(3)                                 S = H W,

where S ∈ Rm×n , H ∈ Rm×k and W ∈ Rk×n , and k ≤ m. Both W and H
must contain only non-negative entries. W is the matrix of factors and H is the
mixing matrix.
   According to [6], each data point, which is represented as a row in S, can
be approximated by an additive combination of the non-negative basis vectors,
which are represented as rows in W , weighted by the components of H. Matrices
H and W are found by solving the optimization problem

(4)                             min kS − H W k2F ,
                               H≥0,
                               W ≥0
where k · kF is the Frobenius norm. The algorithm is expressed in terms of a pair
of update rules that are applied alternately:
                     X Sik                            X         Skj
          Hij = Hij              Wjk ,     Wij = Wij     Hki          .
                        (H W )ik                             (H W )kj
                     k                                  k

Matrices H and W are initialized at random. Various variants and improvements
to NMF have been introduced in recent years [3, 10].

Convex NMF Convex non-negative matrix factorization (C-NMF) [3] allows
the data matrix S to have mixed signs. It minimizes kS − S H W k2F subject to
the convex constraint kHi k1 = 1, H ≥ 0, where S ∈ Rm×n , H ∈ Rn×k and
W ∈ Rk×n . Matrices H and W are updated iteratively, until convergence, using
the following update rules:
                               s
                                  (Y + W )ik + (Y − H W T W )ik
                     Hik = Hik
                                  (Y − W )ik + (Y + H W T W )ik
                                s
                                   (Y + H)ik + (W H T Y − H)ik
                     Wik = Wik
                                   (Y − H)ik + (W H T Y + H)ik
where Y = S T S, and matrices Y + and Y − are given by Yik+ = 21 |Yik | + Yik and
                                                                             

Yik− = 12 |Yik | − Yik respectively. Matrices H and W are initialized at random.
                      

    The convex constraint imposed on H has the advantage that one might in-
terpret the rows of H as weighted sums of certain data points. This means that
these rows can be interpreted as centroids. Moreover, C-NMF solutions are gen-
erally significantly more orthogonal than Semi-NMF solutions.

Convex-Hull NMF Massive datasets are likely to capture even very rare as-
pects of the problem at hand. Along this line, [10] recently introduced a data-
driven Convex NMF approach, called Convex-Hull NMF (CH-NMF), that is fast
and scales extremely well: it can efficiently factorize huge matrices and in turn
extract meaningful “clusters” from massive datasets. The key idea is to restrict
the clusters to be combinations of vertices of the convex hull of the dataset; this
allows to directly explore the data itself to solve the convex NMF problem.
     We consider a factorization of the form S = S H W where S ∈ Rm×n , H ∈
  n×k
R       and W ∈ Rk×n . We further restrict the rows of H and W to convexity,
i.e., kHi k1 = 1 H ≥ 0,       kWj k1 = 1 W ≥ 0.
     In contrast to C-NMF, we consider a convex combination on both H and W .
The task now is to minimize
(5)    kS − S H W k2F ,   s.t. kHi k1 = 1,    H ≥ 0,    kWj k1 = 1,   W ≥ 0.
    This optimization problem is equivalent to projecting the solution in the con-
vex hull of S. Convexity constraints yield latent components with some proper-
ties: first, any data point can be expressed as a convex and meaningful combi-
nation of these basis vectors; this offers interesting new opportunities for data
interpretation. Second, they span a simplex that encloses most of the remaining
data. CH-NMF aims at a data factorization based on the data points residing
on the data convex hull. Therefore, CH-NMF seeks an approximate solution by
subsampling the convex hull, exploiting each data point on the convex hull as a
linear lower dimensional projection of the original data.
    A consequence of the convexity of H and W is that the rows of H tend to
become nearly orthogonal. Requiring orthogonal rows for H produces a non-
correlation between stocks being attracted from different clusters. This indicates
a more accurate clustering and, hence, that the NMF family is competitive with
K-Means for the purposes of clustering financial data. Therefore, the aim of this
paper is to exploit NMF techniques, and in particular the ones with convexity
constraints, in the context of financial data.


4   Evaluations and Comparisons of NMF Techniques

The objective of this work is studying the application of clustering approaches to
determine a trend-based portfolio diversification that is more consistent than the
sector-based one. We ran experiments on stock data gathered from the NASDAQ
Stock Market (http://www.nasdaq.com), and specifically on the closing prices
of 28 stocks in the past 10 years (2518 working days). Table 1 reports the list of
stocks involved in the experiments and their belonging sector. Fig. 1(a) shows
the actual stock prices, useful as a reference to compare the numerical solutions
that we obtain from the different methods. So, we want to identify the subgroups
(clusters) of stocks that show a similar trend. In particular, we want to assess
the performance of NMF, C-NMF and CH-NMF. We try different numbers k
of clusters. Since the considered stocks involve only 8 market sectors overall,
we ran the methods 6 times, one for each k ∈ {3, 4, . . . , 8}. We used a Python
implementation of these methods available at http://pymf.googlecode.com.
    The raw data were preprocessed as log-returns and stored into a matrix S ∈
R28×2518 . In this way the variances of stock data distribution are homogeneous to
allow a better understanding of the graphical results. We compute two matrices
H and W for the given stock data matrix S. H and W are used to identify the k
cluster labels of the stocks: in fact, rows of W regard the cluster centroids while
H is the cluster membership indicator matrix. In other words, sample i is in
cluster j if Hij is the largest value in row Hi,: . The product H W , representing
the reconstruction of S, is a useful way to explicit the difference between the
original stock prices data and the approximated dynamics of transformed data.
    Table 2 reports, for each k, statistics about the results obtained by the NMF
methods: number of iterations for convergence, error estimation in Frobenius
norm, and number of non-empty clusters in the result. We note that, in most
cases, all decomposition methods provide a good reconstruction of data. They
differ in the cost to achieve a good decomposition (the larger the number of
iterations, the more expensive the method). We want to highlight that CH-NMF
is very fast and its estimated Frobenius error is of the same order of magnitude as
the other methods. In fact, even if we set a higher number of maximum iterations,
                         Table 1: Stocks Table
  # Code Company                                     Sector
  1 AA Alcoa Inc.                                    Basic Materials
  2 AIG American International Group, Inc.           Financial
  3 AAPL Apple Inc.                                  Technology
  4 AXP American Express Company                     Financial
  5 BA Boeing Company                                Industrial Goods
  6 CAT Caterpillar, Inc.                            Industrial Goods
  7 DD E.I. du Pont de Nemours and Company           Basic Materials
  8 DIS Walt Disney Company                          Services
  9 GE General Electrics Company                     Conglomerates
  10 HD Home Depot, Inc.                             Services
  11 HON Honeywell International Inc.                Industrial Goods
  12 HPQ HP Inc.                                     Technology
  13 IBM International Business Machines Corporation Technology
  14 INTC Intel Corporation                          Technology
  15 JNJ Johnson & Johnson                           Healthcare
  16 JPM JP Morgan Chase & Co.                       Financial
  17 KO Coca-Cola Company                            Consumer Goods
  18 MCD McDonald’s Corporation                      Services
  19 MSFT Microsoft Corporation                      Technology
  20 MMM 3M Company                                  Conglomerates
  21 MO Altria Group, Inc.                           Consumer Goods
  22 PFE Pfizer, Inc.                                Healthcare
  23 PG Procter & Gamble Company                     Consumer Goods
  24 UTX United Technologies Corporation             Conglomerates
  25 VZ Verizon Communications Inc.                  Technology
  26 WMT Wal-Mart Stores, Inc.                       Services
  27 ABIO ARCA biopharma, Inc.                       Healthcare
  28 AMGN Amgen Inc.                                 Healthcare


                       Table 2: Numerical results
           NMF                    C-NMF                  CH-NMF
k # iter error # clusters # iter error # clusters # iter error # clusters
3 1528 33.7703     3       500 45.7185     2        1 47.5844      2
4 2355 27.1966     4       500 42.4148     2        1 43.9824      4
5 3358 21.0838     5       500 40.2502     2        1 38.6585      4
6 2523 16.9987     4       500 33.5761     2        1 56.4050      4
7 5000 14.6706     6       500 38.6675     2        1 32.5755      5
8 5000 13.5482     7       500 32.1786     2        1 46.2535      5
          (a) Actual stock prices data.                (b) NMF.


                  (c) C-NMF.                         (d) CH-NMF.

            Fig. 1: Original stock data with reconstruction for k = 4


CH-NMF needs only 1 iteration to converge for these particular dataset, as
reported in Table 2. For others techniques involved in the experiments, NMF
requires up to 5000 iterations in the worst case, while C-NMF takes less iterations
than NMF to converge, but it always discovers only two non-empty clusters. This
is very unattractive because, considering only 2 trends will expose investors to
high risks as they would diversify their portfolio making it up with only 2 stocks,
one for each trend. The role of k is to force a representation for the data that
is more compact than its actual form. Assuming a more compact representation
will capture underlying regularities in the data that might be obscured by the
form in which the data is found in matrix S. The target is to achieve a low rank
approximation which ensures a good interpretation of data in terms of clustering
partition, data reconstruction and compactness of representation. Indeed, for a
genuine portfolio diversification, choosing k = 4 represents a fair compromise
between a lower rank approximation and the goal of yielding a good cluster
partition. The reconstruction obtained for k = 4 is shown Fig. 1(b,c,d).
    Now, we focus on the study of W , i.e., the matrix which represents the latent
trends. As shown in Fig. 2, the trend obtained from NMF points out the incre-
ment of fluctuations as k grows up, despite the quality of graphical reconstruction
being good. In Fig. 3 we can see that C-NMF shows too many fluctuations and
does not allow us to compose a good diversified portfolio. Indeed, as shown in
Table 2, for each k, C-NMF provides always only 2 clusters, which reflect the
trends that can be seen in the figures. A reason for this behavior is related to the
fact that C-NMF imposes the convexity constraint only on H. Hence, convexity
for H should be used together with convexity also on W . In fact, CH-NMF is able
to overcome this problem and leads to more regular components (cfr. Fig. 4).
(a) k = 3.          (b) k = 4.          (c) k = 5.


(d) k = 6.          (e) k = 7.          (f) k = 8.

              Fig. 2: Trends for NMF


(a) k = 3.          (b) k = 4.          (c) k = 5.


(d) k = 6.          (e) k = 7.          (f) k = 8.

             Fig. 3: Trends for C-NMF
         (a) k = 3.                  (b) k = 4.                 (c) k = 5.


         (d) k = 6.                  (e) k = 7.                 (f) k = 8.

                           Fig. 4: Trends for CH-NMF


Therefore, we can state that the convexity constraint of H and W provides a
good adjustment: the bases become more disjoint and Frobenius norm decreases
at a speed that depends on the number of iterations. It is important to note that
while all methods try to minimize the same criterion, they impose different con-
straints and thus yield different matrix factors. For example, CH-NMF assumes
W and H to be non-negative matrices and often leads to sparse representations
of the data.
    Another important graphical confirmation of our proposal can be found in
the analysis of colormaps which is a good way to display matrices in scaled colors.
It represents a color-filled table in which every color indicates the weight of each
corresponding matrix component according to a total order relation managed
by a color scale called colorbar. In this way, we can evaluate the components of
greater weight associated with latent trends. More precisely, every Hij indicates
how the i-th stock is related to the row basis Wj . Also in this representation, we
can observe the degree of belonging of each data point to a cluster by selecting,
for each row, the highest element in the colormap. In the case of NMF (see


              (a) NMF.              (b) C-NMF.            (c) CH-NMF.

                           Fig. 5: Colormaps for k = 4
Fig. 5(a)), colormaps display a regular proceeding of data with a high peak in
some rows. This determines the membership of an element in a row to a cluster
in the corresponding column. While, for C-NMF (see Fig. 5(b)) the elements
with the highest color with respect to the colorbar are located in the first or last
column, giving more emphasis to the fact that the resulting clusters are always
2. Regarding colormaps for CH-NMF (see Fig. 5(c)), we can observe a clearer
disjunction of columns, meaning that the resulting clusters are readily visible.


                             Table 3: NMF Clusters
             k=3        k=4      k=5        k=6       k=7       k=8
              3 4 5 6 3 5 8 10 12 14 15      3 4 5 6 3 4 5 6 11 15 17 18
             7 8 9 10 11 15 16 16 17 18 7 8 10 11 7 8 14 16 20 21 22 23
          11 14 15 16 19 20 21 19 20 23 13 14 15 16 19 20 24 24 25 26 28
          17 19 20 21 22 23 24 24 25 26 17 18 19 20
          22 23 24 25 25 26 28           21 22 24 25
                26 28                             28
             12 13 18 6 7 12 13 4 5 8 10       23 26 12 15 23    2458
                       14 17 18 21 22 28                   26       9 10
               1 2 27        27     7 11    1 2 9 27    21 28         27
                        1 2 4 9 3 6 13            12 10 13 17         16
                                                     18 22 25
                                1 2 9 27                   27    3 13 14
                                                     1 2 9 11   1 6 7 12
                                                                      19


    Compared to the K-Means algorithm, used as baseline, the main difference
lies into the creation of clusters. In fact, K-Means always produces exactly k
clusters, while NMF methods generate at most k clusters, as shown in Table 2.
This means that there are centroids which are not attracting stocks.
    After analyzing all decompositions, our main purpose is to obtain clusters
of stocks with the same trend, starting from a matrix decomposition of data
in a such way that W would hold centroids coordinates and H would hold the
relationship degree of different centroids. We collect every i-th row of H, which
corresponds to the i-th stock index for Table 1, into their own membership
cluster in order to discover, for each k, which subgroups of stocks remain unaf-
fected. Thus, the most frequent subgroups of stock data can be chosen as final
outcome of our portfolio diversification strategy. More precisely, it could be pos-
sible to construct a tempting portfolio by selecting the most promising stocks
from each subgroup. We implemented this functionality in MATLAB with the
objective to take out a cluster matrix by varying both of k and the decompo-
sition method. In Tables 3-5, we show the resulting clusters of stock data: we
see that clusters of stocks persist across the different decomposition methods.
To give some concrete examples, stocks with index 1, 2, 9, 27 (and often 12
too) are always grouped together. They correspond to red indexes in the tables.
                            Table 4: C-NMF Clusters
             k=3      k=4      k=5        k=6       k=7       k=8
           1 2 12 27 1 2 9 12 1 2 9 12      1 2 27 1 2 6 9 1 2 9 12 27
                           27       27             12 23 27
           345673456734567                3456734578 34567
           8 9 10 11 8 10 11 8 10 11 8 9 10 11 10 11 13        8 10 11
            13 14 15 13 14 15 13 14 15 12 13 14 15 14 15 16   13 14 15
            16 17 18 16 17 18 16 17 18 16 17 18 19 17 18 19   16 17 18
            19 20 21 19 20 21 19 20 21 20 21 22 23 20 21 22   19 20 21
            22 23 24 22 23 24 22 23 24 24 25 26 28 24 25 26   22 23 24
            25 26 28 25 26 28 25 26 28                   28   25 26 28


They correspond respectively to Alcoa Inc. (Basic Materials), American Inter-
national Group Inc. (Financial), General Electrics Company (Conglomerates),
ARCA biopharma Inc. (Healthcare), HP Inc. (Technology). Other stocks which
are often grouped together are depicted in the Tables with different colours.
    Summing up, the obtained clusters demonstrate successful application of CH-
NMF to the analysis of financial data. This means that CH-NMF is robust
in the case of analysis of stock market. Moreover, it provides a trend-based
diversification containing groups of different sectors. The most interesting result
is that the stocks of the same sector is not necessarily assigned into the same
cluster and vice versa, which is of potential use to guide diversified portfolio.


                           Table 5: CH-NMF Clusters
        k=3           k=4          k=5         k=6         k=7         k=8
       1 2 9 12 27     1 2 9 27    1 2 9 27 1 2 9 12 27       129          2 27
        345678        3 4 5 7 8 3 5 7 8 10     34578        3 4 6 7 4 8 10 11
     9 10 11 13 14 10 11 14 15     11 18 21 11 14 15 16    13 14 17 13 15 16 17
    15 16 17 18 19 16 17 18 19     22 24 25 17 20 22 23    20 23 24 18 20 21 22
    20 21 22 23 24 20 21 22 23              24 25 26 28       25 26 23 24 25 26
          25 26 28 24 25 26 28                                               28
                           6 13 4 6 12 14          6 13 5 8 10 11      1 5 9 12
                                15 16 17 19                15 16 18       14 19
                                20 23 26 28             19 21 22 28
                             12          13 10 18 19 21          12         67
                                                                 27           3


5   Conclusions
Constructing a diversified portfolio, in which the correlation between constituent
asset classes and investment strategies is meaningfully low, can be challenging,
in order to reduce the exposure to risk by investing in a variety of assets. Our
aim is to group stocks having similar trend. This can be cast as a clustering prob-
lem in data mining that we solve with NMF techniques. We investigate NMF
and its variants with convexity constraints to improve the exploitation of similar
stock trends. In particular, we show that, for this task, CH-NMF is a very fast
and scalable in terms of speed and reconstruction quality. Our extensive exper-
imental evaluation shows that NMFs better point out the clustering properties,
additionally yielding very low error in Frobenius norm and high efficiency in
terms of convergence time. Furthermore, we compared the resulting clusters to
check whether frequent itemsets of stock stay together still while the number
of requested clusters changes. The task of prediction is not applicable for NMF
techniques because the number of clusters to be discovered is given in input.
    As future work, we will use more datasets from different markets and will
investigate further decomposition techniques that can further improve the effec-
tiveness of clustering stock data and impose other penalty constraints in order to
achieve a better portfolio diversification strategy, reduce the risk of investments
and, hence, beat the market.

Acknowledgments
This work was partially funded by the Italian PON 2007-2013
project PON02_00563_3489339 ‘Puglia@Service’.


References
 [1] F. Cai, N. Le-Khac, and M. Kechadi. Clustering approaches for financial data
     analysis: a survey. In Proceedings of the International Conference on Data Mining
     (DMIN), pages 1–7, 2012.
 [2] R. De Fréin, K. Drakakis, and S. Rickard. Portfolio diversification using subspace
     factorizations. In Information Sciences and Systems, 2008. CISS 2008. 42nd An-
     nual Conference on, pages 1075–1080. IEEE, 2008.
 [3] C. H. Q. Ding, T. Li, and M. I. Jordan. Convex and semi-nonnegative matrix
     factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence,
     32(1):45–55, Jan 2010.
 [4] K. Drakakis, S. Rickard, R. De Fréin, and A. Cichocki. Analysis of financial
     data using non-negative matrix factorization. International Mathematical Forum,
     3(38):1853–1870, 2008.
 [5] R. Korn and E. Korn. Option pricing and portfolio optimization. Graduate Studies
     in Mathematics, 31:18, 2001.
 [6] D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In
     Advances in neural information processing systems, pages 556–562, 2001.
 [7] T. Liu. Non-negative matrix factorization for stock market pricing. In Biomedical
     Engineering and Informatics. BMEI’09., pages 1–5. IEEE, 2009.
 [8] J. MacQueen et al. Some methods for classification and analysis of multivariate
     observations. In Proceedings of the fifth Berkeley symposium on mathematical
     statistics and probability, volume 1, pages 281–297. Oakland, CA, USA., 1967.
 [9] H. Markowitz. Portfolio selection. The journal of finance, 7(1):77–91, 1952.
[10] C. Thurau, K. Kersting, and C. Bauckhage. Yes we can: simplex volume maximiza-
     tion for descriptive web-scale matrix factorization. In Information and knowledge
     management. CIKM 2010., pages 1785–1788. ACM, 2010.
[11] J. Wang. Stock trend extraction via matrix factorization. In Advanced Data
     Mining and Applications, pages 516–526. Springer, 2012.