Clustering of exchange rates and their dynamics
     under different dependence measures

                      Martı́ Renedo and Argimiro Arratia

             Universitat Politècnica de Catalunya, Barcelona, SPAIN
            marti.renedo@estudiant.upc.edu, argimiro@cs.upc.edu


      Abstract. This paper proposes an improvement to the method for clus-
      tering exchange rates given by D. J. Fenn et al, in Quantitative Finance,
      12 (10) 2012, pp.1493-1520. To deal with the potentially non linear nature
      of currency time series dependence, we propose two alternative similarity
      metrics to use instead of the one used in the aforementioned paper based
      on Pearson correlation. Our proposed similarity metrics are based upon
      Kendall and distance correlations. We observe how each of the newly
      adapted clustering methods respond over several years of currency ex-
      change data and find significant differences in the resulting clusters.


1   Introduction
The foreign exchange market (Forex, or FX) is a decentralized worldwide mar-
ket for the trading of currencies. With a global activity of $5.3 trillion per day
in 2013, it is the largest market in the world by a significant margin, and con-
sequently the most liquid. As opposed to other markets, the foreign exchange
market operates 24 hours a day, five days per week (from Sunday 22:00 GMT
to Friday 22:00 GMT). This is due to the location of market centres in different
time zones. It is worth noting, though, that the largest financial centres (United
Kingdom, United States, Singapore and Japan) account for most of the trade
(71% in April 2013 [2]). The most traded currency in the foreign exchange market
is the US dollar, which was in one side of 87% of trades in April 2013, followed
by the euro (33% in April 2013, but having lost share compared to the previous
years) and Japanese yen (23% share). Currencies from emerging economies, like
the Mexican peso or the Chinese yuan have significantly increased their activity.
    In this paper we summarise our study and implementation of the clustering
algorithm proposed in [7] together with two variants which show potential for
improvement on the accuracy of clustering exchange rates series. The clustering
method consists of applying generalised modularity minimisation techniques to
a weighted similarity graph obtained from the Pearson correlation between the
time series of returns of the exchange rates. However, this method will fail to
detect similarities between exchange rate returns that are not linearly correlated,
which is more often than not the case for financial time series. We apply the
clustering algorithm to similarity graphs built upon the Kendall ranks and the
distance correlations, which do detect non linear dependence, and analyse the
differences in the results.
2     The Forex network

Given the price time series of a currency exchange rate, we consider its log
returns series to analyse its behaviour.

Definition 1. The log return (or continuously compounded return) rt of an as-
set of price Pt over a time step [t − 1, t] is defined as:

                                     Pt
                          rt = ln        = ln Pt − ln Pt−1
                                    Pt−1

    Once all time series of returns have been obtained, for each pair of cur-
rencies a measure of similarity between them is calculated. The Forex network
will be built with every exchange rate return series as a node, and edges be-
tween exchange-nodes weighted accordingly to their similarity. This results in a
weighted undirected complete graph.


2.1   Choice of Currencies

The currencies included in our study will be the 13 of the most traded as of
April 2013 [2], and for which data is available from the U.S. Federal Reserve
Economic Data. These are: US dollar (USD), euro (EUR), yen (JPY), pound
sterling (GBP), Australian dollar (AUD), Swiss franc (CHF), Canadian dollar
(CAD), Mexican peso (MXN), Chinese yuan (CNY), New Zealand dollar (NZD),
Swedish krona (SEK), Hong Kong dollar (HKD) and Singapore dollar (SGD).
    Thus, our Forex network consists of n = 78 nodes, each node representing a
pairwise exchange rate taken from our selected set of currencies. While it would
be possible to just use one base currency and assign one node to every other
currency corresponding to its exchange rate over the base, this could overlook
interactions between some of the currencies. Moreover, the resulting network
would strongly depend on the choice of base currency. Our construction (which
follows [7]) prevents those issues and introduces all the interactions between
every pair of the selected currencies into the analysis.
    However, since it is difficult to find data for the exchange rate of every pair
of currencies, we downloaded all exchange rates with the US dollar, and the
other exchange rates are obtained by converting through the US dollar as an
auxiliary step. That is, to obtain the exchange rate between currencies XXX
and Y Y Y where neither of them are U SD, we define XXX/Y Y Y = XXX/U            SD
                                                                        Y Y Y /U SD .
This procedure is also implemented in [7]. While it is possible to obtain data
for the exchange rates of most currency pairs, the fact that markets around the
world have different opening hours and holidays would mean that the times at
which data was taken would most likely present mismatches between all pairs.
Taking only data from the US Federal Reserve limits the study to the New York
opening hours but eliminates any potential inconsistencies and makes calculating
correlations between pairs more accurate.
2.2   Similarity Metrics
To be able to form clusters of similar exchange rates, the edges of the network
will have weights assigned according to a similarity measure. In this context,
similarity is a metric built from a statistic measure of dependence, and which
verifies the properties described in [5]. We analyse three measures of dependence
from which we can construct a similarity metric.

Pearson correlation. The approachp      used in [7] is based on the Pearson cor-
            i j           i j
relation ρ(r , r ) = Cov(r , r )/ Var(r )Var(rj ) of returns of exchange rates ri
                                        i

and rj over the given time interval. Then, the weighted similarity matrix among
the exchange rate returns Aρ is given by
                                       1
                              Aρij =     (ρ(ri , rj ) + 1) − δij
                                       2
which scales the Pearson correlation from [−1, 1] to [0, 1], while the Kronecker
delta δij removes self-edges. In the graph with adjacency matrix Aρ exchange
rates with positively linearly correlated returns will be connected by edges of
weight close to 1, and weight near 0 if the correlation is negative. Edges con-
necting non correlated exchanges will have weights closer to the center of [0, 1].

Kendall rank correlation. A correlation measure alternative to Pearson’s is
the Kendall correlation (or rank correlation) τ [1], which measures concordance
of the variables instead of linear relations between them:
Definition 2. Given two random variables X and Y, their Kendall correlation
coefficient is τ (X, Y ) = pc − pd , where for any two independent pairs of values
(Xi , Yi ), (Xj , Yj ), pc = P ((Xj −Xi )(Yj −Yi ) > 0) and pd = P ((Xj −Xi )(Yj −
Yi ) < 0), are the probabilities of these pairs being concordant and discordant
respectively.
Definition 3. For two series of returns ri , rj over m time steps, the Kendall
correlation is estimated by
                                     X       sgn(rti − rsi ) − sgn(rtj − rsj )
                 τm (ri , rj ) =
                                                       n(n − 1)
                                   1≤s<t≤m

    The Kendall correlation coefficient τ (ri , rj ) takes values in [−1, 1] (equalling
−1 and 1 when ri and rj are completely discordant or concordant, respectively)
so, as before, we can define the adjacency matrix Aτ :
                                       1
                              Aτij =     (τ (ri , rj ) + 1) − δij
                                       2
   For both the networks defined by Aρij and Aτij and given two of their exchange
rates XXX/Y Y Y and ZZZ/T T T , the following equality holds
                                                           
                     XXX ZZZ                     XXX T T T
                A·        ,         = 1 − A·           ,        ,
                     Y Y Y TTT                   Y Y Y ZZZ
so if XXX/Y Y Y is similar to ZZZ/T T T (their mutually adjacent edge has a
high -close to 1- weight), it will be considered dissimilar to its inverse T T T /ZZZ
(their mutually adjacent edge will have a low weight, close to 0). This means
that, since we cannot a priori determine whether two exchange rates will be
correlated directly or with one of them inverted, we have to include all inverses
in the network.


Distance correlation. Another possible choice is the distance correlation R,
introduced in [14], and which presents several advantages. For all X,Y R satisfies
0 ≤ R(X, Y ) ≤ 1, being only equal to 0 when X and Y are independent (this is
not satisfied by most correlation measures, like Pearson’s, for which a correlation
of 0 doesn’t necessarily imply independence). The distance correlation can detect
both linear and nonlinear correlations, which is relevant for this study because
while many exchange rates tend to influence each other’s prices, this relation
doesn’t necessarily have to be linear.

Definition 4. The empirical distance covariance between two samples X,Y is
                           n
                         1 X
defined by Vn2 (X, Y ) = 2    Bkl Ckl ., where
                        n
                                k,l=1

                                        n                   n                       n
                                     1X                   1X                     1 X
     bkl = |Xk − Xl |,      b̄k· =      bkl ,    b̄·l =      bkl ,      b̄·· =       bkl ,
                                     2                    n                      n2
                                      l=1                  k=1                     k,l=1


and Bkl = bkl − b̄k· − b̄·l + b̄·· . Analogously define ckl = |Yk − Yl |, c̄k· , c̄·l , c̄·· and
Ckl = ckl − c̄k· − c̄·l + c̄·· for Y .
                                                            n
                                                        1 X 2
   Vn2 (X) is defined by V 2 (X) = V 2 (X, X) = 2               Bkl .
                                                       n
                                                                k,l=1


Definition 5. The empirical distance correlation Rn (X, Y ), between two sam-
ples X and Y , is the square root of
                          ( V 2 (X,Y )
               2
                             √ 2n            , Vn2 (X)Vn2 (Y ) > 0
            Rn (X, Y ) =              2 (Y )
                               Vn (X)Vn
                            0,                 Vn2 (X)Vn2 (Y ) = 0

Remark 1. It should be observed that the distance correlation is not a distance
in the metric sense. It is by definition a correlation among all distances of the
samples, and one can easily see that R(X, X) = 1, violating the identity of
indiscernibles of a distance. The distance correlation is in fact a similarity metric.

     Using R for studying currency exchange rates the following property is sat-
isfied: Given XXX/Y Y Y , ZZZ/T T T exchange rates,
                                                        
                       XXX ZZZ              XXX T T T
                   R          ,       =R           ,         .
                        Y Y Y TTT            Y Y Y ZZZ
This implies that there is no need to include the inverses of the exchange rates
(as in [7], where it is needed or else some correlations could be overlooked), since
the correlations detected will be the same for XXX/Y Y Y and Y Y Y /XXX.
    Since R already has the properties of a similarity metric, the exchange rate
network is simply built from the matrix of distance correlations, AR , removing
self edges; that is, for each pair of exchange rate returns ri , rj ,

                               AR       i j
                                ij = R(r , r ) − δij


3   Community detection

To partition the graph into communities we implemented Potts method. This
partition method consists on minimising an objective function, the Potts Hamil-
tonian, which evaluates the strength of a partition of the graph. (Considering a
strong partition as one that has strong links -heavy weighted- inside the commu-
nities and weak links between communities.) This can be seen as a generalisation
of the modularity function [10].

Definition 6. (i) The modularity of the partition P of a weighted undirected
graph with adjacency matrix A is given by

                                  1 X
                        Q(P) =         [Aij − Pij ]δ(ci , cj )
                                 2m ij

where ci is the community of the node i in the partition P (so δ(ci , cj ) is 1 when
i and j are in the same community and 0 otherwise), Pij is the expected weight
of the edge ij in a null model and m is the sum of the weights of all edges in the
graph.
(ii) The Hamiltonian of the Potts system of the partition P of a weighted undi-
rected graph with adjacency matrix A is given by
                                   X
                        H(P) = −    [Aij − γPij ]δ(ci , cj )
                                    ij


where γ is a parameter.

                                        1
   Note that when γ = 1, Q(P) = − 2m      H(P), so in this case the problem
of minimising the Potts Hamiltonian is equivalent to the maximisation of the
modularity. This means the Potts method is a generalisation of the modularity
maximisation method. The advantage of this generalisation is that by varying
the value of γ the partition that minimises the Hamiltonian will contain big-
ger or smaller communities (corresponding with lower and higher values of γ
respectively).
3.1   Selecting the null model P
The modularity and Potts Hamiltonian functions depend on null models to es-
timate the expected edge weight, and then form communities where the actual
weight is significantly larger. In the context of community identification in net-
works, a popular choice of null model is the Newman-Girvan model [10]. For the
networks Aρij and Aτij , the Newman-Girvan null model results in
                                ( l A·il )( l A·jl )
                                 P         P
                           ·                           n−2
                          Pij =      P      ·        =
                                       i,j Aij          2n
a constant value for all edges. This happens because, for a given node, the two
edges connecting it to any other exchange rate and its inverse will have sum 1.
Then, the weight of a node (i.e., the sum of the weights of its incident edges)
will depend only on the size of the network and not on the correlations between
exchange rates at a given time step. However, this property is generally not
satisfied by the Newman-Girvan null model of the network AR   ij . For simplicity,
we can use the average edge weight as a uniform null model:
                                        P
                                   R      i,j Aij
                                 P =
                                       n(n − 1)

3.2   Selecting the value of γ
To select an appropriate value of γ for the study, we use a sample network built
from daily data of January 2010 at a single time step for each of the similarity
metrics. The minimisation algorithm is applied for values of γ ranging from 0.4
(all nodes are in a single community) to 2.2 (each node forms its own community)
in 0.01 steps. The total number of communities for each value of γ, with respect
to each of the three dependence measures considered, are shown in Figure 1.
    We can expect to obtain more robust communities by selecting a value of
gamma inside an interval where the number of communities stabilizes (refered
to as plateaus in [7]). Excluding the trivial cases at the extremes of the plot with
just one community containing all nodes and every node in a single community,
the widest non trivial plateaus are the intervals ∼ (1.48, 1.51), ∼ (1.27, 1.33) and
∼ (1.44, 1.52) for the Pearson, Kendall and distance correlations respectively. We
then select values of gamma γρ = 1.495, γτ = 1.3 and γR = 1.48.

3.3   Potts Hamiltonian minimisation algorithm
The algorithm used to minimise the Potts Hamiltonian is an adapted version
of the modularity maximisation algorithm in [3]. For a given node, we want to
see if moving it to another community would give an overall decrease in the
Hamiltonian. The change in the Hamiltonian caused by moving the node i to a
community C is given by the expression:
                                                               
                       1  X                   X
             ∆H = −             (Aij − Pij ) −      (Aik − Pik )         (1)
                      2m
                             j|cj =C             k|ci =ck
Fig. 1. Number of communities for each γ in network with dependence measure Pearson
(dotted), Kendall (solid), distance (dashed).


which adds the contribution of the new edges ij and subtracts the contribution
of the old edges ik.
    We begin with each node of the network in a separate community and then we
start the following iterative procedure: for each node i, we compare the variation
of the Hamiltonian (Eq. (1)) caused by moving it to each community in the
graph, and then move it to the one with the smallest negative value. If all
computed values are positive, then the node remains in its community. When all
nodes have been checked, the algorithm starts again at the first node, and stops
when one iteration is completed without any nodes switching communities.


4     Results

In this section we briefly report results from the application of the clustering
method, with each of the three proposed similarity metrics. First a comparison
of the variance of size of clusters through time, and then their performance
through a past financial crisis.


4.1   Comparison of similarity measures

After choosing appropriate values of γ (see section 3), the algorithm has been
applied to each of the networks obtained with daily data from the period 2009-
2015 at monthly time steps (that is, for each month a network is built using daily
returns of the currency exchanges). One variable that presents significant changes
between the networks is the number of nodes within communities. While in the
distance correlation network the resulting communities have similar sizes among
them, those of the Pearson and Kendall correlation present bigger differences.
This can be easily seen by calculating the variances of the number of elements
in communities for each of the three networks at every time step (Figure 2).


Fig. 2. Variance of the size of communities from daily data at monthly time steps, over
a 5 year period . Lines in the plot are for: Pearson (solid); Kendall (dotted); distance
correlation (dashed).


4.2   Exploring short-term community dynamics

We briefly report results from the application of the clustering method, with
each of the three proposed similarity metrics, to one of the events studied in
[7]: the 2005–2008 credit and liquidity crisis, leading to a major reorganization
of communities in the FX network throughout 2007, due to its impact on the
carry trade. The carry trade consists of selling low interest rate currencies (aka
“funding currencies”), such as JPY and CHF, and investing in high interest
rate currencies (or “investment currencies”), such as AUD and NZD. A profit
is made when the interest rate differentials between funding and investment
currencies are not offset by a proportional depreciation of investment currencies.
When there is a decrease in available credit, traders “unwind” their carry-trade
positions, which consist in selling their holdings in investment currencies and
buying funding currencies (cf. [4]).
    We pay special attention to the evolution of communities in the FX network
from the 2007/07/15 - 2007/08/15 to the 2007/08/15 - 2007/09/15 monthly
periods. Analogously to the results of [7], in the network built from the Pearson
linear correlation, we found a big cluster of relevant carry trade currencies (JPY,
AUD, NZD) which at the second time step incorporates more exchanges with
these currencies at one side. In the Kendall correlation networks we observe a
similar behavior; a big carry trade community that gains nodes at the second
time step. In the case of the distance correlation networks, most of the carry
trade currency exchanges are split into two communities, one dominated by
Australian dollar exchanges and the other, smaller in size, by the Japanese yen.
At the second time step, the Australian dollar community looses nodes and ends
up with only AUD exchange rates, while the Japanese yen community grows
and attracts some of the nodes lost by the AUD community. We believe this
last one is a more accurate picture of the unwinding of carry trade positions
occurred in the mid of 2007. Figure 3 in the Appendix show the evolution of
Forex communities applying the distance correlation as similarity metric.
    For a more detailed account of the dynamics of the FX network, and pretty
full coloured plots of the communities evolution, the reader is referred to the
extended report [13].

5     Centrality measures
In this section, we try to determine the role played by individual nodes in the
network using centrality measures.
    The concept of betweenness is based on a distance between nodes and calcu-
lating shortest paths between nodes (where the length of a path is the sum of the
node distances along it). In our network, edge weights Aij represent similarity
between nodes, so the distance dij can be taken as [7]:
                                 
                                   0      if i = j
                           dij =
                                   1/Aij otherwise
                                                                          i
We define Gst as the number of shortest paths from node s to node t, and gst
as the number of shortest paths from s to t passing through i.
Definition 7. The betweenness centrality bi of a node i is:
                                 X X gi
                                            st
                            bi =               .
                                           Gst
                                        s6=i t6=s,i

The betweenness centrality measures how often a given node is in the middle of
shortest paths connecting other nodes, and as a consequence its role in commu-
nications between the network. This makes it relevant when studying how nodes
influence each other and which ones are the most influential.
    Consider Jij = Aij − γ − Pij and its spectral decomposition1 J = U DU T ,
where D is the diagonal matrix of eigenvalues βi and U the corresponding matrix
of eigenvectors. Define q as the number of positive eigenvalues of D.
Definition 8 (cf. [9]). The community centrality of node i is given by the mag-
nitude |x
        pi |, where xi is a node vector of dimension q with j-th element given by
[xi ]j = βj Uij , j ∈ {1, 2, ..., q}.
Note that the community centrality measures the connection of a node to all its
neighbours regardless of their community membership.
1
    We consider the decomposition in which the eigenvalues appear in D in decreasing
    order, so if there are q positive eigenvalues, they will have indices 1, ..., q in D
Results for the 2009-2015 period. We calculate the betweenness and the
community centrality for every node, at every monthly time step over the 2009-
2015 period, and for each of the three networks. Results are shown in Table 1.The
nodes are listed in descending order with respect to their mean betweenness and
to their mean community centrality over the period. To compensate for the fact
that the dimension q implicit in the community centrality of a node can vary
between time steps, we normalize |xi | by its maximum at each time step. For
the Kendall and Pearson networks, betweenness of a currency exchange is equal
to the betweenness of its inverse (because of the symmetry in the networks,
see section 2.2), so for each pair only one representative has been included.
Exchanges with AUD or NZD at one side occupy the highest positions in the
Kendall and Pearson rankings for both centrality measures, while in the distance
correlation network JPY exchanges dominate.


Table 1. Top currency exchanges sorted by average betweenness and average commu-
nity centrality over the 2009-2015 period.

                  betweenness                  comunity centrality
    rank distance Kendall     Pearson distance    Kendall     Pearson
      1 MXN/JPY NZD/AUD NZD/AUD HKD/JPY NZD/USD NZD/USD
      2 SEK/JPY SEK/AUD SEK/AUD JPY/USD HKD/NZD HKD/NZD
      3 NZD/JPY MXN/AUD SEK/NZD CNY/JPY HKD/AUD AUD/USD
      4 SEK/MXN SEK/NZD MXN/AUD HKD/CHF HKD/EUR HKD/AUD
      5 SEK/NZD GBP/EUR NZD/MXN CHF/USD EUR/USD SEK/USD
      6 AUD/JPY SGD/GBP GBP/EUR MXN/CHF AUD/USD NZD/CNY
      7 NZD/CHF SGD/CAD SGD/GBP CNY/CHF SEK/USD HKD/SEK
      8 CAD/JPY CAD/AUD CAD/GBP SGD/SEK HKD/SEK EUR/USD
      9 MXN/CHF NZD/MXN SGD/CAD SGD/CHF CHF/USD HKD/EUR
     10 CHF/AUD CAD/GBP MXN/CAD HKD/SEK HKD/CHF SEK/CNY


6    Conclusions
The proposed similarity metrics, based on Kendall and distance correlations,
applied to the community detection algorithm give some differences in the re-
sulting clusters obtained in [7], which could be explained by their ability to
detect non linear dependences between currency exchange rates. In the case of
the distance correlation the differences are more conspicuous showing clearer
separations (and classification) of investment currencies from funding curren-
cies. On the algorithmic aspect, the clustering method with distance correlation
being able to work with a network of half the size (due to not needing to add the
inverses of exchange rates) greatly reduces the computational cost of running
the algorithm. Indeed, given that the graph is complete, halving the number of
nodes divides the number of edges approximately by four. Since the algorithm
used to minimise the Potts Hamiltonian is roughly linear on the number of edges,
the total computational cost of clustering the distance correlation network is ap-
proximately one fourth compared to that of the Pearson and Kendall correlation
networks. This reduction in computational complexity can be significant when
working with large sets of currencies.

Software used for the data analysis. All the algorithms in this project
were implemented in R [12] with the additional packages igraph[6], Quandl [11],
mcclust[8] and ggplot2 [15].

Acknowledgments. A. Arratia acknowledge support of MINECO project AP-
COM (TIN2014-57226-P), and Gen. Cat. SGR2014-890 (MACDA).


References
 1. Arratia, A.: Computational Finance, An Introductory Course with R. Atlantis
    Press (2014)
 2. Bank for International Settlements, Monetary and Economic Department: Trien-
    nal Central Bank Survey . Foreign exchange turnover in 2013: preliminary global
    results (2013)
 3. Blondel, V.D., Guillaume, J., Lambiotte, R., Lefebvre, E.: Fast unfolding of com-
    munities in large networks. Journal of Statistical Mechanics: Theory and Experi-
    ment (2008)
 4. Brunnermeier, M.K., Nagel, S., Pedersen, L.H.: Carry trades and currency crashes.
    Tech. rep., National Bureau of Economic Research (2008)
 5. Chen, S., Ma, B., Zhang, K.: On the similarity metric and the distance metric.
    Theoretical Computer Science 410(24), 2365 – 2376 (2009)
 6. Csardi, G., Nepusz, T.: The igraph software package for complex network research.
    InterJournal Complex Systems, 1695 (2006)
 7. Fenn, D.J., Porter, M.A., Mucha, P.J., McDonald, M., Williams, S., Johnson, N.F.,
    Jones, N.S.: Dynamical clustering of exchange rates. Quantitative Finance 12(10),
    1493–1520 (2012)
 8. Fritsch, A.: mcclust: Process an MCMC Sample of Clusterings (2012), r package
    version 1.0
 9. Newman, M.E.J.: Finding community structure in networks using the eigenvectors
    of matrices. Phys. Rev. E 74 (May 2006)
10. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in net-
    works. Phys. Rev. E 69, 026113 (Feb 2004)
11. Quandl: Federal reserve economic data, https://www.quandl.com/data/FRED
12. R Core Team: R: A Language and Environment for Statistical Computing. R Foun-
    dation for Statistical Computing, Vienna, Austria (2015)
13. Renedo-Mirambell, M.: Detecting clusters and their dynamics in the forex market.
    Bachelor Sc. Thesis, Universitat Politécnica de Catalunya (Barcelona Tech) (2016)
14. Székely, G.J., Rizzo, M.L., Bakirov, N.K.: Measuring and testing dependency by
    correlation of distances. The Annals of Statistics 35(6), 2769–2794 (2007)
15. Wickham, H.: ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New
    York (2009)
Appendix


       Fig. 3. FX community evolution with distance correlation