Clustering of exchange rates and their dynamics under different dependence measures Martı́ Renedo and Argimiro Arratia Universitat Politècnica de Catalunya, Barcelona, SPAIN marti.renedo@estudiant.upc.edu, argimiro@cs.upc.edu Abstract. This paper proposes an improvement to the method for clus- tering exchange rates given by D. J. Fenn et al, in Quantitative Finance, 12 (10) 2012, pp.1493-1520. To deal with the potentially non linear nature of currency time series dependence, we propose two alternative similarity metrics to use instead of the one used in the aforementioned paper based on Pearson correlation. Our proposed similarity metrics are based upon Kendall and distance correlations. We observe how each of the newly adapted clustering methods respond over several years of currency ex- change data and find significant differences in the resulting clusters. 1 Introduction The foreign exchange market (Forex, or FX) is a decentralized worldwide mar- ket for the trading of currencies. With a global activity of $5.3 trillion per day in 2013, it is the largest market in the world by a significant margin, and con- sequently the most liquid. As opposed to other markets, the foreign exchange market operates 24 hours a day, five days per week (from Sunday 22:00 GMT to Friday 22:00 GMT). This is due to the location of market centres in different time zones. It is worth noting, though, that the largest financial centres (United Kingdom, United States, Singapore and Japan) account for most of the trade (71% in April 2013 [2]). The most traded currency in the foreign exchange market is the US dollar, which was in one side of 87% of trades in April 2013, followed by the euro (33% in April 2013, but having lost share compared to the previous years) and Japanese yen (23% share). Currencies from emerging economies, like the Mexican peso or the Chinese yuan have significantly increased their activity. In this paper we summarise our study and implementation of the clustering algorithm proposed in [7] together with two variants which show potential for improvement on the accuracy of clustering exchange rates series. The clustering method consists of applying generalised modularity minimisation techniques to a weighted similarity graph obtained from the Pearson correlation between the time series of returns of the exchange rates. However, this method will fail to detect similarities between exchange rate returns that are not linearly correlated, which is more often than not the case for financial time series. We apply the clustering algorithm to similarity graphs built upon the Kendall ranks and the distance correlations, which do detect non linear dependence, and analyse the differences in the results. 2 The Forex network Given the price time series of a currency exchange rate, we consider its log returns series to analyse its behaviour. Definition 1. The log return (or continuously compounded return) rt of an as- set of price Pt over a time step [t − 1, t] is defined as: Pt rt = ln = ln Pt − ln Pt−1 Pt−1 Once all time series of returns have been obtained, for each pair of cur- rencies a measure of similarity between them is calculated. The Forex network will be built with every exchange rate return series as a node, and edges be- tween exchange-nodes weighted accordingly to their similarity. This results in a weighted undirected complete graph. 2.1 Choice of Currencies The currencies included in our study will be the 13 of the most traded as of April 2013 [2], and for which data is available from the U.S. Federal Reserve Economic Data. These are: US dollar (USD), euro (EUR), yen (JPY), pound sterling (GBP), Australian dollar (AUD), Swiss franc (CHF), Canadian dollar (CAD), Mexican peso (MXN), Chinese yuan (CNY), New Zealand dollar (NZD), Swedish krona (SEK), Hong Kong dollar (HKD) and Singapore dollar (SGD). Thus, our Forex network consists of n = 78 nodes, each node representing a pairwise exchange rate taken from our selected set of currencies. While it would be possible to just use one base currency and assign one node to every other currency corresponding to its exchange rate over the base, this could overlook interactions between some of the currencies. Moreover, the resulting network would strongly depend on the choice of base currency. Our construction (which follows [7]) prevents those issues and introduces all the interactions between every pair of the selected currencies into the analysis. However, since it is difficult to find data for the exchange rate of every pair of currencies, we downloaded all exchange rates with the US dollar, and the other exchange rates are obtained by converting through the US dollar as an auxiliary step. That is, to obtain the exchange rate between currencies XXX and Y Y Y where neither of them are U SD, we define XXX/Y Y Y = XXX/U SD Y Y Y /U SD . This procedure is also implemented in [7]. While it is possible to obtain data for the exchange rates of most currency pairs, the fact that markets around the world have different opening hours and holidays would mean that the times at which data was taken would most likely present mismatches between all pairs. Taking only data from the US Federal Reserve limits the study to the New York opening hours but eliminates any potential inconsistencies and makes calculating correlations between pairs more accurate. 2.2 Similarity Metrics To be able to form clusters of similar exchange rates, the edges of the network will have weights assigned according to a similarity measure. In this context, similarity is a metric built from a statistic measure of dependence, and which verifies the properties described in [5]. We analyse three measures of dependence from which we can construct a similarity metric. Pearson correlation. The approachp used in [7] is based on the Pearson cor- i j i j relation ρ(r , r ) = Cov(r , r )/ Var(r )Var(rj ) of returns of exchange rates ri i and rj over the given time interval. Then, the weighted similarity matrix among the exchange rate returns Aρ is given by 1 Aρij = (ρ(ri , rj ) + 1) − δij 2 which scales the Pearson correlation from [−1, 1] to [0, 1], while the Kronecker delta δij removes self-edges. In the graph with adjacency matrix Aρ exchange rates with positively linearly correlated returns will be connected by edges of weight close to 1, and weight near 0 if the correlation is negative. Edges con- necting non correlated exchanges will have weights closer to the center of [0, 1]. Kendall rank correlation. A correlation measure alternative to Pearson’s is the Kendall correlation (or rank correlation) τ [1], which measures concordance of the variables instead of linear relations between them: Definition 2. Given two random variables X and Y, their Kendall correlation coefficient is τ (X, Y ) = pc − pd , where for any two independent pairs of values (Xi , Yi ), (Xj , Yj ), pc = P ((Xj −Xi )(Yj −Yi ) > 0) and pd = P ((Xj −Xi )(Yj − Yi ) < 0), are the probabilities of these pairs being concordant and discordant respectively. Definition 3. For two series of returns ri , rj over m time steps, the Kendall correlation is estimated by X sgn(rti − rsi ) − sgn(rtj − rsj ) τm (ri , rj ) = n(n − 1) 1≤s 0 Rn (X, Y ) = 2 (Y ) Vn (X)Vn 0, Vn2 (X)Vn2 (Y ) = 0 Remark 1. It should be observed that the distance correlation is not a distance in the metric sense. It is by definition a correlation among all distances of the samples, and one can easily see that R(X, X) = 1, violating the identity of indiscernibles of a distance. The distance correlation is in fact a similarity metric. Using R for studying currency exchange rates the following property is sat- isfied: Given XXX/Y Y Y , ZZZ/T T T exchange rates,     XXX ZZZ XXX T T T R , =R , . Y Y Y TTT Y Y Y ZZZ This implies that there is no need to include the inverses of the exchange rates (as in [7], where it is needed or else some correlations could be overlooked), since the correlations detected will be the same for XXX/Y Y Y and Y Y Y /XXX. Since R already has the properties of a similarity metric, the exchange rate network is simply built from the matrix of distance correlations, AR , removing self edges; that is, for each pair of exchange rate returns ri , rj , AR i j ij = R(r , r ) − δij 3 Community detection To partition the graph into communities we implemented Potts method. This partition method consists on minimising an objective function, the Potts Hamil- tonian, which evaluates the strength of a partition of the graph. (Considering a strong partition as one that has strong links -heavy weighted- inside the commu- nities and weak links between communities.) This can be seen as a generalisation of the modularity function [10]. Definition 6. (i) The modularity of the partition P of a weighted undirected graph with adjacency matrix A is given by 1 X Q(P) = [Aij − Pij ]δ(ci , cj ) 2m ij where ci is the community of the node i in the partition P (so δ(ci , cj ) is 1 when i and j are in the same community and 0 otherwise), Pij is the expected weight of the edge ij in a null model and m is the sum of the weights of all edges in the graph. (ii) The Hamiltonian of the Potts system of the partition P of a weighted undi- rected graph with adjacency matrix A is given by X H(P) = − [Aij − γPij ]δ(ci , cj ) ij where γ is a parameter. 1 Note that when γ = 1, Q(P) = − 2m H(P), so in this case the problem of minimising the Potts Hamiltonian is equivalent to the maximisation of the modularity. This means the Potts method is a generalisation of the modularity maximisation method. The advantage of this generalisation is that by varying the value of γ the partition that minimises the Hamiltonian will contain big- ger or smaller communities (corresponding with lower and higher values of γ respectively). 3.1 Selecting the null model P The modularity and Potts Hamiltonian functions depend on null models to es- timate the expected edge weight, and then form communities where the actual weight is significantly larger. In the context of community identification in net- works, a popular choice of null model is the Newman-Girvan model [10]. For the networks Aρij and Aτij , the Newman-Girvan null model results in ( l A·il )( l A·jl ) P P · n−2 Pij = P · = i,j Aij 2n a constant value for all edges. This happens because, for a given node, the two edges connecting it to any other exchange rate and its inverse will have sum 1. Then, the weight of a node (i.e., the sum of the weights of its incident edges) will depend only on the size of the network and not on the correlations between exchange rates at a given time step. However, this property is generally not satisfied by the Newman-Girvan null model of the network AR ij . For simplicity, we can use the average edge weight as a uniform null model: P R i,j Aij P = n(n − 1) 3.2 Selecting the value of γ To select an appropriate value of γ for the study, we use a sample network built from daily data of January 2010 at a single time step for each of the similarity metrics. The minimisation algorithm is applied for values of γ ranging from 0.4 (all nodes are in a single community) to 2.2 (each node forms its own community) in 0.01 steps. The total number of communities for each value of γ, with respect to each of the three dependence measures considered, are shown in Figure 1. We can expect to obtain more robust communities by selecting a value of gamma inside an interval where the number of communities stabilizes (refered to as plateaus in [7]). Excluding the trivial cases at the extremes of the plot with just one community containing all nodes and every node in a single community, the widest non trivial plateaus are the intervals ∼ (1.48, 1.51), ∼ (1.27, 1.33) and ∼ (1.44, 1.52) for the Pearson, Kendall and distance correlations respectively. We then select values of gamma γρ = 1.495, γτ = 1.3 and γR = 1.48. 3.3 Potts Hamiltonian minimisation algorithm The algorithm used to minimise the Potts Hamiltonian is an adapted version of the modularity maximisation algorithm in [3]. For a given node, we want to see if moving it to another community would give an overall decrease in the Hamiltonian. The change in the Hamiltonian caused by moving the node i to a community C is given by the expression:   1  X X ∆H = − (Aij − Pij ) − (Aik − Pik ) (1) 2m j|cj =C k|ci =ck Fig. 1. Number of communities for each γ in network with dependence measure Pearson (dotted), Kendall (solid), distance (dashed). which adds the contribution of the new edges ij and subtracts the contribution of the old edges ik. We begin with each node of the network in a separate community and then we start the following iterative procedure: for each node i, we compare the variation of the Hamiltonian (Eq. (1)) caused by moving it to each community in the graph, and then move it to the one with the smallest negative value. If all computed values are positive, then the node remains in its community. When all nodes have been checked, the algorithm starts again at the first node, and stops when one iteration is completed without any nodes switching communities. 4 Results In this section we briefly report results from the application of the clustering method, with each of the three proposed similarity metrics. First a comparison of the variance of size of clusters through time, and then their performance through a past financial crisis. 4.1 Comparison of similarity measures After choosing appropriate values of γ (see section 3), the algorithm has been applied to each of the networks obtained with daily data from the period 2009- 2015 at monthly time steps (that is, for each month a network is built using daily returns of the currency exchanges). One variable that presents significant changes between the networks is the number of nodes within communities. While in the distance correlation network the resulting communities have similar sizes among them, those of the Pearson and Kendall correlation present bigger differences. This can be easily seen by calculating the variances of the number of elements in communities for each of the three networks at every time step (Figure 2). Fig. 2. Variance of the size of communities from daily data at monthly time steps, over a 5 year period . Lines in the plot are for: Pearson (solid); Kendall (dotted); distance correlation (dashed). 4.2 Exploring short-term community dynamics We briefly report results from the application of the clustering method, with each of the three proposed similarity metrics, to one of the events studied in [7]: the 2005–2008 credit and liquidity crisis, leading to a major reorganization of communities in the FX network throughout 2007, due to its impact on the carry trade. The carry trade consists of selling low interest rate currencies (aka “funding currencies”), such as JPY and CHF, and investing in high interest rate currencies (or “investment currencies”), such as AUD and NZD. A profit is made when the interest rate differentials between funding and investment currencies are not offset by a proportional depreciation of investment currencies. When there is a decrease in available credit, traders “unwind” their carry-trade positions, which consist in selling their holdings in investment currencies and buying funding currencies (cf. [4]). We pay special attention to the evolution of communities in the FX network from the 2007/07/15 - 2007/08/15 to the 2007/08/15 - 2007/09/15 monthly periods. Analogously to the results of [7], in the network built from the Pearson linear correlation, we found a big cluster of relevant carry trade currencies (JPY, AUD, NZD) which at the second time step incorporates more exchanges with these currencies at one side. In the Kendall correlation networks we observe a similar behavior; a big carry trade community that gains nodes at the second time step. In the case of the distance correlation networks, most of the carry trade currency exchanges are split into two communities, one dominated by Australian dollar exchanges and the other, smaller in size, by the Japanese yen. At the second time step, the Australian dollar community looses nodes and ends up with only AUD exchange rates, while the Japanese yen community grows and attracts some of the nodes lost by the AUD community. We believe this last one is a more accurate picture of the unwinding of carry trade positions occurred in the mid of 2007. Figure 3 in the Appendix show the evolution of Forex communities applying the distance correlation as similarity metric. For a more detailed account of the dynamics of the FX network, and pretty full coloured plots of the communities evolution, the reader is referred to the extended report [13]. 5 Centrality measures In this section, we try to determine the role played by individual nodes in the network using centrality measures. The concept of betweenness is based on a distance between nodes and calcu- lating shortest paths between nodes (where the length of a path is the sum of the node distances along it). In our network, edge weights Aij represent similarity between nodes, so the distance dij can be taken as [7]:  0 if i = j dij = 1/Aij otherwise i We define Gst as the number of shortest paths from node s to node t, and gst as the number of shortest paths from s to t passing through i. Definition 7. The betweenness centrality bi of a node i is: X X gi st bi = . Gst s6=i t6=s,i The betweenness centrality measures how often a given node is in the middle of shortest paths connecting other nodes, and as a consequence its role in commu- nications between the network. This makes it relevant when studying how nodes influence each other and which ones are the most influential. Consider Jij = Aij − γ − Pij and its spectral decomposition1 J = U DU T , where D is the diagonal matrix of eigenvalues βi and U the corresponding matrix of eigenvectors. Define q as the number of positive eigenvalues of D. Definition 8 (cf. [9]). The community centrality of node i is given by the mag- nitude |x pi |, where xi is a node vector of dimension q with j-th element given by [xi ]j = βj Uij , j ∈ {1, 2, ..., q}. Note that the community centrality measures the connection of a node to all its neighbours regardless of their community membership. 1 We consider the decomposition in which the eigenvalues appear in D in decreasing order, so if there are q positive eigenvalues, they will have indices 1, ..., q in D Results for the 2009-2015 period. We calculate the betweenness and the community centrality for every node, at every monthly time step over the 2009- 2015 period, and for each of the three networks. Results are shown in Table 1.The nodes are listed in descending order with respect to their mean betweenness and to their mean community centrality over the period. To compensate for the fact that the dimension q implicit in the community centrality of a node can vary between time steps, we normalize |xi | by its maximum at each time step. For the Kendall and Pearson networks, betweenness of a currency exchange is equal to the betweenness of its inverse (because of the symmetry in the networks, see section 2.2), so for each pair only one representative has been included. Exchanges with AUD or NZD at one side occupy the highest positions in the Kendall and Pearson rankings for both centrality measures, while in the distance correlation network JPY exchanges dominate. Table 1. Top currency exchanges sorted by average betweenness and average commu- nity centrality over the 2009-2015 period. betweenness comunity centrality rank distance Kendall Pearson distance Kendall Pearson 1 MXN/JPY NZD/AUD NZD/AUD HKD/JPY NZD/USD NZD/USD 2 SEK/JPY SEK/AUD SEK/AUD JPY/USD HKD/NZD HKD/NZD 3 NZD/JPY MXN/AUD SEK/NZD CNY/JPY HKD/AUD AUD/USD 4 SEK/MXN SEK/NZD MXN/AUD HKD/CHF HKD/EUR HKD/AUD 5 SEK/NZD GBP/EUR NZD/MXN CHF/USD EUR/USD SEK/USD 6 AUD/JPY SGD/GBP GBP/EUR MXN/CHF AUD/USD NZD/CNY 7 NZD/CHF SGD/CAD SGD/GBP CNY/CHF SEK/USD HKD/SEK 8 CAD/JPY CAD/AUD CAD/GBP SGD/SEK HKD/SEK EUR/USD 9 MXN/CHF NZD/MXN SGD/CAD SGD/CHF CHF/USD HKD/EUR 10 CHF/AUD CAD/GBP MXN/CAD HKD/SEK HKD/CHF SEK/CNY 6 Conclusions The proposed similarity metrics, based on Kendall and distance correlations, applied to the community detection algorithm give some differences in the re- sulting clusters obtained in [7], which could be explained by their ability to detect non linear dependences between currency exchange rates. In the case of the distance correlation the differences are more conspicuous showing clearer separations (and classification) of investment currencies from funding curren- cies. On the algorithmic aspect, the clustering method with distance correlation being able to work with a network of half the size (due to not needing to add the inverses of exchange rates) greatly reduces the computational cost of running the algorithm. Indeed, given that the graph is complete, halving the number of nodes divides the number of edges approximately by four. Since the algorithm used to minimise the Potts Hamiltonian is roughly linear on the number of edges, the total computational cost of clustering the distance correlation network is ap- proximately one fourth compared to that of the Pearson and Kendall correlation networks. This reduction in computational complexity can be significant when working with large sets of currencies. Software used for the data analysis. All the algorithms in this project were implemented in R [12] with the additional packages igraph[6], Quandl [11], mcclust[8] and ggplot2 [15]. Acknowledgments. A. Arratia acknowledge support of MINECO project AP- COM (TIN2014-57226-P), and Gen. Cat. SGR2014-890 (MACDA). References 1. Arratia, A.: Computational Finance, An Introductory Course with R. Atlantis Press (2014) 2. Bank for International Settlements, Monetary and Economic Department: Trien- nal Central Bank Survey . Foreign exchange turnover in 2013: preliminary global results (2013) 3. Blondel, V.D., Guillaume, J., Lambiotte, R., Lefebvre, E.: Fast unfolding of com- munities in large networks. Journal of Statistical Mechanics: Theory and Experi- ment (2008) 4. Brunnermeier, M.K., Nagel, S., Pedersen, L.H.: Carry trades and currency crashes. Tech. rep., National Bureau of Economic Research (2008) 5. Chen, S., Ma, B., Zhang, K.: On the similarity metric and the distance metric. Theoretical Computer Science 410(24), 2365 – 2376 (2009) 6. Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal Complex Systems, 1695 (2006) 7. Fenn, D.J., Porter, M.A., Mucha, P.J., McDonald, M., Williams, S., Johnson, N.F., Jones, N.S.: Dynamical clustering of exchange rates. Quantitative Finance 12(10), 1493–1520 (2012) 8. Fritsch, A.: mcclust: Process an MCMC Sample of Clusterings (2012), r package version 1.0 9. Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74 (May 2006) 10. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in net- works. Phys. Rev. E 69, 026113 (Feb 2004) 11. Quandl: Federal reserve economic data, https://www.quandl.com/data/FRED 12. R Core Team: R: A Language and Environment for Statistical Computing. R Foun- dation for Statistical Computing, Vienna, Austria (2015) 13. Renedo-Mirambell, M.: Detecting clusters and their dynamics in the forex market. Bachelor Sc. Thesis, Universitat Politécnica de Catalunya (Barcelona Tech) (2016) 14. Székely, G.J., Rizzo, M.L., Bakirov, N.K.: Measuring and testing dependency by correlation of distances. The Annals of Statistics 35(6), 2769–2794 (2007) 15. Wickham, H.: ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York (2009) Appendix Fig. 3. FX community evolution with distance correlation