
Towards Extracting Causal Graph Structures from Trade Data and Smart Financial Portfolio Risk Management

Ployplearn Ravivanpong

Till Riedel

Pascal Stock

Frankfurt School of Finance and Management, Adickesallee 32-34, 60322 Frankfurt am Main, Germany; Karlsruhe Institute of Technology, Kaiserstraße 12, 76131 Karlsruhe, Germany

Risk managers of asset management companies monitor portfolio risk metrics such as the Value at Risk in order to analyze the risks, communicate them to portfolio managers in a timely manner, and ensure regulatory compliance. They must investigate the possible causes if a portfolio risk increases significantly or breaches a regulatory limit. However, monitoring can quickly become overwhelming as well as time- and labor-intensive, because each risk manager has to deal with over a hundred portfolios, numerous daily market data, and hundreds of risk factors of the supervised portfolios and of their securities. In particular, understanding the interrelations between incidents in different portfolios beyond high-level indicators is important, yet analyzing these interrelations manually is one of the most difficult tasks. In this paper, we describe and demonstrate how automatically generated causal graphs can address the capacity problem of practitioners in risk management, who face a growing volume of capital-market-based risk data every day on the portfolio level alone. Based on a proof-of-concept implementation, we compare a pairwise causal-inference-based approach with a clustering-based construction approach. We discuss the advantages and disadvantages of both approaches, both computationally and with regard to the resulting structure. Based on our initial findings, we outline further challenges and research topics.

Keywords: risk management, causal inference, agglomerative hierarchical clustering, network visualization

Published in the Workshop Proceedings of the EDBT/ICDT 2022 Joint Conference (March 29-April 1, 2022), Edinburgh, UK
ployplearn.ravivanpong@kit.edu (P. Ravivanpong); till.riedel@kit.edu (T. Riedel); p.stock@fs.de (P. Stock)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

1. Introduction

Although the financial risk management domain is mature with regard to the econometric models it employs, is well regulated, and requires high transparency, we believe that it can benefit from machine learning to improve daily portfolio risk management. A risk manager in an asset management company must maintain an overview of 100 to 250 portfolios daily. Each of these portfolios includes an aggregate of over 160 risk factors. The most important risk factors are equity risk factors for stocks, interest rate and yield-curve risk factors for fixed-income securities, and foreign exchange risk factors. Without modern risk management software and aggregate risk measures like the Value-at-Risk (VaR), which is required by European and, recently, US regulation, it would be impossible for one risk manager to manage and analyze over 16,000 portfolio risk factors manually on a daily basis. Nonetheless, if there is a regulatory risk limit breach, a deep causal analysis of the portfolios is necessary to mitigate risk and potential losses. Such analyses are time-intensive and require manually examining on average 98 securities, each having over 160 risk factors. An intelligent automation of the analyses is thus needed to enable more efficient root cause analyses.

When allocating a part of the portfolio to assets with a lower risk profile, e.g. German government bonds, portfolio managers take care that these assets are minimally correlated to those assets in the portfolio with high risks, e.g. emerging market stocks and bonds. Yet a causal relationship between underlying assets that is undetected by traditional methods, or unknown to risk and portfolio managers, can unexpectedly lead to a high correlation between several portfolios of different types during a small crisis. In one instance, the VaR ratio of an energy portfolio and a South East Asian (SEA) small-market-capitalization portfolio increased significantly. To trace the possible causes, a risk manager examined the marginal losses of assets in both portfolios. The stocks of the state-owned Venezuelan oil company were found to be the cause of the increased VaR ratio of the energy portfolio. Only by examining model parameters was it found, two days after the incident, that nine SEA companies supplied machines to the Venezuelan oil company. Both portfolios indirectly suffered from the US embargo on the Venezuelan government. Any technology that helps to uncover such a "causal mechanism" will significantly reduce the immense time needed to complete the analysis.

Such root cause analysis of a change in risk profile or of a limit breach can be addressed by employing causal graphs, graph visualization, and graph anomaly detection. If causal relationships between assets and market factors can be reliably derived, we can monitor their dynamics and anomalous development using network analysis and visualization to obtain an overview across all portfolios, even though the network may only suggest the causal chains.

Within an exploratory research project with a German asset management company, we applied existing methods to derive a causal graph of securities based on a pairwise analysis of a subset of portfolios. Our preliminary results, however, did not meet the user requirements. Consequently, we employed agglomerative hierarchical clustering (AHC) of portfolio risk profiles with network visualization as a simpler and more practical alternative. In this paper, we describe the application requirements and our approaches to the technical as well as requirement challenges, and demonstrate the use of AHC and network visualization to support investment portfolio risk management in practice. We discuss the advantages and shortcomings of the approaches, and finally outline our perspective on further research topics for the presented application.

2. Background and Related Work

A causal graph or a causal diagram is a directed graph that visualizes causal relationships between variables. A node represents a variable. A directed edge $X \to Y$ means that $X$ is the direct cause of $Y$, i.e. changing $X$ will result in a change in $Y$, all else being equal [1]. To understand the interaction between market actors and risk factors, the information reflecting the real activity of companies (e.g. net cash-flow) should be used to derive the causal graph (footnote 1). Since it is infeasible to acquire and curate structured data of such information, trade data are usually used as a proxy, assuming that this information is reflected in security prices according to the efficient market hypothesis [2]. Typically, a 1-day return $r_t = (P_t - P_{t-1}) / P_{t-1}$ is used, where $P_t$ denotes the closing price on trading day $t$; it is the percentage change of today's (closing) price from the previous trading day. In this case, a node in the resulting causal graph represents a security or an economic factor, such as the interest rate or the oil price. An edge $X \to Y$ can be viewed generally as the Granger causal direction, i.e. $X$ contains information to forecast $Y$ [3]. Risk managers can trace the paths of the information flow to the security in question and use them as a basis to determine the actual causation.

Footnote 1: The graph is also known as a financial network.

The choice of an algorithm to derive a causal graph from time series depends on the assumptions (footnote 2) that can be made about the underlying causal structure: linear vs. non-linear dependency, acyclicity, causal sufficiency (no unmeasured confounder [5]), and contemporaneous relationship. Although risk models typically assume linear statistical dependency, the presence of non-linear dependency has also been detected [6][7]. A cyclical relationship, or a direction switch due to structural changes, between economic factors has also been documented [8]. Securities that are not publicly traded, also known as over-the-counter (OTC) securities, can be viewed as potential unmeasured confounders in the data. Yet, information on the current condition of unobserved companies may be reflected in the market data and the prices of publicly traded securities, because portfolio managers incorporate it in their asset allocation strategies and trading behavior [2]. Assuming at least semi-strong market efficiency thus allows us to assume causal sufficiency to a certain extent. We do not take contemporaneous relationships into account yet, in order to avoid noise created by hyper-traders and short-lived panic trading behavior. Keeping these conditions in mind, suitable causal discovery methods must be capable of deriving causal relationships from a large number of time series, be non-parametric, and not strictly impose acyclicity.

Footnote 2: Since we assume that the structural causal model exists in our case, we also assume the Markov condition. Additionally, we assume faithfulness, allowing us to infer dependencies from the resulting graph [4].

Considering the above assumptions together with the technical requirements from risk management, the existing causal inference methods (footnote 3) that are readily applicable, satisfy the majority of the conditions, and are theoretically scalable are Effective Transfer Entropy (ETE) [10] and the Peter-Clark Momentary Conditional Independence (PCMCI) method [11].

Footnote 3: We initially also considered the Temporal Causal Discovery Framework (TCDF) [9] because of its ability to detect the presence of hidden variables (which is beneficial yet not strictly necessary). However, the method was rejected due to its use of a complex black-box Attentive Convolutional Neural Network.

3. Data and Requirements

The choice of approaches depends not only on the available data but also on the technical and user requirements. The available data for our study are the internal portfolio risk data and the Deutsche Börse Public Dataset (PDS). The internal data consist of portfolio metadata, such as the ISIN, class of the assets, and their aggregate portfolio risk measures (e.g. UCITS gross exposure, present values) and risk factors (e.g. change in oil prices, interest rates). The portfolio risk measure data are longitudinal. Each portfolio has several time series of risk measures and factors. Since portfolios contain different asset types, like only stocks or bonds, or a mixture of them, some risk measures do not exist for all portfolios, resulting in missing values. The PDS consists of the initial price, lowest price, highest price, final price, and volume of all securities traded on the Eurex and Xetra trading systems, aggregated in one-minute intervals [12]. The internal data set can be linked with the PDS using the asset ISIN. However, the investigated portfolios also contain currencies, OTCs, and derivatives. These assets cannot be matched with the available trade data. Since the internal data are aggregated on a daily basis, only the daily closing price of the trade data is relevant.

Risk management requires a combination of high transparency, explainability, and timely detection as well as action. Transparency and explainability are required not only for external communication to institutional investors and the regulator, like the German BaFin, but also for internal communication to portfolio managers. This currently means that all the variables that enter a model and the underlying assumptions (e.g. for proxy variables) must be known, while the reasons they are used and their effects on the models must be understood. Furthermore, the selected methods must enable a risk manager to derive and describe the mechanism of risk development to stakeholders and regulators. Hence, black-box models are rejected by default unless there is an acceptable justification, while generated features must have economic or statistical meanings.

Given that it usually takes 48 hours for senior risk managers to manually find the root cause of a risk limit breach and that the complex traditional risk model calculation is finished overnight, the run time of the selected algorithm must ideally be within 12 hours on the available compute resources (we assumed a workstation with 24 CPU cores as a realistic price tag).

4. Deriving Causal Graph of Underlying Assets

We initially chose Effective Transfer Entropy (ETE) to build the graph in our experiments. Transfer entropy (TE) is an information-theoretic measure that quantifies the information flow between two time series, while ignoring their static correlations due to a common cause [13]. The method is non-parametric and thus can detect both linear and non-linear statistical dependencies. Moreover, it is well researched and widely applied in causality learning from financial data [6][14][15][16][17]. We use the Rényi entropy as a basis for the TE because it uses a weighting parameter, which allows us to focus on different areas of the distribution [10]. The ETE is the TE of the pair of original time series adjusted by the TE of the shuffled data. This enables the ETE method to detect small effects despite the limited data and many variables that are typical of financial time series [6]. A significance test is performed on the ETE of each security pair in each direction. Only edges with a positive ETE and a p-value less than a pre-defined threshold (e.g. 5%) are kept. The result is a causal graph (footnote 4).

Footnote 4: [4] cautions that the size of TE should not be viewed as a quantitative causal strength. Therefore, it can be interpreted at best as stronger or weaker dependencies, analogous to the interpretation of correlation.

The major drawback of the ETE is the number of hyperparameters to consider. These include the number of lags, the discretization method, the Rényi entropy weighting parameter, the number of shuffles, and the number of bootstraps for statistical inference. A sensitivity analysis can be performed to get a robust result, but is computationally expensive. Also, the ETE does not address the hidden variable problem and eventually suffers from the curse of dimensionality as the number of variables and lags gets larger [18]. Nevertheless, the ETE is chosen because the concept can be interpreted as a non-parametric Granger causality and its application is accepted in the financial domain.

The PCMCI is designed under the causal discovery framework and with parallelization in mind. It uses the PC algorithm to first quickly identify potential causes of a variable of interest and then prunes them using a conditional independence test. As a result, the algorithm can detect small effects, given a significance level. It also has fewer hyperparameters to consider. Apart from the number of lags and the number of variable combinations in the conditional independence test, users only have to specify the significance level to limit the false positive rate. By using the CMI non-parametric test, the method can identify both linear and non-linear dependencies [11]. Despite the algorithm's assumptions of no hidden variables, acyclicity, and stationary time series, it is selected for our preliminary study because it is based on conditional independence tests, a concept familiar in finance, can output a temporal causal graph, and is parallelizable.
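To make the shuffle adjustment behind the ETE concrete, the following is a minimal sketch and not the implementation used in the study. It assumes Shannon entropy instead of the Rényi variant, a single lag, and tercile discretization. The function names and the default of 50 shuffles are illustrative choices, and the bootstrap significance test is omitted.

```python
import math
import random
from collections import Counter


def returns(prices):
    """1-day simple returns r_t = (P_t - P_{t-1}) / P_{t-1}."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def discretize(series, n_bins=3):
    """Map each value to a quantile bin index in 0..n_bins-1."""
    ranked = sorted(series)
    cuts = [ranked[len(ranked) * k // n_bins] for k in range(1, n_bins)]
    return [sum(v >= c for c in cuts) for v in series]


def transfer_entropy(x, y):
    """Plug-in Shannon transfer entropy from x to y (lag 1), in bits.

    x and y are equally long sequences of discrete symbols.
    """
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))  # (y_next, y_prev, x_prev)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))        # (y_prev, x_prev)
    pairs_yy = Counter(zip(y[1:], y[:-1]))         # (y_next, y_prev)
    singles = Counter(y[:-1])                      # y_prev
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_given_both = count / pairs_yx[(y0, x0)]       # p(y_next | y_prev, x_prev)
        p_given_own = pairs_yy[(y1, y0)] / singles[y0]  # p(y_next | y_prev)
        te += (count / n) * math.log2(p_given_both / p_given_own)
    return te


def effective_transfer_entropy(x, y, n_shuffles=50, seed=0):
    """TE(x -> y) minus the mean TE obtained after shuffling x.

    Shuffling destroys the temporal link between x and y, so the mean
    shuffled TE estimates the small-sample bias of the plug-in estimator.
    """
    rng = random.Random(seed)
    observed = transfer_entropy(x, y)
    shuffled_total = 0.0
    for _ in range(n_shuffles):
        xs = list(x)
        rng.shuffle(xs)
        shuffled_total += transfer_entropy(xs, y)
    return observed - shuffled_total / n_shuffles


# Synthetic example: y follows x with a one-day delay plus a little noise.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(600)]
y = [0.0] + [x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 600)]
xd, yd = discretize(x), discretize(y)
print("ETE x -> y:", round(effective_transfer_entropy(xd, yd), 3))
print("ETE y -> x:", round(effective_transfer_entropy(yd, xd), 3))
```

On this synthetic pair the sketch gives a clearly positive ETE from x to y and a value near zero in the reverse direction. Repeating the shuffling for every ordered pair of securities is what makes the pairwise procedure expensive.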

We applied both ETE and PCMCI to the 1-day returns of securities in the portfolios. Missing values due to non-trading activity are filled with the last closing price for simplicity. Seldom traded securities, whose data contain mostly missing values, are removed. The results were of mixed success. First, we encountered a computational challenge when applying both methods using their original implementations. Although both methods have been applied to financial data as a proof of concept, none of the experiments involved over seven hundred time series. Even though we managed to improve the parallelization of ETE, it still took almost six hours on 24 CPUs and 72 GB RAM to analyze roughly 700 time series of over two years of daily returns of the investigated securities that are part of the given sample of 50 portfolios. The long run time for ETE was due to shuffling and bootstrapping in combination with the pairwise test. The CMI, which is a fully non-parametric test, in the second part of PCMCI has a long run time and does not scale well to a high number of time series. With the computational resources that could realistically be made available at the time, it could not be ensured that the run time would not significantly exceed 12 hours when analyzing additional thousands of assets and economic factors. As the algorithm is subject to quadratic scaling, we saw it as impractical given that we had only analyzed a relatively small subset. Also, the resulting causal graph, in which each asset is represented as a node, is complex to interpret both manually and automatically. An accuracy evaluation was infeasible without access to knowledge of existing risk models.

5. Agglomerative Hierarchical Clustering of Risk Profiles

Due to the complexities of asset-level causal inference as a basis for identifying risk drivers and their interrelationship, it makes sense to look for alternative approaches.

One major requirement should be that such an approach: (i) intuitively shows the relationship to the monitored risk indicators; (ii) can be efficiently calculated automatically on a continuous basis; and (iii) creates structures that expose causal relationships beyond already established classification schemes.

In contrast to an eager bottom-up approach as presented above, an alternative might lie in a top-down approach that examines the indicator in question. The interrelationship between different portfolios should already be captured in the risk models, and thus be observable at the indicator level. Indicators have the advantage that they are already normalized and are designed for comparison. However, they are typically not used to structure information. This can be done by employing clustering analysis on the time series created by those indicators. While measuring distance between totally different kinds of portfolios in all market situations might be of limited value, subspace clustering approaches have the advantage that they can build hierarchies based on local similarities within subgroups. Our assumption is that the subgroups and their re-formation, given different market situations, might expose common risk drivers, without having to analyze the underlying assets.

For an exploratory study to uncover an alternative grouping of portfolios, a suitable clustering method must not require users to specify the number of clusters. Among such methods, agglomerative hierarchical clustering (AHC) is the most practical. The nearest-neighbor chain algorithm, which is available in most statistical software, is fast and always yields the same clustering if none of the instance pairs have the same minimum distance. The clustering result can be visualized as a dendrogram, regardless of the number of features, showing the class hierarchy. Due to these advantages, especially for the exploration of new categorization, we select the AHC for our application.

The AHC method was originally developed for cross-sectional data. The adaptation for longitudinal data clustering is a three-step decision process. First, one decides if the clustering should be performed along the time axis (whole time series clustering or rolling window clustering) or on all features at each time point while ignoring the time axis (over-time clustering). The second step is to choose the similarity measure. Four approaches are usually found in practice [19][20]: shape-based, feature-based, model-based, and compression-based. In the last step, one chooses the clustering algorithm (linkage method). Since model-based and compression-based approaches generate complex features and transparency plays a major role in our case, we are limited to shape-based and feature-based similarity measures. Figure 2 summarizes the two main approaches for our use-case.
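The three decisions can be illustrated with a small sketch: clustering along the time axis on the most recent window, a shape-based Euclidean distance, and Ward's criterion as the linkage. This is a naive illustration under stated assumptions and not the study's implementation. The portfolio names and values are made up, the helper names are hypothetical, and the cubic-time loop stands in for the nearest-neighbor chain algorithm that statistical libraries provide (for example SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"`).

```python
def latest_window(series_by_name, window):
    """Keep only the most recent `window` observations of every series."""
    return {name: values[-window:] for name, values in series_by_name.items()}


def ward_merges(profiles):
    """Agglomerate all profiles bottom-up and return the merge history.

    Each entry is (members_a, members_b, cost), where cost is Ward's increase
    in the within-cluster sum of squared Euclidean distances.
    """
    # Every cluster is stored as (member names, centroid vector).
    clusters = [([name], list(values)) for name, values in profiles.items()]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                sq_dist = sum((a - b) ** 2 for a, b in zip(ci, cj))
                cost = len(mi) * len(mj) / (len(mi) + len(mj)) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        size = len(mi) + len(mj)
        centroid = [(len(mi) * a + len(mj) * b) / size for a, b in zip(ci, cj)]
        merges.append((sorted(mi), sorted(mj), cost))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((mi + mj, centroid))
    return merges


# Hypothetical risk indicator series for four portfolios (oldest value first).
var_ratio = {
    "equity_A": [0.9, 0.2, 1.0, 1.1, 1.0, 1.2],
    "equity_B": [0.1, 0.8, 1.0, 1.0, 1.1, 1.2],
    "bond_C": [2.0, 2.5, 3.0, 3.1, 2.9, 3.0],
    "bond_D": [3.5, 1.0, 3.1, 3.0, 3.0, 2.9],
}
for a, b, cost in ward_merges(latest_window(var_ratio, window=4)):
    print(a, "+", b, f"cost={cost:.3f}")
```

The merge history is the information a dendrogram displays: no number of clusters has to be fixed in advance, and re-running the function on a shifted window updates the hierarchy.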

Each of the two approaches presents a different view of risk similarity.

5.1. Rolling window clustering

Rolling window clustering is a variation of whole time series clustering, which groups a set of similar time series together [19]. Since the risk and market indicators are updated almost daily and their previous-year values should contribute little to their current trend, we only need to cluster the recent values and update the clustering with rolling windows. Initially, both shape-based and feature-based approaches were under consideration. Even though the feature-based approach is more robust against missing values and is suitable for multivariate time series, the preliminary analysis showed that the generated features make interpretation more difficult and hinder quick communication. Fortunately, we can avoid the multivariate time series conversion and the missing value problems in the shape-based approach by using the VaR exposure ratio. The VaR exposure ratio is calculated from all relevant risk factors of a portfolio and scaled by its reference portfolio. Due to the strict regulatory requirements laid down in the German Derivative Regulation, it must be calculated for all regulated portfolios and may not contain missing values. Hence, the shape-based approach is preferred. Ward's linkage method is selected for its ability to identify distinct clusters. Consequently, we are limited to the Euclidean distance.

An example of the rolling window clustering of portfolios based on their VaR exposure ratio of the last 90 trading days results in groups of portfolios as shown in Figure 3. The network visualization is created by applying the minimum spanning tree (MST) algorithm to the cophenetic distance matrix [20]. The force-based Fruchterman-Reingold layout is best suited for the visualization, as it places cluster members close to each other while separating the clusters from one another as much as possible. The traffic light color scheme corresponding to risk limit breach frequency helps risk managers to spot high-risk portfolios. A qualitative evaluation shows that portfolios of the same theoretical type are correctly grouped together, but with some exceptions. For example, the German equity portfolios lay close together in a cluster as expected. Yet, one of the portfolio manager's most important defensive multi-asset funds for retail investors and private wealth management was also among the cluster members. Such a close association between a defensive multi-asset fund (invested only 30% in global stocks and 70% in bonds) and several German equity flagship funds (100% German stocks) was not obvious at first glance. A look into their shared risk factors within their cluster, which took a few minutes, showed that the newly listed stock of Siemens Healthineers was the common cause, because it experienced an increase in volatility on its first trading days and all the portfolios were heavily invested in shares of Siemens AG, the parent company. Such an analysis, which would have taken an hour or more even though the IPO of the spin-off was well known, became visible and obvious in a few minutes for the risk managers using the AHC and network visualization.

5.2. Over-time clustering

The result of the rolling window clustering is affected by the window size. Besides, by using only one aggregated indicator, the method ignores additional information that may be contained in other risk measures. In order to take into account multiple variables while keeping feature engineering at a minimum, one can treat the data at each time point as cross-sectional data and perform a cluster analysis on them. The approach is known as over-time clustering [21]. This means we assume that the data at each time point already contain all information from their lags, allowing us to ignore the time feature in the clustering. By keeping a fixed set of variables to be clustered, one can analyze the cluster dynamics over time and thus identify anomalous development [21].

Figure 4: The overall distance at each time point (top), its daily percentage change (middle) and the VSTOXX and VIX volatility indices (bottom).

We applied the over-time clustering using Euclidean distance and Ward's linkage on around 30 risk measures. Since we are interested in the dynamics, only variables that exist for all portfolios and seldom have missing values, excluding the VaR exposure ratio, are used. Figure 4 compares the final Ward's linkage distance, or overall distance, at each time point (top plot) to the VSTOXX and VIX volatility indices (bottom plot). They measure the implied volatility of the stocks in the EURO STOXX 50 index and the S&P 500 index. These indices are commonly used as indicators of the overall stock market volatility, and thus the implied risk. A lower overall distance reflects a higher similarity between portfolio risk profiles, indicating that some common risk factors are exerting their influence across the board. A steep decrease in the overall distance is found to coincide with high market volatility. Our Granger causality test of information flow showed that the overall distance sometimes leads the volatility indices. However, since the quality of over-time clustering depends heavily on the set of selected variables, and maintaining the data quality and delivering the data on time demands high effort, the overall distance is not yet a useful addition to traditional early warning indicators such as volatility indices.

6. Discussion

The applied research has shown that the use of AHC and network visualization helps risk managers to quickly get an overview of hundreds of portfolios. The most important insight from a practical perspective was the ability to see the effect of risk factors on portfolios across traditional asset classes like equities, bonds, foreign currencies, etc. The traditional set-up of equity, multi-asset, and bond portfolios, including overlay strategies using more complicated derivative strategies, did not allow a quick analysis of risk factors across asset classes and thus across different portfolio types. Focusing on the portfolio with the highest risk concentrations and the (theoretically unlikely) neighboring portfolios in its cluster allows risk managers to perform a more targeted root cause analysis before risk limit breaches occur. An analysis that would have taken over an hour can be carried out in a few minutes with the combination of AHC and network visualization.

The combination of AHC analysis on the VaR exposure ratio and network visualization clearly supersedes the use of a causal graph regarding practicality and technical requirements. Yet, the method is not without shortcomings. The window size affects the clustering results. Although a sensitivity analysis can be performed and the results qualitatively evaluated, the window size cannot be systematically optimized, because we cannot quantitatively measure which clustering results are more reasonable in an exploratory analysis. Besides, the rolling window AHC is based only on the VaR exposure ratio, which is based on risk models that assume linear dependency and are handcrafted from expert knowledge. Thus the AHC only summarizes existing information and presents it in a more structured way to risk and portfolio managers. As such, it still misses signals that experts are unaware of, and little new knowledge is gained. Since the resulting network is undirected, the AHC is unable to suggest the possible causal paths that could further reduce the deep-dive analysis effort should a risk limit breach occur. It is also debatable how early the AHC can visualize the weak effect of some common causes between portfolios before the effect becomes apparent. The AHC may be a promising and practical solution in the medium term, but a practical causal graph is desirable in the long term.

7. Conclusion and Outlook

If regulatory risk limit breaches or significant changes in the risk profile occur, a risk manager must manually analyze over ten thousand potential statistical sources in order to support portfolio managers and institutional investors. In this paper we explored how to extract supportive graph structures for this task based on real data provided to us by a large asset management company. Our initial bottom-up scheme to build such a causal graph based on public trade and portfolio data using ETE and PCMCI encountered a long run time, and the resulting network is too complex for human understanding. The absence of a reliable evaluation scheme prevented the approach from being readily used in practice. The combination of AHC of portfolio risk profiles based on their VaR exposure ratio and the network visualization of clustering results is a more easily deployable method and presents a more practical solution. While not being able to identify the causal chains directly, it fulfills the demand for transparency and gives risk managers a better overview of numerous statistical sources beyond existing categorization schemes and analysis strategies.

While we only present the results and experiences from a very preliminary proof-of-concept implementation, we believe that in the future the value of machine learning over manual analysis may increase if sources other than numerical trade data can be included in the analysis. Three challenges currently hinder practical application of causal discovery in risk management: scalability, visualization of a complex graph, and systematic evaluation of the resulting knowledge graph with minimal reliance on experts. We are currently investigating whether recent developments like the two-phase ensemble algorithm that enables PCMCI to better utilize distributed systems [22] may address scalability. The problem of overly complex visualization needs to be addressed with graph-reduction techniques, showing to risk managers only subgraphs with unusual development. We plan to extend our work to use Causal NLP, a technique to learn causal graphs from text. It may be used to evaluate the causal graph derived from trade and economic data. Providing human-readable annotations might also solve problems of human oversight when dealing with a large data set.

Acknowledgments

This work was supported by the German Federal Ministry of Research (BMBF) as part of the Smart Data Innovation Challenge (01IS19030A).

References

[1] J. Pearl, Causality, 2 ed., Cambridge University Press, United Kingdom, 2009.
[2] E. F. Fama, Efficient capital markets: A review of theory and empirical work, The Journal of Finance 25 (1970) 383–417. URL: http://www.jstor.org/stable/2325486.
[3] C. W. J. Granger, Investigating causal relations by econometric models and cross-spectral methods, Econometrica 37 (1969) 424–438. URL: http://www.jstor.org/stable/1912791.
[4] J. Peters, D. Janzing, B. Schölkopf, Elements of Causal Inference: Foundations and Learning Algorithms, The MIT Press, USA, 2017.
[5] R. Scheines, An Introduction to Causal Inference, University of Notre Dame Press, Notre Dame, 1997, pp. 185–200.
[6] R. Marschinski, H. Kantz, Analysing the information flow between financial time series: An improved estimator for transfer entropy, The European Physical Journal B 30 (2002) 275–281.
[7] A. Levin, A. Tchernitser, Chapter 11 - multifactor stochastic variance models in risk management: Maximum entropy approach and lévy processes*, in: S. T. Rachev (Ed.), Handbook of Heavy Tailed Distributions in Finance, volume 1 of Handbooks in Finance, North-Holland, Amsterdam, 2003, pp. 443–480.
[8] M. Beine, G. Capelle-Blancard, H. Raymond, International nonlinear causality between stock markets, The European Journal of Finance 14 (2008) 663–686.
[9] M. Nauta, D. Bucur, C. Seifert, Causal discovery with attention-based convolutional neural networks, Machine Learning and Knowledge Extraction 1 (2019) 312–340.
[10] S. Behrendt, T. Dimpfl, F. J. Peter, D. J. Zimmermann, Rtransferentropy – quantifying information flow between different time series using effective transfer entropy, SoftwareX 10 (2019) 100265.
[11] J. Runge, P. Nowack, M. Kretschmer, S. Flaxman, D. Sejdinovic, Detecting and quantifying causal associations in large nonlinear time series datasets, Science Advances 5 (2019) eaau4996.
[12] D. B. Group, Deutsche börse public dataset (DBG PDS), 2021. URL: https://github.com/Deutsche-Boerse/dbg-pds, original-date: 2017-08-24T10:22:58Z.
[13] T. Schreiber, Measuring information transfer, Phys. Rev. Lett. 85 (2000) 461–464.
[14] T. Dimpfl, F. J. Peter, Using transfer entropy to measure information flows between financial markets, Studies in Nonlinear Dynamics and Econometrics 17 (2013) 85–102.
[15] P. Jizba, H. Kleinert, M. Shefaat, Rényi's information transfer between financial time series, Physica A: Statistical Mechanics and its Applications 391 (2012) 2971–2989.
[16] L. Junior, A. Mullokandov, D. Kenett, Dependency relations among international stock market indices, Journal of Risk and Financial Management 8 (2015) 227–265.
[17] S. K. Stavroglou, K. Soramaki, K. Zuev, Causality networks of financial assets, The Journal of Network Theory in Finance 3 (2017) 17–67. doi:10.21314/JNTF.2017.029.
[18] J. Runge, J. Heitzig, V. Petoukhov, J. Kurths, Escaping the curse of dimensionality in estimating multivariate transfer entropy, Physical Review Letters 108 (2012) 258701.
[19] S. Aghabozorgi, A. Seyed Shirkhorshidi, T. Ying Wah, Time-series clustering – a decade review, Information Systems 53 (2015) 16–38.
[20] J. Papenbrock, Asset Clusters and Asset Networks in Financial Risk Management and Portfolio Optimization, Ph.D. thesis, Karlsruhe Institute of Technology, Karlsruhe, Germany, 2011.
[21] M. Tatusch, G. Klassen, M. Bravidor, S. Conrad, Show me your friends and i'll tell you who you are. finding anomalous time series by conspicuous cluster transitions, in: T. D. Le, K.-L. Ong, Y. Zhao, W. H. Jin, S. Wong, L. Liu, G. Williams (Eds.), Data Mining, volume 1127, Springer Singapore, Singapore, 2019, pp. 91–103.
[22] P. Guo, Y. Huang, J. Wang, Scalable and flexible two-phase ensemble algorithms for causality discovery, Big Data Research 26 (2021) 100252.