<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Extracting Causal Graph Structures from Trade Data and Smart Financial Portfolio Risk Management</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ployplearn Ravivanpong</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Till Riedel</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pascal Stock</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Frankfurt School of Finance and Management</institution>
          ,
          <addr-line>Adickesallee 32-34, 60322 Frankfurt am Main</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Karlsruhe Institute of Technology</institution>
          ,
          <addr-line>Kaiserstraße 12, 76131 Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Risk managers of asset management companies monitor portfolio risk metrics such as the Value at Risk in order to analyze risks, to communicate them to portfolio managers in a timely manner, and to ensure regulatory compliance. They must investigate the possible causes if a portfolio risk increases significantly or breaches a regulatory limit. However, monitoring can quickly become overwhelming as well as time- and labor-intensive, because each risk manager has to deal with over a hundred portfolios, numerous daily market data, and hundreds of risk factors of the supervised portfolios and of their securities. In particular, understanding the interrelations between incidents in different portfolios beyond high-level indicators is important, yet analyzing these interrelations manually is one of the most difficult tasks. In this paper, we describe and demonstrate how automatically generating causal graphs can address the capacity problem of practitioners in risk management, who face ever more capital-market-based risk data daily on the portfolio level alone. Based on a proof-of-concept implementation, we compare a pairwise causal-inference-based approach with a clustering-based construction approach. We discuss the advantages and disadvantages of both approaches, both computationally and based on the resulting structure. Based on our initial findings, we outline further challenges and research topics.</p>
      </abstract>
      <kwd-group>
        <kwd>risk management</kwd>
        <kwd>causal inference</kwd>
        <kwd>agglomerative hierarchical clustering</kwd>
        <kwd>network visualization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <p>Despite the fact that the financial risk management domain is already mature with regard to the employed econometric models, is well regulated, and requires high transparency, we believe that it can benefit from machine learning to improve daily portfolio risk management. A risk manager in an asset management company must maintain an overview of 100 to 250 portfolios daily. Each of these portfolios includes an aggregate of over 160 risk factors. The most important risk factors are equity risk factors for stocks, interest rate and yield-curve risk factors for fixed-income securities, and foreign exchange risk factors. Without modern risk management software and aggregate risk measures like the Value-at-Risk (VaR), which is required by European and, recently, US regulation, it would be impossible for one risk manager to manage and analyze over 16,000 portfolio risk factors manually on a daily basis. Nonetheless, if there is a regulatory risk limit breach, a deep causal analysis into portfolios is necessary to mitigate risk and potential losses. Such analyses are time intensive and require manually examining on average 98 securities, each having over 160 risk factors. An intelligent automation of the analyses is thus needed to enable more efficient root cause analyses.</p>
        <p>When allocating a part of the portfolio to assets with a lower risk profile, e.g. German government bonds, portfolio managers take care that these assets are minimally correlated to those assets in the portfolio with high risks, e.g. emerging market stocks and bonds. Yet a causal relationship between underlying assets, undetected by traditional methods or unknown to risk and portfolio managers, can unexpectedly lead to a high correlation between several portfolios of different types during a small crisis. In one instance the VaR ratio of an energy portfolio and a South East Asian (SEA) small-market-capitalization portfolio increased significantly. To trace the possible causes, a risk manager examined the marginal losses of assets in both portfolios. The stocks of the state-owned Venezuelan oil company were found to be the cause for the increased VaR ratio of the energy portfolio. Only by examining model parameters was it found out, two days after the incident, that nine SEA companies supplied machines to the Venezuelan oil company. Both portfolios indirectly suffered from the US embargo on the Venezuelan government. Any technology that helps to uncover such a “causal mechanism” will significantly reduce the immense time needed to complete the analysis.</p>
        <p>Such root cause analysis of a change in risk profile or of a limit breach can be addressed by employing causal graphs, graph visualization, and graph anomaly detection. If causal relationships between assets and market factors can be reliably derived, we can monitor their dynamics and anomalous development using network analysis and visualization to obtain an overview across all portfolios, even though the network may only suggest the causal chains.</p>
        <p>Within an exploratory research project with a German asset management company, we applied existing methods to derive a causal graph of securities based on a pairwise analysis of a subset of portfolios. Our preliminary results, however, did not meet the user requirements. Consequently, we employed agglomerative hierarchical clustering (AHC) of portfolio risk profiles with network visualization as a simpler and more practical alternative. In this paper, we describe application requirements, our approaches to the technical as well as requirement challenges, and demonstrate the use of AHC and network visualization to support investment portfolio risk management in practice. We discuss the advantages and shortcomings of the approaches, and finally outline our perspective on further research topics for the presented application.</p>
        <p>Published in the Workshop Proceedings of the EDBT/ICDT 2022 Joint Conference (March 29-April 1, 2022), Edinburgh, UK</p>
        <p>ployplearn.ravivanpong@kit.edu (P. Ravivanpong); till.riedel@kit.edu (T. Riedel); p.stock@fs.de (P. Stock)</p>
        <p>© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.</p>
        <sec>
          <title>2. Background and Related Work</title>
          <p>A causal graph or a causal diagram is a directed graph that visualizes causal relationships between variables. A node represents a variable. A directed edge <italic>X</italic> → <italic>Y</italic> means that <italic>X</italic> is the direct cause of <italic>Y</italic>, i.e. changing <italic>X</italic> will result in a change in <italic>Y</italic>, all else being equal [1]. To understand the interaction between market actors and risk factors, information reflecting the real activity of companies (e.g. net cash-flow) should be used to derive the causal graph<sup>1</sup>. Since it is infeasible to acquire and curate structured data of such information, trade data are usually used as a proxy, assuming that this information is reflected in security prices according to the efficient market hypothesis [2]. Typically, a 1-day return <inline-formula><tex-math>r_t = \frac{P_t - P_{t-1}}{P_{t-1}}</tex-math></inline-formula> is used, which is the percentage change of today’s (closing) price from the previous trading day. In this case, a node in the resulting causal graph represents a security or an economic factor, such as the interest rate or the oil price. An edge <italic>X</italic> → <italic>Y</italic> can be viewed generally as the Granger causal direction, i.e. <italic>X</italic> contains information to forecast <italic>Y</italic> [3]. Risk managers can trace the paths of the information flow to the security in question and use them as a basis to determine the actual causation.</p>
          <p>The choice of an algorithm to derive a causal graph from time series depends on the assumptions<sup>2</sup> that can be made about the underlying causal structure: linear vs. non-linear dependency, acyclicity, causal sufficiency (no unmeasured confounder [5]), and contemporaneous relationship. Although risk models typically assume linear statistical dependency, the presence of non-linear dependency has also been detected [6][7]. A cyclical relationship between economic factors, or a direction switch due to structural changes, has also been documented [8]. Securities that are not publicly traded, also known as over-the-counter (OTC) securities, can be viewed as potential unmeasured confounders in the data. Yet, information on the current condition of unobserved companies may be reflected in the market data and the prices of publicly traded securities, because portfolio managers incorporate it in their asset allocation strategies and trading behavior [2]. Assuming at least semi-strong market efficiency thus allows us to assume causal sufficiency to a certain extent. We do not take contemporaneous relationships into account yet, in order to avoid noise created by hyper-traders and short-lived panic trading behavior.</p>
          <p>Keeping these conditions in mind, suitable causal discovery methods must be capable of deriving causal relationships from a large number of time series, be non-parametric, and not strictly impose acyclicity. Considering the above assumptions together with the technical requirements from risk management, the existing causal inference methods<sup>3</sup> that are readily applicable, satisfy the majority of the conditions, and are theoretically scalable are Effective Transfer Entropy (ETE) [10] and the Peter-Clark Momentary Conditional Independence (PCMCI) method [11].</p>
          <p><sup>1</sup> The graph is also known as a financial network.</p>
          <p><sup>2</sup> Since we assume the structural causal model exists in our case, we also assume the Markov condition. Additionally, we assume faithfulness, allowing us to infer dependencies from the resulting graph [4].</p>
        </sec>
        <sec>
          <title>3. Data and Requirements</title>
          <p>The choice of approaches depends not only on the available data but also on the technical and user requirements. The available data for our study are the internal portfolio risk data and the Deutsche Börse Public Dataset (PDS). The internal data consist of portfolio metadata, such as the ISIN and class of the assets, their aggregate portfolio risk measures (e.g. UCITS gross exposure, present values), and risk factors (e.g. change in oil prices, interest rates). The portfolio risk measures data are longitudinal. Each portfolio has several time series of risk measures and factors. Since portfolios contain different asset types, like only stocks or bonds, or a mixture of them, some risk measures do not exist for all portfolios, resulting in missing values. The PDS consists of the initial price, lowest price, highest price, final price, and volume of all securities traded on the Eurex and Xetra trading systems, aggregated in one-minute intervals [12]. The internal data set can be linked with the PDS using the asset ISIN. However, the investigated portfolios also consist of currency,</p>
        </sec>
      </sec>
      <sec id="sec-1-2">
        <p>OTCs, and derivatives. These assets cannot be matched with the available trade data. Since the internal data are aggregated on a daily basis, only the daily closing price of the trade data is relevant.</p>
        <p>Risk management requires a combination of high transparency, explainability, and timely detection as well as action. Transparency and explainability are required not only for external communication to institutional investors and the regulator, like the German BaFin, but also for internal communication to portfolio managers. This currently means that all the variables that enter a model and the underlying assumptions (e.g. for proxy variables) must be known, while the reasons they are used and their effects on the models must be understood. Furthermore, the selected methods must enable a risk manager to derive and describe the mechanism of risk development to stakeholders and regulators. Hence, black-box models are rejected by default unless there is an acceptable justification, while generated features must have economic or statistical meaning.</p>
        <p>Given that it usually takes 48 hours for senior risk managers to manually find the root cause of a risk limit breach and that the complex traditional risk model calculation is finished overnight, the run time of the selected algorithm must ideally be within 12 hours on the available compute resources (we assumed a workstation with 24 CPU cores as a realistic price tag).</p>
        <p><sup>3</sup> We initially also considered the Temporal Causal Discovery Framework (TCDF) [9] because of its ability to detect the presence of hidden variables (which is beneficial yet not strictly necessary). However, the method was rejected due to its use of a complex black-box Attentive Convolutional Neural Network.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Deriving Causal Graph of Underlying Assets</title>
    </sec>
    <sec id="sec-3">
      <p>We initially chose Effective Transfer Entropy (ETE) to build the graph in our experiments. Transfer entropy (TE) is an information-theoretic measure that quantifies the information flow between two time series, while ignoring their static correlations due to common cause [13]. The method is non-parametric and thus can detect both linear and non-linear statistical dependencies. Moreover, it is well researched and widely applied in causality learning from financial data [6][14][15][16][17]. We use the Rényi entropy as a basis for the TE because it uses a weighting parameter, which allows us to focus on different areas of the distribution [10]. The ETE is the TE of the pair of original time series adjusted by the TE of the shuffled data. This enables the ETE method to detect small effects, a consequence of limited data and many variables, as is often the case with financial time series [6]. A significance test is performed on the ETE of each security pair in each direction. Only edges with positive ETE and a p-value less than a pre-defined threshold (e.g. 5%) are kept. The result is a causal graph<sup>4</sup>.</p>
      <p>The major drawback of the ETE is the number of hyperparameters to consider. These include the number of lags, the discretization method, the Rényi entropy weighting parameter, the number of shuffles, and the number of bootstraps for statistical inference. A sensitivity analysis can be performed to get a robust result, but is computationally expensive. Also, the ETE does not address the hidden variable problem and eventually suffers from the curse of dimensionality as the number of variables and lags gets larger [18]. Nevertheless, the ETE is chosen because the concept can be interpreted as a non-parametric Granger causality and its application is accepted in the financial domain.</p>
      <p>The PCMCI is designed under the causal discovery framework and with parallelization in mind. It uses the PC algorithm to first quickly identify potential causes of a variable of interest and then prunes them using a conditional independence test. As a result, the algorithm can detect small effects, given a significance level. It also has fewer hyperparameters to consider. Apart from the number of lags and the number of variable combinations in the conditional independence test, users only have to specify the significance level to limit the false positive rate. By using the non-parametric conditional mutual information (CMI) test, the method can identify both linear and non-linear dependencies [11]. Despite the algorithm's assumptions of no hidden variables, acyclicity, and stationary time series, it is selected for our preliminary study because it is based on conditional independence tests, a concept familiar in finance, can output a temporal causal graph, and is parallelizable.</p>
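      <p>As a rough illustration of the ETE construction described in this section (the TE of the original pair adjusted by the TE of shuffled data, followed by a significance test), the following sketch may help. It is a simplification under stated assumptions and not the implementation used in our experiments: it uses Shannon entropy instead of the Rényi entropy, a single lag, quantile binning into three states, and a shuffle-based p-value in place of bootstrapping. All function names and the synthetic series are hypothetical.</p>
<preformat>
```python
import math
import random
from collections import Counter


def discretize(series, n_bins=3):
    """Map each value to a quantile bin index in 0..n_bins-1."""
    ranked = sorted(series)
    # Bin edges sit at the empirical quantiles of the series itself.
    edges = [ranked[(len(ranked) * k) // n_bins] for k in range(1, n_bins)]
    return [sum(value >= edge for edge in edges) for value in series]


def transfer_entropy(source, target):
    """Shannon transfer entropy source -> target with one lag, in bits."""
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_all = Counter(triples)                          # (y_next, y_prev, x_prev)
    c_prev = Counter((yp, xp) for _, yp, xp in triples)
    c_own = Counter((yn, yp) for yn, yp, _ in triples)
    c_y = Counter(yp for _, yp, _ in triples)
    te = 0.0
    for (yn, yp, xp), count in c_all.items():
        p_joint = count / n
        p_given_both = count / c_prev[(yp, xp)]       # p(y_next | y_prev, x_prev)
        p_given_own = c_own[(yn, yp)] / c_y[yp]       # p(y_next | y_prev)
        te += p_joint * math.log2(p_given_both / p_given_own)
    return te


def effective_transfer_entropy(source, target, n_shuffles=200, seed=0):
    """Return (ETE, p-value): observed TE minus the mean TE of shuffled sources."""
    rng = random.Random(seed)
    observed = transfer_entropy(source, target)
    shuffled = []
    for _ in range(n_shuffles):
        permuted = source[:]
        rng.shuffle(permuted)                         # destroys temporal alignment
        shuffled.append(transfer_entropy(permuted, target))
    ete = observed - sum(shuffled) / n_shuffles
    # Share of shuffles with a TE at least as large as the observed one.
    p_value = (1 + sum(v >= observed for v in shuffled)) / (1 + n_shuffles)
    return ete, p_value


# Synthetic example (assumption): y follows x with a one-step delay.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [0.0] + [0.8 * x[t - 1] + 0.2 * rng.gauss(0, 1) for t in range(1, 1000)]
xd, yd = discretize(x), discretize(y)
ete_xy, p_xy = effective_transfer_entropy(xd, yd)
ete_yx, p_yx = effective_transfer_entropy(yd, xd)
# An edge x -> y would be kept if ete_xy is positive and p_xy is below the threshold.
```
</preformat>
      <p>Applied to every ordered pair of securities, such a test yields one candidate edge per direction, which is also why the run time grows quadratically with the number of time series.</p>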
      <sec id="sec-3-1">
        <p>We applied both ETE and PCMCI on the 1-day returns of securities in the portfolios. Missing values due to non-trading activity are filled with the last closing price for simplicity. Seldom traded securities, whose data contain mostly missing values, are removed. The results were of mixed success. First, we encountered a computational challenge when applying both methods using their original implementation. Although both methods have been applied to financial data as a proof of concept, none of the experiments involved over seven hundred time series. Even though we managed to improve the parallelization of ETE, it still took almost six hours on 24 CPUs and 72 GB RAM to analyze roughly 700 time series of over two years of daily returns of the investigated securities that are part of the given sample of 50 portfolios. The long run time for ETE was due to shuffling and bootstrapping in combination with the pairwise test. The CMI, which is a fully non-parametric test, in the second part of PCMCI has a long run time and does not scale well for a high number of time series. With the computational resources that could realistically be made available at the time, it could not be ensured that the run time would not significantly exceed 12 hours when analyzing additional thousands of assets and economic factors. As the algorithm is subject to quadratic scaling, we saw it as impractical given that we had only analyzed a relatively small subset. Also, the resulting causal graph, in which each asset is presented as a node, is complex to interpret both manually and automatically. An accuracy evaluation was infeasible without access to knowledge of existing risk models.</p>
        <p><sup>4</sup> [4] cautions that the size of TE should not be viewed as quantitative causal strength. Therefore, it can be interpreted at best as stronger or weaker dependencies, analogous to the interpretation of correlation.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Agglomerative Hierarchical Clustering of Risk Profiles</title>
    </sec>
    <sec id="sec-5">
      <sec id="sec-5-1">
        <p>Due to the complexities of asset-level causal inference as a basis for identifying risk drivers and their interrelationship, it makes sense to look for alternative approaches. One major requirement should be that such an approach: (i) intuitively shows the relationship to the monitored risk indicators; (ii) can be efficiently calculated automatically on a continuous basis; and (iii) creates structures that expose causal relationships beyond already established classification schemes.</p>
        <p>In contrast to an eager bottom-up approach as presented above, an alternative might lie in a top-down approach that examines the indicator in question. The interrelationship between different portfolios should already be captured in the risk models, and thus be observable at the indicator level. Indicators have the advantage that they are already normalized and are designed for comparison. However, they are typically not used to structure information. This can be done by employing clustering analysis on the time series created by those indicators. While measuring distance between totally different kinds of portfolios in all market situations might be of limited value, subspace clustering approaches have the advantage that they can build hierarchies based on local similarities within subgroups. Our assumption is that the subgroups and their re-formation, given different market situations, might expose common risk drivers, without having to analyze the underlying assets.</p>
        <p>For an exploratory study to uncover an alternative grouping of portfolios, a suitable clustering method must not require users to specify the number of clusters. Among such methods, agglomerative hierarchical clustering (AHC) is the most practical. The nearest-neighbor chain algorithm, which is available in most statistical software, is fast and always yields the same clustering if none of the instance pairs have the same minimum distance. The clustering result can be visualized as a dendrogram, regardless of the number of features, showing the class hierarchy. Due to these advantages, especially for the exploration of new categorization, we select the AHC for our application.</p>
        <p>The AHC method was originally developed for cross-sectional data. The adaptation for longitudinal data clustering is a three-step decision process. First, one decides if the clustering should be performed along the time axis (whole time series clustering or rolling window clustering) or on all features at each time point while ignoring the time axis (over-time clustering). The second step is to choose the similarity measure. Four approaches are usually found in practice [19][20]: shape-based, feature-based, model-based, and compression-based. In the last step, one chooses the clustering algorithm (linkage method). Since model-based and compression-based approaches generate complex features and transparency plays a major role in our case, we are limited to shape-based and feature-based similarity measures. Figure 2 summarizes the two main approaches for our use-case.</p>
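        <p>The first decision, clustering along the time axis versus clustering the cross-section at each time point, mainly changes how the data matrix is laid out before the same linkage method is applied. The sketch below illustrates both layouts with SciPy, using Euclidean distance and Ward's linkage. The function names, array shapes, and random data are assumptions for illustration, and the "overall distance" is taken here to be the height of the final Ward merge.</p>
<preformat>
```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def rolling_window_linkage(indicator, window=90):
    """Whole-series clustering on the most recent `window` days.

    indicator: array of shape (n_days, n_portfolios) holding one aggregated
    indicator per portfolio, for example a VaR exposure ratio.
    """
    recent = indicator[-window:]                 # keep the latest window only
    return linkage(recent.T, method="ward")      # rows = portfolios


def overall_distance_over_time(measures):
    """Over-time clustering: one cross-sectional clustering per day.

    measures: array of shape (n_days, n_portfolios, n_measures).
    Returns the height of the final Ward merge for each day.
    """
    heights = []
    for day in measures:
        Z = linkage(day, method="ward")          # rows = portfolios, columns = measures
        heights.append(Z[-1, 2])                 # last merge joins all portfolios
    return np.array(heights)


# Hypothetical data: 120 days of one indicator for 6 portfolios.
rng = np.random.default_rng(1)
indicator = rng.normal(size=(120, 6))
Z_roll = rolling_window_linkage(indicator, window=90)

# Hypothetical data: 5 days, 8 portfolios, 4 risk measures each.
measures = rng.normal(size=(5, 8, 4))
measures[3] *= 0.1                               # day 3: risk profiles move closer together
heights = overall_distance_over_time(measures)
```
</preformat>
        <p>In this toy setting the overall distance is lowest on the day where the profiles were made more similar, which mirrors the reading that a low overall distance indicates common risk factors acting across portfolios.</p>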
        <p>Each of them presents diferent views about risk
similarity.
5.1. Rolling window clustering
Rolling window clustering is a variation of whole time
series clustering, which groups a set of similar time
series together [19]. Since the risk and market indicators vestors and private wealth management was also among
are updated almost daily and their previous year values the cluster members. Such a close association between
should contribute little to their current trend, we only a defensive multi-asset fund (invested only 30% global
need to cluster the recent values and update the cluster- stocks and 70% in bonds) and several German equity
flaging with rolling windows. Initially, both shaped-based ship funds (100% German stocks), was not obvious at first
and feature-based approaches are under consideration. glance. A look into their shared risk factors within their
Even though the feature-based approach is more robust cluster, which took a few minutes, showed the newly
against missing values and is suitable for multivariate listed stock of Siemens Healthineers was the common
time series, it was also found during the preliminary anal- cause; because it experienced an increase in volatility on
ysis that the generated features add more interpretation its first trading days and all the portfolios were heavily
dificulty and reduces quick communication. Fortunately, invested in shares of the Siemens AG, the parent
comwe can avoid the multivariate time series conversion and pany. Such analysis that would have taken over an hour
the missing value problems in the shape-based approach or more, even though the IPO of the spin-of was well
by using the VaR exposure ratio. The VaR exposure ratio known, became visible and obvious in a few minutes for
is calculated from all relevant risk factors of a portfolio the risk managers using the AHC and network
visualizaand scaled by its reference portfolio. Due to the strict tion.
regulatory requirements laid down in the German
Derivative Regulation, it must be calculated for all regulated 5.2. Over-time clustering
portfolios and may not contain missing values. Hence,
the shape-based approach is preferred. The Ward’s link- The result of the rolling window clustering is afected by
age method is selected for its ability to identify distinct the window size. Besides, by using only one aggregated
clusters. Consequently, we are limited to the Euclidean indicator, the method ignores additional information that
distance. may be contained in other risk measures. In order to take
into account multiple variables while keeping feature
engineering at a minimum, one can treat the data at each
time point as cross-sectional data and perform a cluster
analysis on them. The approach is known as over-time
clustering [21]. This means we make an assumption that
all the data at each time point already contain all
information from its lags, allowing us to ignore the time feature
in the clustering. By keeping a fixed set of variables to
be clustered, one can analyze the cluster dynamics over
time and thus identify anomalous development [21].</p>
      </sec>
      <sec id="sec-5-2">
        <title>An example of the rolling window clustering of port</title>
        <p>folios based on their VaR exposure ratio of the last 90
trading days results in groups of portfolios as shown in
Figure 3. The network visualization is created by
applying the minimum spanning tree (MST) algorithm to
the cophenetic distance matrix [20]. The force-based
Fruchterman-Rhiengold layout is the best suited for
visualization as it put cluster members as to each other
while separating the clusters from one another as much
as possible. The trafic light color scheme corresponding
to risk limit breach frequency helps risk managers to
spot high-risk portfolios. A qualitative evaluation shows Figure 4: The overall distance at each time point (top), its
that portfolios of the same theoretical type are correctly daily percentage change (middle) and the VSTOXX and VIX
grouped together, but with some exceptions. For exam- volatility indices (bottom).
ple, the German equity portfolios lied close together in a
cluster as expected. Yet, one of the portfolio manager’s We applied the overtime clustering using Euclidean
most important defensive multi-asset fund for retail in- distance and Ward’s linkage on around 30 risk measures.
Since we are interested in the dynamics, only variables which is based on risk models that assume linear
dethat exist for all portfolios that seldom have missing val- pendency and are handcrafted from expert knowledge.
ues, excluding the VaR exposure ratio, are used. Figure Thus the AHC only summarizes existing information and
4 compares the final Ward’s linkage distance or overall presents them in a more structured way to risk and
portdistance at each time point (top plot) to the VSTOXX and folio managers. As such, it still misses signals that experts
VIX volatility indices (bottom plot). They measure the im- are unaware of and little new knowledge is gained. Since
plied volatility of the stocks in the EuroSTOXX50 index the resulting network is undirected, the AHC is unable
and the S&amp;P500 index. These indices are commonly used to suggest the possible causal paths that could further
as indicators of the overall stock market volatility, and reduce deep dive analysis efort should a risk limit breach
thus the implied risk. The lower overall distance reflects occur. It is also debatable how early the AHC can
visuthe higher similarity between portfolio risk profiles, indi- alize the weak efect of some common causes between
cating that some common risk factors are exerting their portfolios before the efect becomes apparent. The AHC
influence across the board. A steep decrease in the overall may be a promising and practical solution in the medium
distance is found to coincide with high market volatility. term. But a practical causal graph is desirable in the long
Our Granger causality test of information flow showed term.
that the overall distance sometimes leads the volatility
indices. However, since the quality of over-time
clustering depends heavily on the set of selected variables, and 7. Conclusion and Outlook
maintaining the data quality and delivering them timely
demands high efort, the overall distance is yet to be a
useful addition to the traditional early warning indicator
such as volatility indices.</p>
        <p>If regulatory risk limit breaches or significant changes
in the risk profile occur, a risk manager must manually
analyze over ten thousand potential statistical sources,
in order to support portfolio managers and institutional
investors. In this paper we explored, how to extract
sup6. Discussion portive graph structures for this task based on real data
provided to us by a large asset management company.</p>
        <p>The applied research has shown that the use of AHC Our initial bottom up scheme to build such a causal
and network visualization helps risk managers to quickly graph based on public trade and portfolio data using
get an overview over hundreds of portfolios. The most ETE and PCMCI encountered a long run-time, and the
important insight from a practical perspective was the resulting network that is too complex for human
underability to see the efect of risk factors on portfolios across standing. The absence of a reliable evaluation scheme
traditional asset classes like equities, bonds, foreign cur- prevented the approach to be readily used in practice.
rencies, etc.. The traditional set-up of equity, multi-asset, The combination of AHC of portfolio risk profiles based
and bonds portfolios, including overlay strategies using on their VaR exposure ratio and the network
more complicated derivative strategies, did not allow a quick analysis of risk factors across asset classes and thus across different portfolio types. Focusing on the portfolio with the highest risk concentrations and the (theoretically unlikely) neighboring portfolios in its cluster allows risk managers to perform a more targeted root cause analysis before risk limit breaches occur. An analysis that would have taken over an hour can be carried out in a few minutes with the combination of AHC and network visualization.</p>
        <p>The combination of AHC analysis on the VaR exposure ratio and network visualization clearly supersedes the use of a causal graph in terms of practicality and technical requirements. Yet the method is not without shortcomings. The window size affects the clustering results. Although a sensitivity analysis can be performed and the results qualitatively evaluated, the window size cannot be systematically optimized, because we cannot quantitatively measure which clustering results are more reasonable in an exploratory analysis. Besides, the rolling window AHC is based only on the VaR exposure ratio,</p>
        <p>visualization of clustering results, is a more easily deployable method and presents a more practical solution. While not being able to identify the causal chains directly, it fulfills the demand for transparency and gives risk managers a better overview of numerous statistical sources beyond existing categorization schemes and analysis strategies. While we only present the results and experiences from a very preliminary proof of concept implementation, we believe that in the future the value of machine learning over manual analysis may increase if sources other than numerical trade data can be included in the analysis. Three challenges currently hinder practical application of causal discovery in risk management: scalability, visualization of a complex graph, and systematic evaluation of the resulting knowledge graph with minimal reliance on experts. We are currently investigating whether recent developments like the two-phase ensemble algorithm that enables PCMCI to better utilize distributed systems [22] may address scalability. The problem of overly complex visualization needs to be addressed with graph-reduction techniques, showing to risk managers only
subgraphs with unusual development. We plan to extend our work to use Causal NLP, a technique to learn causal graphs from text. It may be used to evaluate the causal graph derived from trade and economic data. Providing human-readable annotations might also solve problems of human oversight when dealing with a large data set.</p>
        <p>Acknowledgments</p>
        <p>This work was supported by the German Federal Ministry of Research (BMBF) as part of the Smart Data Innovation Challenge (01IS19030A).</p>
        <p>References</p>
        <p>[1] J. Pearl, Causality, 2 ed., Cambridge University Press, United Kingdom, 2009.</p>
        <p>[2] E. F. Fama, Efficient capital markets: A review of theory and empirical work, The Journal of Finance 25 (1970) 383–417. URL: http://www.jstor.org/stable/2325486.</p>
        <p>[3] C. W. J. Granger, Investigating causal relations by econometric models and cross-spectral methods, Econometrica 37 (1969) 424–438. URL: http://www.jstor.org/stable/1912791.</p>
        <p>[4] J. Peters, D. Janzing, B. Schölkopf, Elements of Causal Inference: Foundations and Learning Algorithms, The MIT Press, USA, 2017.</p>
        <p>[5] R. Scheines, An Introduction to Causal Inference, University of Notre Dame Press, Notre Dame, 1997, pp. 185–200.</p>
        <p>[6] R. Marschinski, H. Kantz, Analysing the information flow between financial time series: An improved estimator for transfer entropy, The European Physical Journal B 30 (2002) 275–281.</p>
        <p>[7] A. Levin, A. Tchernitser, Chapter 11 - multifactor stochastic variance models in risk management: Maximum entropy approach and Lévy processes, in: S. T. Rachev (Ed.), Handbook of Heavy Tailed Distributions in Finance, volume 1 of Handbooks in Finance, North-Holland, Amsterdam, 2003, pp. 443–480.</p>
        <p>[8] M. Beine, G. Capelle-Blancard, H. Raymond, International nonlinear causality between stock markets, The European Journal of Finance 14 (2008) 663–686.</p>
        <p>[9] M. Nauta, D. Bucur, C. Seifert, Causal discovery with attention-based convolutional neural networks, Machine Learning and Knowledge Extraction 1 (2019) 312–340.</p>
        <p>[10] S. Behrendt, T. Dimpfl, F. J. Peter, D. J. Zimmermann, RTransferEntropy – quantifying information flow between different time series using effective transfer entropy, SoftwareX 10 (2019) 100265.</p>
        <p>[11] J. Runge, P. Nowack, M. Kretschmer, S. Flaxman, D. Sejdinovic, Detecting and quantifying causal associations in large nonlinear time series datasets, Science Advances 5 (2019) eaau4996.</p>
        <p>[12] D. B. Group, Deutsche Börse public dataset (DBG PDS), 2021. URL: https://github.com/Deutsche-Boerse/dbg-pds, original-date: 2017-08-24T10:22:58Z.</p>
        <p>[13] T. Schreiber, Measuring information transfer, Phys. Rev. Lett. 85 (2000) 461–464.</p>
        <p>[14] T. Dimpfl, F. J. Peter, Using transfer entropy to measure information flows between financial markets, Studies in Nonlinear Dynamics and Econometrics 17 (2013) 85–102.</p>
        <p>[15] P. Jizba, H. Kleinert, M. Shefaat, Rényi’s information transfer between financial time series, Physica A: Statistical Mechanics and its Applications 391 (2012) 2971–2989.</p>
        <p>[16] L. Junior, A. Mullokandov, D. Kenett, Dependency relations among international stock market indices, Journal of Risk and Financial Management 8 (2015) 227–265.</p>
        <p>[17] S. K. Stavroglou, K. Soramaki, K. Zuev, Causality networks of financial assets, The Journal of Network Theory in Finance 3 (2017) 17–67. doi:10.21314/JNTF.2017.029.</p>
        <p>[18] J. Runge, J. Heitzig, V. Petoukhov, J. Kurths, Escaping the curse of dimensionality in estimating multivariate transfer entropy, Physical Review Letters 108 (2012) 258701.</p>
        <p>[19] S. Aghabozorgi, A. Seyed Shirkhorshidi, T. Ying Wah, Time-series clustering – a decade review, Information Systems 53 (2015) 16–38.</p>
        <p>[20] J. Papenbrock, Asset Clusters and Asset Networks in Financial Risk Management and Portfolio Optimization, Ph.D. thesis, Karlsruhe Institute of Technology, Karlsruhe, Germany, 2011.</p>
        <p>[21] M. Tatusch, G. Klassen, M. Bravidor, S. Conrad, Show me your friends and I’ll tell you who you are. Finding anomalous time series by conspicuous cluster transitions, in: T. D. Le, K.-L. Ong, Y. Zhao, W. H. Jin, S. Wong, L. Liu, G. Williams (Eds.), Data Mining, volume 1127, Springer Singapore, Singapore, 2019, pp. 91–103.</p>
        <p>[22] P. Guo, Y. Huang, J. Wang, Scalable and flexible two-phase ensemble algorithms for causality discovery, Big Data Research 26 (2021) 100252.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>