<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Modeling and Generating Extreme Volumes of Financial Synthetic Time-Series Data with Knowledge Graphs</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Laurentiu</forename><surname>Vasiliu</surname></persName>
							<email>laurentiu.vasiliu@peracton.com</email>
							<affiliation key="aff0">
								<orgName type="department">Peracton Ltd. DHKN Galway Financial Services Centre</orgName>
								<address>
									<addrLine>Moneenageisha Rd</addrLine>
									<postCode>H91 V2R6</postCode>
									<settlement>Galway</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Haleh</forename><forename type="middle">S</forename><surname>Dizaji</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Institute of Information Technology</orgName>
								<orgName type="institution">University of Klagenfurt</orgName>
								<address>
									<addrLine>Universitätsstraße 65-67</addrLine>
									<postCode>A-9020</postCode>
									<settlement>Klagenfurt am Wörthersee</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Aaron</forename><surname>Eberhart</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">metaphacts GmbH</orgName>
								<address>
									<addrLine>36 Daimlerstraße</addrLine>
									<postCode>69190</postCode>
									<settlement>Walldorf</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dumitru</forename><surname>Roman</surname></persName>
							<email>dumitru.roman@sintef.no</email>
							<affiliation key="aff3">
								<orgName type="institution">SINTEF AS</orgName>
								<address>
									<addrLine>Forskningsveien 1</addrLine>
									<postCode>0373</postCode>
									<settlement>Oslo</settlement>
									<country key="NO">Norway</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Radu</forename><surname>Prodan</surname></persName>
							<email>radu.prodan@aau.at</email>
							<affiliation key="aff1">
								<orgName type="department">Institute of Information Technology</orgName>
								<orgName type="institution">University of Klagenfurt</orgName>
								<address>
									<addrLine>Universitätsstraße 65-67</addrLine>
									<postCode>A-9020</postCode>
									<settlement>Klagenfurt am Wörthersee</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Modeling and Generating Extreme Volumes of Financial Synthetic Time-Series Data with Knowledge Graphs</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">05DCC36E94AA0350DAB5494018D1E4FA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:46+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Knowledge graphs</term>
					<term>ontologies</term>
					<term>synthetic data</term>
					<term>financial time-series</term>
					<term>extreme data</term>
					<term>machine learning</term>
					<term>correlation analysis</term>
					<term>pattern recognition</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper outlines the approach and technology employed to model and generate extreme volumes of synthetic financial time-series data. We introduce the Graph-Massivizer project and its financial use case, focusing on green sustainable finance. One project objective is to create synthetic financial data in extreme volumes to facilitate advanced testing and simulations of investment and trading algorithms. We then provide an overview of the methodology, detailing the use of ontologies and knowledge graphs. Furthermore, we elaborate on modeling correlations between the time-series of different markets and on how these correlations can be combined with graph neural network models to generate financial data. Finally, we present the current implementation status and conclude with a discussion of future work.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the financial investment and trading domains, synthetic data (artificially generated datasets that mimic the characteristics of real-world financial time-series) has become a robust solution for quantitative analysis and back-testing. The demand for synthetic data has arisen from increasingly complex financial models and algorithms driven by data-demanding machine learning (ML) models. For these models, real historical time-series have multiple limitations, such as limited volume, high cost, incomplete data, or decreasing relevance the further back in time they reach. The core characteristic of synthetic data is its ability to capture the statistical properties of real-world markets while remaining completely artificial. This allows for intensive testing before financial models and algorithms are further validated on real-time financial data. The Graph-Massivizer project <ref type="bibr" target="#b0">[1]</ref> aims to develop a software platform consisting of independent yet integrated tools. In one of its use cases, this platform will generate synthetic data in extreme volumes, closely matching the quality and characteristics of historical data samples of stocks and commodities futures, with plans to expand to other securities such as ETFs, bonds, and options. At the core of the approach are knowledge graphs (KGs), chosen for their ability to capture, store, and represent historical financial time-series. All technologies used are designed around creating, processing, storing, and generating these KGs. KGs, which represent entities and their relationships, can be significantly enhanced by ontologies. Firstly, ontologies provide a shared vocabulary and semantic alignment, which is particularly useful when integrating data from different sources across various KGs. Secondly, KGs can leverage ontologies to perform inference and reasoning, allowing the discovery of new relationships within the graph. Thirdly, ontologies can enrich KGs by adding missing information, such as properties or classes that are not explicitly present but are implied by existing relationships.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Graph-Massivizer Project - The Financial Use Case</head><p>Green and sustainable finance. This use case <ref type="bibr" target="#b1">[2]</ref> aims to enhance algorithmic investment and trading capabilities in green-focused products and investment/trading styles by generating and utilizing extreme volumes of synthetic data for testing and training. In this respect, the Graph-Massivizer project seeks to overcome the limitations posed by financial market data providers (such as restricted data volume, reduced accessibility, and high costs) by enabling the rapid, semi-automated creation of realistic and affordable synthetic financial datasets that are unlimited in size and accessibility. It also aims to improve ML-based green investment and trading simulations by eliminating critical biases, such as prior knowledge, over-fitting, and indirect contamination, that result from current data scarcity. The approach first maps samples of historical financial data (stocks and commodities futures) to a financial massive graph (F-MG) through a time-series-to-graph transformation. Next, using a generative model, we create a synthetic financial massive graph (SF-MG). Finally, we generate synthetic financial data from the SF-MG by enforcing specific quality rules. To achieve this, the Graph-Massivizer platform is provided with 10 TB of historical data samples, with the primary goal (KPI 1) of generating between 1 and 5 PB of synthetic financial time-series data. Another goal (KPI 2) is to achieve 90% energy consumption accountability for synthetic data creation. We use this data to test and improve financial algorithms, and aim to achieve (KPI 3) a measurable return increase of 2-4% in the enhanced financial algorithms that use synthetic data. Additionally, we aim to achieve (KPI 4) an increase in the financial algorithms' alpha of 1-2% and a Sharpe ratio greater than 1.5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Graph-Massivizer toolkit</head><p>The Graph-Massivizer toolkit is an integrated platform composed of five tools (Graph-Inceptor, Graph-Scrutinizer, Graph-Optimizer, Graph-Greenifier, and Graph-Choreographer) that perform specific and unique functions for massive graph processing:</p><p>• Graph-Inceptor: realizes a massive graph for the system to use.</p><p>• Graph-Scrutinizer: provides analytic capabilities and probabilistic reasoning for insights.</p><p>• Graph-Optimizer: ensures that large graph operations are completed efficiently.</p><p>• Graph-Greenifier: evaluates the energy consumption of massive graph operations.</p><p>• Graph-Choreographer: allows serverless deployment to use resources on demand.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows the overall Graph-Massivizer architecture and how the five tools are interconnected. The diagram also shows the external components, such as the metaphactory platform, the graph database, and the hardware and infrastructure used by the toolkit.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Challenges in Modeling Financial Data</head><p>Modeling financial data presents several challenges due to financial markets' dynamic nature and complexities. In addition to numerous variables, volatility clustering, fat tails, and noise, which are all specific to financial data, we focus particularly on five aspects to generate synthetic data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>KG relations' extraction and enrichment</head><p>In large-scale financial data, extracting relationships among various data types can be complex and non-intuitive, often requiring inference methods to identify them. We aim to enhance the quality of KG relation extraction by utilizing ontologies and reasoning methods to identify and extract non-obvious relationships.</p><p>Heterogeneous time-series data. Financial data consists of different types of time-series data with different semantics, domains, and dynamics. We can mitigate this diversity by using ontologies that describe the relationships among these data types and by finding correlations among them.</p><p>Changing statistical properties. Financial time-series often display statistical properties that change over time, such as means, variances, and covariances. These properties are used for measuring an asset's performance and risk. These statistical patterns must be identified and replicated in the synthetic data to model the original data accurately.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Quality assessment of generated data</head><p>The evaluation of synthetic data is an area of ongoing research; <ref type="bibr" target="#b2">[3]</ref>, <ref type="bibr" target="#b3">[4]</ref>, and <ref type="bibr" target="#b4">[5]</ref> review several evaluation methods for financial and other synthetic time-series. The method of <ref type="bibr" target="#b2">[3]</ref> applies various qualitative, quantitative, and predictive methods. These methods consist of statistical models and distances, such as various distribution divergence metrics, the Kolmogorov-Smirnov test, correlation analysis between real and synthetic data, and the non-parametric maximum mean discrepancy (MMD). Other methods train ML models on synthetic data and evaluate them on real data. Additionally, we can evaluate specific quantities such as Value at Risk (VaR). These methods differ depending on the use case, and we aim to select appropriate metrics.</p></div>
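<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of one of the distribution-based checks named above, the following minimal sketch computes the two-sample Kolmogorov-Smirnov statistic between a real and a synthetic sample (for example, daily returns). The function name ks_statistic and the choice of returns as input are assumptions made for this example; they are not part of the Graph-Massivizer tooling, and a production evaluation would more likely rely on an established statistics library.</p>
<p><code lang="python"><![CDATA[
```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical empirical distributions, 1.0 means
    the samples do not overlap at all.
    """
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    a = sorted(real)
    b = sorted(synthetic)
    d = 0.0
    # The maximum gap between two step functions is attained at one of
    # the observed data points, so it suffices to check those.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```
]]></code></p>
<p>A small statistic suggests that the synthetic sample reproduces the marginal distribution of the real one; it says nothing about temporal structure, which is why it would be combined with the correlation-based and predictive checks mentioned above.</p></div>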
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Using Ontologies and Knowledge Graphs</head><p>KGs and ontologies allow scientists and domain experts to model complex relations between data in a logically structured and machine-readable format. This capability allows ontologies to connect diverse sources of information, such as the use case presented here and similar related data.</p><p>In the Graph-Massivizer project, ontologies represent and integrate data from diverse use cases. The metaphactory platform was chosen to manage ontologies and integrate data with a front-end interface. Metaphactory has many applications for developing and managing ontologies, KGs, and other related semantic artifacts <ref type="bibr" target="#b5">[6]</ref>. With metaphactory, users can create and interact with ontologies, and integrate and use data that aligns with the ontology.</p><p>By decoupling the data from its schema, the ontology allows developers to model and prepare for handling massive amounts of data in an abstract way. For instance, a user or developer can write queries that inspect only the data of interest inside the large data set. While such queries are not themselves a direct algorithmic optimization, they play a critical role in making scalability possible: they identify the semantic information and metadata that can reduce a very large volume of data to something more tractable.</p><p>In this section, we describe the data represented by the ontology and then show the ontology that schematizes this data for integration with a KG.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Ontology data</head><p>This use case initially focuses on two types of financial products: stocks and commodities futures. However, this paper concentrates on one financial product: stocks. The financial ontology for stocks consists of four main types of financial data.</p><p>Fundamental data. Fundamental data <ref type="bibr" target="#b6">[7]</ref> indicators represent accounting data related to a company and its particular industry. These indicators have a low update frequency (quarterly on average, or yearly) <ref type="bibr" target="#b7">[8]</ref> and provide long-term insights into a company's valuation and price evolution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Technical data</head><p>In contrast with fundamental data, technical data [9] has a very high update frequency (tick/second/minute, etc.), offering short-term insights into stock price movements. This data includes fine-grained historical stock price information in the form of Open, High, Low, Close, and Volume (OHLCV). Numerous additional statistics can be derived from this data for financial modeling and prediction, particularly for intra-day trading.</p><p>ESG data. Environmental, social, and governance (ESG) <ref type="bibr" target="#b8">[10]</ref> data measures companies based on various responsibility metrics, including environmental, social, and governance criteria. By considering these criteria in their investments, investors encourage responsible corporate behavior and avoid investing in companies with risky or unethical practices.</p><p>Sentiment data. Market sentiment data <ref type="bibr" target="#b9">[11]</ref> reflects investors' attitudes toward a company, sector, or financial market. Various indicators derived from statistical technical analysis, social media, or alternative data sources can be used to measure market sentiment.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Ontology diagram</head><p>The financial ontology overview in Figure <ref type="figure" target="#fig_1">2</ref> shows the objects and data constituting the use case, namely historical and synthetic data and financial algorithms.</p><p>The financial algorithms run inside the PeractonSecuritiesPlatform class, which ingests the SyntheticStocksData and SyntheticCommoditiesData generated by the Graph-MassivizerPlatform class.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Synthetic ontology diagram</head><p>The synthetic financial data ontology in Figure <ref type="figure" target="#fig_2">3</ref> shows the created categories and mirrors the original historical financial data set structure. The SyntheticSymbol belongs to SyntheticCommoditiesData and SyntheticStocks classes, with the TechnicalData and FundamentalData classes as features. It also shows the SyntheticFinancialData class with SyntheticCommoditiesMultiverse and SyntheticStocksMultiverse subclasses, with further SyntheticCommoditiesData and SyntheticStocksData subclasses.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Related Work</head><p>We briefly review the literature on applying KGs in financial data analysis, then elaborate on other financial data modeling and generation methods.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Ontology in financial data analysis</head><p>Several methods leverage ontologies and KGs in financial analysis, for tasks such as KG extraction and enrichment, querying and reasoning over KGs, extracting correlations, and modeling financial time-series.</p><p>[8] studies the effect of considering fundamental and technical data in ML models for stock price prediction, showing that models using both kinds of indicators outperform models that consider either alone. The method of <ref type="bibr" target="#b10">[12]</ref> proposes KG extraction, enrichment, and querying methods. <ref type="bibr" target="#b11">[13]</ref> derives a high-quality financial KG from a given ontology using a semi-automated method and utilizes this KG in reasoning, stock prediction, and generation with two neural network models, a multi-layer perceptron and a long short-term memory (LSTM) network. <ref type="bibr" target="#b12">[14]</ref> provides an ontology-based correlation extraction between different companies. It derives the network of companies by assessing time-series and uses the node2vec and k-nearest neighbor (kNN) methods to embed and cluster the extracted nodes. <ref type="bibr" target="#b13">[15]</ref> proposes a joint graph learning and prediction model on time-series data. It uses KGs and graph neural networks (GNNs) to derive the correlation among different time-series. The method of <ref type="bibr" target="#b14">[16]</ref> uses KGs to find first- and second-order relationships among companies. It applies an LSTM for the time-series embedding of each node and a temporal graph convolutional network (GCN) to incorporate the varying neighborhood effect. The method in <ref type="bibr" target="#b15">[17]</ref> formulates stock prediction as a stochastic optimization problem and introduces genetic programming with a generalized crowding method that uses a financial KG to predict prices.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Financial time-series modeling</head><p>Statistical models. Various linear statistical models, such as autoregressive models, exist for stock prices; however, they cannot capture the complex non-linear structure of these data <ref type="bibr" target="#b16">[18]</ref>.</p><p>Event-based models. These methods treat time-series data as event data, defining events as notable changes of the time-series in continuous time. The methods of <ref type="bibr" target="#b17">[19]</ref> and <ref type="bibr" target="#b18">[20]</ref> construct correlation (influence) graphs of time-series utilizing the Hawkes process. <ref type="bibr" target="#b18">[20]</ref> defines events as long-lasting volatility values. It uses an attention layer to weigh and capture neighborhoods and an LSTM to predict the next price. The method in <ref type="bibr" target="#b19">[21]</ref> uses an event graph to tackle high dimensionality. It formulates the intensity function using GNNs with an attention layer to dynamically embed node features and recurrent neural networks (RNNs) to embed event sequences. Graphical event models are another type of marked point process that utilizes graphical information to construct the event graph <ref type="bibr" target="#b19">[21]</ref>.</p><p>Pattern recognition. These methods apply pattern-matching techniques to predict trends in time-series, including perceptually important points, template matching, and the dynamic time warping algorithm. The survey <ref type="bibr" target="#b20">[22]</ref> provides a detailed review of these methods.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ML models</head><p>The review in <ref type="bibr" target="#b20">[22]</ref> categorizes these models into supervised and unsupervised models. Supervised learning methods use various ML models, such as support vector machines, random forests, AdaBoost, kNN, and eXtreme gradient boosting, for stock prediction. The unsupervised learning methods include clustering methods that help find correlations among markets <ref type="bibr" target="#b20">[22]</ref>.</p><p>Recently, several studies have used deep learning models for modeling financial time-series data. These models include convolutional neural networks, RNNs, attention mechanisms, and generative adversarial networks (GANs), which are capable of capturing non-linear and complex data features. The survey <ref type="bibr" target="#b21">[23]</ref> provides a comprehensive review of these models.</p><p>GANs are powerful data generation models that are suitable for time-series and have been adapted for event data generation. The method in <ref type="bibr" target="#b22">[24]</ref> proposes a marked event data generation method that uses separate generators and discriminators for each event type and preserves type correlations using a central discriminator. This method provides synthetic data for downstream tasks. The GAN model of <ref type="bibr" target="#b16">[18]</ref> constructs the correlation graph among stocks using different correlation analysis methods and uses GCNs to encode interdependent time-series data. The method in <ref type="bibr" target="#b2">[3]</ref> captures correlation among stocks and applies three generative models, including a GAN. The survey <ref type="bibr" target="#b3">[4]</ref> presents more models of this category.</p><p>Another category of methods uses natural language processing (NLP) to benefit from financial text data such as financial news and SEC filings. These methods use N-grams and word2vec embeddings as inputs for prediction models. The survey <ref type="bibr" target="#b20">[22]</ref> and the paper <ref type="bibr" target="#b14">[16]</ref> introduce methods of this category.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Correlation Analysis among Financial Data</head><p>Financial data can show various correlations across financial products, companies, and markets. These correlations may be based on deeper links between companies and industries or simply random, lacking any underlying economic or financial rationale. We aim to identify the relevant and meaningful correlations within historical financial time-series and use these insights to generate synthetic data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Point process approach</head><p>Point process models are stochastic processes that successfully model event sequences <ref type="bibr" target="#b23">[25]</ref>. They vary depending on the definition of the conditional intensity function, representing the expected number of events in a small time interval given the event history. We consider a multi-dimensional Hawkes (self-exciting) process <ref type="bibr" target="#b24">[26]</ref> to model dependencies between various time-series of markets and obtain the influence graph. We convert financial time-series data to event sequences by defining events as relatively significant changes in time-series.</p></div>
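<div xmlns="http://www.tei-c.org/ns/1.0"><p>The conversion of a time-series into an event sequence can be sketched as follows. This is a minimal illustration only: the function name to_events, the 2% relative-change threshold, and the two event types "up" and "down" are assumptions chosen for the example, not the event definition used in the project.</p>
<p><code lang="python"><![CDATA[
```python
def to_events(prices, threshold=0.02):
    """Turn a price series into a list of (index, event_type) pairs.

    An event is emitted whenever the relative change between two
    consecutive observations reaches the threshold in absolute value.
    The threshold and the "up"/"down" typing are illustrative choices.
    """
    events = []
    for i in range(1, len(prices)):
        prev = prices[i - 1]
        if prev == 0:
            continue  # relative change is undefined for a zero price
        change = (prices[i] - prev) / prev
        if change >= threshold:
            events.append((i, "up"))
        elif change <= -threshold:
            events.append((i, "down"))
    return events
```
]]></code></p>
<p>Each stock (or each stock and direction) can then be treated as one dimension of the multi-dimensional point process, with the indices replaced by timestamps.</p></div>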
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Self-exciting process</head><p>This process represents the triggering effect of the event history on the intensity and occurrence of future events, usually with an exponential excitation kernel (Eq. 1) <ref type="bibr" target="#b24">[26]</ref>:</p><formula xml:id="formula_0">\lambda_k(t) = \mu_k + \sum_{i:\, t_i &lt; t} a_{k_i,k} \cdot e^{-\beta \cdot (t - t_i)},<label>(1)</label></formula><p>where $\lambda_k(t)$ is the intensity of event type $k$ at time $t$, $\mu_k$ is the base intensity for event type $k$, $\beta$ is the excitation decay rate, and $a_{k_i,k}$ is the influence parameter between event types $k_i$ and $k$.</p><p>We infer the model parameters ($\mu$ and the influence matrix $A = (a_{i,j})$) by maximizing the log-likelihood of events given in Eq. 2 using expectation-maximization or stochastic gradient descent:</p><formula xml:id="formula_1">\mathcal{L} = \sum_{i:\, t_i &lt; T} \log \lambda_{k_i}(t_i) - \int_0^T \lambda(t)\, dt,<label>(2)</label></formula><p>where $[0, T]$ is the time interval of events we consider for correlation analysis, and $\lambda(t)$ is the sum of the intensities of all event types.</p><p>Stocks' correlation analysis within a given exchange (market). We consider different point processes for the time-series of stocks and obtain the correlation graph among them by analyzing event data and inferring the influence matrix of the multi-dimensional process <ref type="bibr" target="#b25">[27]</ref>. This matrix reveals the weighted dependency (influence) graph among the stocks of a market. Because dependencies among companies change continually, we can update this graph over time and derive a dynamic dependency graph. As a remedy for high-dimensional market data, which decreases the accuracy of these models, we can derive an initial dependency graph by processing various textual data using NLP techniques.</p><p>Correlation analysis of internal stock data. Despite the expressiveness of point process models, they are inaccurate in high-dimensional spaces. In the correlation analysis of internal stock data (fundamental and technical), we can mitigate this problem by leveraging KGs and considering only relations that appear in the KG.</p></div>
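<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make Eq. 1 and Eq. 2 concrete, the following minimal sketch evaluates the intensity and the log-likelihood of a multi-dimensional Hawkes process with an exponential kernel. The function names intensity and log_likelihood, the representation of events as (time, type) pairs, and the use of a closed-form integral are assumptions made for illustration; the closed form follows from integrating the exponential kernel of Eq. 1 over the observation window. Parameter inference (expectation-maximization or gradient descent) is not shown.</p>
<p><code lang="python"><![CDATA[
```python
import math


def intensity(k, t, events, mu, a, beta):
    """Eq. 1: intensity of event type k at time t.

    events is a list of (t_i, k_i) pairs, mu[k] is the base intensity,
    a[k_i][k] is the influence of type k_i on type k, and beta is the
    excitation decay rate.
    """
    lam = mu[k]
    for t_i, k_i in events:
        if t_i < t:  # only strictly earlier events excite the process
            lam += a[k_i][k] * math.exp(-beta * (t - t_i))
    return lam


def log_likelihood(events, T, mu, a, beta):
    """Eq. 2: log-likelihood of the events observed on [0, T]."""
    observed = [(t_i, k_i) for t_i, k_i in events if t_i < T]
    # First term: log-intensity of each event's own type at its time.
    ll = sum(math.log(intensity(k_i, t_i, observed, mu, a, beta))
             for t_i, k_i in observed)
    # Second term: integral of the total intensity over [0, T].
    # The base rates contribute sum(mu) * T; each event contributes the
    # integral of its decaying kernel, summed over all target types.
    integral = sum(mu) * T
    for t_i, k_i in observed:
        integral += sum(a[k_i]) * (1.0 - math.exp(-beta * (T - t_i))) / beta
    return ll - integral
```
]]></code></p>
<p>In this sketch, a[k_i][k] plays the role of the influence matrix entries, so the non-zero entries of the fitted matrix would define the edges of the dependency graph discussed above.</p></div>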
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Formulating intensity function using neural networks</head><p>In addition to high dimensionality, the assumption of a fixed parametric intensity function, such as the formulation in Eq. 1 with its predefined triggering effect (a positive and additive effect of history), might not apply to every real scenario with intricate dependencies. Therefore, several methods use neural networks, such as the LSTMs in <ref type="bibr" target="#b26">[28]</ref>, to formulate the intensity function of point process models, which allows them to capture variable and complex dependencies and to adjust the model to the specific use case. Additionally, the attention mechanism <ref type="bibr" target="#b27">[29]</ref>, which can explicitly model the influence of event types, helps in finding the correlation graph.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Spurious correlations</head><p>Not every correlation represents a causal relationship; a correlation without a causal basis is known as a spurious correlation <ref type="bibr" target="#b28">[30]</ref>. Such a correlation can be a random effect or be caused by other hidden variables <ref type="bibr" target="#b28">[30]</ref>.</p><p>In particular, deep learning models tend to learn spurious correlations when the training data is not sufficiently diverse <ref type="bibr" target="#b29">[31]</ref>. To mitigate this misinterpretation, we will apply the following methods to test the correlations:</p><p>• Considering, as far as possible, other stock market factors in the correlation analysis in addition to stocks.</p><p>• Considering long-term correlations among time-series, or comparing correlations over the long term, to test whether they are random.</p><p>• Applying null-hypothesis testing to find significant p-values, such as the method of <ref type="bibr" target="#b30">[32]</ref>, which constructs a spurious relationship graph among time-series of stocks using the Granger causality test to evaluate causality between time-series and a t-test to estimate the p-value.</p><p>We aim to prune the initial dependency graph and reduce the dimensionality of point process models by detecting spurious correlations and deriving a more meaningful model based on causal dependencies.</p></div>
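<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of testing a correlation against a null hypothesis can be illustrated with a simple permutation test on the Pearson correlation of two series. This is a deliberately simpler stand-in for the Granger causality and t-test procedure cited above, not a reimplementation of it; the function names pearson and permutation_p_value, the number of permutations, and the fixed seed are assumptions for this sketch. Shuffling assumes that observations are exchangeable, which holds only approximately for returns and not for raw prices.</p>
<p><code lang="python"><![CDATA[
```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Estimate how often a shuffled y correlates with x at least as
    strongly (in absolute value) as the observed y does."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```
]]></code></p>
<p>Edges of the dependency graph whose p-value stays above a chosen significance level would be candidates for pruning; when many edges are tested, a multiple-testing correction would also be needed.</p></div>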
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Modeling Synthetic Financial Time-Series</head><p>An important input for modeling synthetic time-series is a correlation graph among stocks, as presented in Figure <ref type="figure" target="#fig_4">4</ref>. The temporal and dependency features of stocks are embedded using this correlation graph. These features serve as inputs for ML models that generate time-series data. In the following, we propose two types of such models.</p><p>Improving event-based models using GNNs and LSTM. The correlation graph can be used directly for generating time-series. However, to enhance the initial point process models for modeling time-series (which contain more detail than event sequences) and to improve the graph weights, we will add a GNN with attention layers <ref type="bibr" target="#b31">[33]</ref>. This model can simultaneously encode the time-series and the correlation graph structure and obtain the graph weights <ref type="bibr" target="#b18">[20]</ref> (Figure <ref type="figure" target="#fig_4">4</ref>). Then, we will model the synthetic time-series data using an LSTM given the encoded data <ref type="bibr" target="#b18">[20]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Generating synthetic time-series using correlation graphs</head><p>The other method combines the correlation graph and GANs to generate interrelated time-series data. We will apply this graph to embed time-series data, capturing their relationships and providing inputs for the GAN model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Implementation Considerations</head><p>Generating extreme volumes of synthetic financial time-series data has unique challenges and requirements from both software and hardware perspectives. Here, we outline the most relevant ones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Scalable data generation</head><p>The five tools of the Graph-Massivizer platform are being implemented to scale, so that synthetic time-series data can be generated across a distributed system. One strategy employed is task-level parallelism, which divides the data generation process into small chunks that are processed simultaneously across different nodes in the HPC cluster. As a computing framework, Apache Spark provides libraries and functionalities for parallel processing and data management on large clusters.</p></div>
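<div xmlns="http://www.tei-c.org/ns/1.0"><p>The chunking idea behind task-level parallelism can be sketched on a single machine as follows. This example uses Python's standard thread pool purely as a stand-in for a cluster scheduler such as Apache Spark, and a toy random walk as a stand-in for the actual generative model; the function names generate_chunk and generate_all, the seeding scheme, and all numeric values are assumptions for illustration. Threads do not speed up CPU-bound Python code, so the sketch shows only the decomposition, not the performance.</p>
<p><code lang="python"><![CDATA[
```python
import random
from concurrent.futures import ThreadPoolExecutor


def generate_chunk(chunk_id, length, base_seed=42):
    """Generate one independent chunk of a toy random-walk price series.

    Each chunk derives its own seed from its id, so the output does not
    depend on which worker runs it or in which order chunks finish.
    """
    rng = random.Random(base_seed * 1_000_003 + chunk_id)
    price = 100.0
    series = []
    for _ in range(length):
        price *= 1.0 + rng.gauss(0.0, 0.01)  # toy 1% volatility per step
        series.append(price)
    return chunk_id, series


def generate_all(n_chunks, length, workers=4):
    """Run all chunk tasks on a worker pool and collect them by chunk id."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda cid: generate_chunk(cid, length),
                           range(n_chunks))
        return dict(results)
```
]]></code></p>
<p>Deriving the seed from the chunk identifier keeps generation reproducible regardless of the number of workers, which matters when a failed chunk has to be regenerated on a different node.</p></div>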
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Data storage and streaming</head><p>Managing and storing the produced synthetic data at the petabyte level requires various solutions, such as compression, third-party storage, and transferring data only when necessary. Ideally, the synthetic data should remain within the HPC cluster, with financial simulations and testing conducted in the same environment to minimize data movement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Parallel processing of data generation and large memory capacity</head><p>We use CINECA's Leonardo pre-exascale supercomputer <ref type="bibr" target="#b32">[34]</ref>, which allows synthetic data generation to be processed in parallel. A low-latency, high-bandwidth network connects the compute nodes within the HPC cluster to facilitate data exchange.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Energy consumption monitoring</head><p>The Graph-Massivizer platform has a dedicated tool called 'Graph-Greenifier' to monitor and analyze the energy consumed in generating the synthetic time-series data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Data security</head><p>Even though the underlying historical financial time-series data does not contain personally identifiable information, the synthetic data derived from it can still be commercially sensitive. Therefore, security measures are necessary to ensure data confidentiality and data integrity: secure coding practices, strict access control, encryption of the synthetic data at rest and in transit, digital signatures on the synthetic data to ensure its authenticity and prevent tampering, and log monitoring.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Cost optimization</head><p>Finally, cost optimization in generating synthetic data is critical to the entire process. It involves balancing hardware costs, licensing fees, and ongoing maintenance expenses against the desired performance and scalability. The goal is to offer a more cost-effective solution than real historical financial data while remaining competitively priced.</p></div>
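<div xmlns="http://www.tei-c.org/ns/1.0"><p>The tamper-detection measure mentioned under data security can be illustrated with a short Python sketch. Note the substitution: it uses an HMAC-SHA256 tag, which is a symmetric message authentication code from the standard library, not an asymmetric digital signature. It shows the integrity check only; a signature scheme would additionally let third parties verify authenticity without holding the secret key. The key, the sample chunk, and the function names are hypothetical.</p>

```python
import hashlib
import hmac


def sign_chunk(data, key):
    """Return a hex HMAC-SHA256 tag for one serialized data chunk."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_chunk(data, key, tag):
    """Check the tag in constant time; False means the chunk was altered."""
    return hmac.compare_digest(sign_chunk(data, key), tag)


# Hypothetical key for illustration; real keys belong in a secret store.
key = b"demo-key-do-not-use-in-production"
chunk = b"2024-01-02,AAA,101.25\n2024-01-03,AAA,100.90\n"
tag = sign_chunk(chunk, key)
```

<p>Storing the tag next to each chunk lets a consumer call verify_chunk before using the data, so that any modification made at rest or in transit is detected.</p></div>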
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Conclusions</head><p>The work presented in this paper, carried out as part of the Graph-Massivizer EU project, is currently at the halfway point and demonstrates the initial proof-of-concept developments for generating synthetic data in extreme volumes. The mechanisms and approaches identified here will be further implemented as a robust workflow within the five tools of the Graph-Massivizer platform. The focus will be on software implementation, integration of the five tools, and identification of relevant correlations in historical time-series to enhance the quality of the generated synthetic data. Once synthetic data samples have been generated, they will be tested using Peracton's back-testing engine with various investment and trading algorithms. The behavior of these algorithms will be analyzed and compared with their performance on real historical data to assess differences and similarities. The resulting feedback will be provided to the Graph-Massivizer platform to further fine-tune the quality of the synthetic data.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Graph-Massivizer architecture.</figDesc><graphic coords="3,114.44,65.59,366.39,631.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Financial ontology diagram.</figDesc><graphic coords="5,72.00,65.60,451.28,161.29" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Synthetic financial ontology diagram.</figDesc><graphic coords="6,72.00,65.61,451.27,307.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Multi-dimensional synthetic time-series model.</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">More information available at: https://graph-massivizer.eu/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>The Graph-Massivizer project has received funding from the European Union's Horizon Research and Innovation Actions under Grant Agreement Nº 101093202. <ref type="bibr" target="#b0">1</ref> </p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">U H R</forename></persName>
		</author>
		<ptr target="https://graph-massivizer.eu/" />
		<title level="m">Innovation Actions, Graph massivizer</title>
				<editor>
			<persName><forename type="first">G</forename><forename type="middle">A N</forename></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Graph-Massivizer EU Project</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">U H R</forename></persName>
		</author>
		<ptr target="https://graph-massivizer.eu/project/green-and-sustainable-finance/" />
		<title level="m">Innovation Actions, Use case 1 green-finance</title>
				<editor>
			<persName><forename type="first">G</forename><forename type="middle">A N</forename></persName>
		</editor>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Graph-Massivizer EU Project</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Generation of realistic synthetic financial time-series</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dogariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L.-D</forename><surname>Ştefan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">A</forename><surname>Boteanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lamba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ionescu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="1" to="27" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Generating synthetic data in finance: opportunities, challenges and pitfalls</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Assefa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dervovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mahfouz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Tillman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Reddy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Veloso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First ACM International Conference on AI in Finance</title>
				<meeting>the First ACM International Conference on AI in Finance</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Evaluation is key: a survey on evaluation measures for synthetic time series</title>
		<author>
			<persName><forename type="first">M</forename><surname>Stenger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Leppich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Foster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kounev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bauer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Big Data</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">66</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">metaphactory for massive graphs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Eberhart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Haase</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Schell</surname></persName>
		</author>
		<idno type="DOI">10.1145/3578245.3585330</idno>
		<ptr target="https://doi.org/10.1145/3578245.3585330" />
	</analytic>
	<monogr>
		<title level="m">Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, ICPE 2023</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Vieira</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Cardellini</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Marco</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Tuma</surname></persName>
		</editor>
		<meeting><address><addrLine>Coimbra, Portugal</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2023">April 15-19, 2023</date>
			<biblScope unit="page" from="215" to="220" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Fundamental data</title>
		<ptr target="https://www.investopedia.com/terms/f/fundamentalanalysis.asp" />
		<imprint>
			<date type="published" when="2024">2024</date>
			<publisher>Investopedia</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Comparing technical and fundamental indicators in stock price forecasting</title>
		<author>
			<persName><forename type="first">E</forename><surname>Beyaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Tekiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X.-J</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Keane</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 20th international conference on high performance computing and communications; IEEE 16th international conference on smart city; IEEE 4th international conference on data science and systems (HPCC/SmartCity/DSS), IEEE</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1607" to="1613" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<ptr target="https://www.investopedia.com/terms/e/environmental-social-and-governance-esg-criteria.asp" />
		<title level="m">Esg data</title>
				<imprint>
			<publisher>Investopedia</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Sentiment data</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Jeffriess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename></persName>
		</author>
		<ptr target="https://www.eightcap.com/labs/exploring-the-most-common-sentiment-indicators-on-tradingview/" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Financial knowledge graph based financial report query system</title>
		<author>
			<persName><forename type="first">S</forename><surname>Zehra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">F M</forename><surname>Mohsin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">I</forename><surname>Jami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Siddiqui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K.-U.-R R</forename><surname>Syed</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="69766" to="69782" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Finkg: A core financial knowledge graph for financial analysis</title>
		<author>
			<persName><forename type="first">N</forename><surname>Kertkeidkachorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Nararatwong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ichise</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 17th International Conference on Semantic Computing (ICSC), IEEE</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="90" to="93" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Ontology graph embeddings and ilp for financial forecasting</title>
		<author>
			<persName><forename type="first">C</forename><surname>Erten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kazakov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Inductive Logic Programming</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="111" to="124" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Knowledge graph guided simultaneous forecasting and network learning for multivariate financial time series</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ibrahim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mazumder</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third ACM International Conference on AI in Finance</title>
				<meeting>the Third ACM International Conference on AI in Finance</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="480" to="488" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Matsunaga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Suzumura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Takahashi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.10660</idno>
		<title level="m">Exploring graph neural networks for stock market predictions with rolling window analysis</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Stochastic optimization for market return prediction using financial knowledge graph</title>
		<author>
			<persName><forename type="first">X</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">J</forename><surname>Mengshoel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Big Knowledge (ICBK), IEEE</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="25" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Vgc-gan: A multi-graph convolution adversarial network for stock price prediction</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Dong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">236</biblScope>
			<biblScope unit="page">121204</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Etesami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kiyavash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Singhal</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1603.04319</idno>
		<title level="m">Learning network of multivariate hawkes processes: A time series approach</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Graph-based stock correlation and prediction for high-frequency trading systems</title>
		<author>
			<persName><forename type="first">T</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">122</biblScope>
			<biblScope unit="page">108209</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Learning multivariate hawkes process via graph recurrent neural network</title>
		<author>
			<persName><forename type="first">K</forename><surname>Yoon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Im</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Choi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Jeong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Park</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="5451" to="5462" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Stock market analysis: A review and taxonomy of prediction techniques</title>
		<author>
			<persName><forename type="first">D</forename><surname>Shah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Isah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Zulkernine</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Financial Studies</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page">26</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Applications of deep learning in stock market prediction: recent progress</title>
		<author>
			<persName><forename type="first">W</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">184</biblScope>
			<biblScope unit="page">115537</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Generating multivariate time series with common source coordinated gan (cosci-gan)</title>
		<author>
			<persName><forename type="first">A</forename><surname>Seyfi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-F</forename><surname>Rajotte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="32777" to="32788" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">An introduction to the theory of point processes: volume I: elementary theory and methods</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Daley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vere-Jones</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2003">2003</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Spectra of some self-exciting and mutually exciting point processes</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G</forename><surname>Hawkes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biometrika</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="page" from="83" to="90" />
			<date type="published" when="1971">1971</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">A nonparametric em algorithm for multiscale hawkes processes</title>
		<author>
			<persName><forename type="first">E</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mohler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of nonparametric statistics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="20" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">The neural hawkes process: A neurally self-modulating multivariate point process</title>
		<author>
			<persName><forename type="first">H</forename><surname>Mei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Eisner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Attentive neural point processes for event forecasting</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="7592" to="7600" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Spurious correlation: A causal interpretation</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Simon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American statistical Association</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="page" from="467" to="479" />
			<date type="published" when="1954">1954</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Discover and cure: Concept-aware mitigation of spurious correlation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yuksekgonul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="37765" to="37786" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Dynamic relationship identification for abnormality detection on financial time series</title>
		<author>
			<persName><forename type="first">G</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Jung</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition Letters</title>
		<imprint>
			<biblScope unit="volume">145</biblScope>
			<biblScope unit="page" from="194" to="199" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Graph attention networks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Velickovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Cucurull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Casanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">stat</title>
		<imprint>
			<biblScope unit="volume">1050</biblScope>
			<biblScope unit="page" from="10" to="48550" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<title level="m" type="main">High performance computing, leonardo pre-exascale supercomputer</title>
		<author>
			<persName><surname>Cineca</surname></persName>
		</author>
		<ptr target="https://leonardo-supercomputer.cineca.eu/" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
