<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">LLM-Driven Knowledge Enhancement for Securities Index Prediction</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Zaiyuan</forename><surname>Di</surname></persName>
							<email>dizaiyuan@tongji.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="institution">Tongji University</orgName>
								<address>
									<addrLine>No. 4800, Cao&apos;an highway, JiaDing District</addrLine>
									<postCode>201804</postCode>
									<settlement>Shanghai city, Shanghai</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jianting</forename><surname>Chen</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tongji University</orgName>
								<address>
									<addrLine>No. 4800, Cao&apos;an highway, JiaDing District</addrLine>
									<postCode>201804</postCode>
									<settlement>Shanghai city, Shanghai</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yunxiao</forename><surname>Yang</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tongji University</orgName>
								<address>
									<addrLine>No. 4800, Cao&apos;an highway, JiaDing District</addrLine>
									<postCode>201804</postCode>
									<settlement>Shanghai city, Shanghai</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ling</forename><surname>Ding</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tongji University</orgName>
								<address>
									<addrLine>No. 4800, Cao&apos;an highway, JiaDing District</addrLine>
									<postCode>201804</postCode>
									<settlement>Shanghai city, Shanghai</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yang</forename><surname>Xiang</surname></persName>
							<email>tjdxxiangyang@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Tongji University</orgName>
								<address>
									<addrLine>No. 4800, Cao&apos;an highway, JiaDing District</addrLine>
									<postCode>201804</postCode>
									<settlement>Shanghai city, Shanghai</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">LLM-Driven Knowledge Enhancement for Securities Index Prediction</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">AA65750E114847039A305CB587A30D40</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Stock Market Prediction</term>
					<term>Large Language Models</term>
					<term>Knowledge Enhancement</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The securities market carries complex financial interactions, which pose challenges to its prediction. To represent this complexity, researchers have utilized multi-source data, such as financial news and macro market indicators, for better performance. However, these efforts often ignore the internal knowledge among these data or suffer from the high cost of acquiring diverse knowledge. Thus, we propose an LLM-driven knowledge enhancement method for securities index prediction. Specifically, we collect the daily data of Shanghai Stock Exchange indexes and their related market indicators and model the internal knowledge among them as triplets. Then we leverage an LLM as a knowledge base to acquire diverse knowledge efficiently. Finally, we integrate the knowledge and numeric multi-source data into a heterogeneous graph and apply a GNN model to predict the trend of securities indexes. Experiments demonstrate the effectiveness of our method in prediction and a real-world backtest.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Securities prediction has always been a challenging but engaging task. Many studies <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref> have already proven the feasibility of predicting securities based on historical price data. These efforts have further made securities prediction a promising profit opportunity for investors and a classic time series prediction task with valuable research significance.</p><p>The securities market is a complex environment with various entities and events, and their interactions often take time to be fully reflected in price data. For example, management change information in an announcement will affect the company's stock price, and the crude oil price will affect the stock prices of airline companies. These facts inspire researchers to utilize external information for more accurate price prediction. Thus, multi-source data, such as sentiment information <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>, semantic information from news and posts <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11]</ref>, and information on companies related to the predicted stock <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref>, has been widely explored.</p><p>Substantial efforts have been made to leverage this multi-source data. Nevertheless, many treat these data in isolation, ignoring the role that inter-data relations play in improving prediction performance <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>, for example by representing all information as concatenated vectors <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17]</ref>. 
Compared with these methods, some work models the internal relations of the data and incorporates them into prediction using graph-based models, achieving good performance. They usually represent the relations as knowledge graphs <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref> or correlation matrices <ref type="bibr" target="#b17">[18]</ref>. However, these methods still face the following challenges: 1) Since knowledge acquisition from massive raw data is expensive, it is not always applicable <ref type="bibr" target="#b11">[12]</ref>. 2) Meanwhile, many rely on existing knowledge sources or correlation calculations, both of which suffer from a limited variety of relation and entity types <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b17">18]</ref>.</p><p>To overcome these challenges, we propose an economical and effective LLM-driven knowledge enhancement method for securities index prediction. First, we collect the daily data of Shanghai Stock Exchange indexes (SSE indexes) and their related market indicators, such as the Northbound Capital and the Shenzhen Composite Index, as multi-source data to overcome the hysteresis of price data. Second, we establish the relationships between them by leveraging a large language model (LLM) as a knowledge source, thus obtaining diverse knowledge inexpensively. Third, we integrate the collected numeric data and knowledge into a heterogeneous graph. Finally, we apply a heterogeneous graph-based neural network to learn the representation of SSE indexes and predict their price movement. The experimental results demonstrate the effectiveness of our method in prediction and a real-world backtest. 
In conclusion, our contributions are as follows:</p><p>1) We propose an LLM-driven knowledge enhancement method for securities index prediction, leveraging an LLM as an automated knowledge base to acquire diverse knowledge efficiently.</p><p>2) We implement a heterogeneous graph-based neural network to incorporate the numeric multi-source data and their knowledge, resulting in a clear improvement in the downstream prediction task.</p><p>3) We validate the effectiveness of our method in securities index trend prediction, covering 175 securities indexes and 9 years of data. Additionally, we verify the method's real-world profit capabilities through backtesting.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Deep learning in stock market prediction</head><p>Deep learning has been a promising approach to stock market prediction <ref type="bibr" target="#b19">[19]</ref>. Recurrent models, such as RNN <ref type="bibr" target="#b20">[20]</ref> and LSTM <ref type="bibr" target="#b21">[21]</ref>, are particularly prominent due to their ability to capture temporal information <ref type="bibr" target="#b19">[19,</ref><ref type="bibr" target="#b22">22]</ref>. In previous work, most methods are enhanced by incorporating multiple numerical and textual data sources, e.g., technical indicators <ref type="bibr" target="#b23">[23]</ref>, social text <ref type="bibr" target="#b24">[24]</ref>, and financial news <ref type="bibr" target="#b25">[25]</ref>. Nonetheless, these data often reflect the post-event status of the predicted object in isolation. Since events propagate between different entities in sequence, a lead-lag effect arises in these data <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b26">26]</ref>.</p><p>Instead of concatenating the multi-source data as input features, some work also leverages the internal relationships of the data. For instance, Cheng et al. <ref type="bibr" target="#b11">[12]</ref> integrate multi-modal information of multiple companies through a company knowledge graph. Matsunaga et al. <ref type="bibr" target="#b12">[13]</ref> fuse price data of companies through their commercial relationships. In these works, complex relationships are captured through knowledge, demonstrating the role of knowledge in enhancing securities prediction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Knowledge enhancement</head><p>Knowledge enhancement has become increasingly crucial in stock market prediction <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b27">27,</ref><ref type="bibr" target="#b28">28,</ref><ref type="bibr" target="#b29">29]</ref>. Generally, the challenge of knowledge enhancement lies in the acquisition and incorporation of knowledge. For acquisition, many efforts have been made to obtain knowledge from unstructured raw data <ref type="bibr" target="#b30">[30,</ref><ref type="bibr" target="#b31">31]</ref> and existing knowledge sources <ref type="bibr" target="#b32">[32,</ref><ref type="bibr" target="#b33">33,</ref><ref type="bibr" target="#b34">34,</ref><ref type="bibr" target="#b35">35]</ref> such as open-source knowledge graphs. For incorporation, it is common practice to concatenate unstructured knowledge (e.g., semantic information) and historical price data into vectors <ref type="bibr" target="#b36">[36,</ref><ref type="bibr" target="#b37">37]</ref>, making recurrent models widely used. Besides, graph-based models are an intuitive way to incorporate structural knowledge, such as triplets, into predictions. Among these methods, both homogeneous <ref type="bibr" target="#b38">[38]</ref> and heterogeneous <ref type="bibr" target="#b39">[39,</ref><ref type="bibr" target="#b40">40]</ref> graph-based neural networks are widely used.</p><p>However, existing methods are still confronted with high acquisition costs and limited knowledge diversity. Recently, large language models have demonstrated their potential for knowledge acquisition, engaging many researchers in information extraction <ref type="bibr" target="#b41">[41,</ref><ref type="bibr" target="#b42">42,</ref><ref type="bibr" target="#b43">43]</ref>. 
In this paper, we go further by reducing the cost of knowledge acquisition with a large language model, and by integrating the semantic and structural information of knowledge and numerical data through a graph-based model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Method</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Problem description</head><p>Our objective is to predict the daily trend of SSE indexes, framing it as a binary classification task. Assuming today is day 𝑡, we use the average closing price over the past 𝛼 days as the baseline. If the closing price on day 𝛽 in the future exceeds the baseline, it is considered a rising trend and labeled as a positive sample. The formula for setting the labels is as follows:</p><formula xml:id="formula_0">𝑙 𝑡 = {︃ 1, 𝑐 𝑡+𝛽 &gt; 𝜆 𝛼 ∑︀ 𝛼−1 𝑖=0 𝑐 𝑡−𝑖 0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒<label>(1)</label></formula><p>where 𝑐 𝑡 represents the closing price on day 𝑡, and 𝜆 ≥ 1 is a constant that determines the threshold for classifying the trend. Using the average as a baseline avoids the impact of short-term market fluctuations. 𝛽 represents the prediction horizon.</p><p>The data used for trend prediction comes from multiple sources. In addition to the SSE index trading data 𝐷 𝑆𝑆𝐸 , it also includes other data sources 𝐷 1 , 𝐷 2 , • • • , 𝐷 𝑟 . These data sources have varying dimensions and granularity. In subsequent sections, we will introduce how to organize these data sources into input data for the prediction task. </p></div>
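As a concrete illustration, the labeling rule of Eq. (1) can be sketched in Python. This is a minimal sketch under our own interface assumptions: the function name `label_trend` and the plain-list price series are illustrative, not from the paper.

```python
def label_trend(closes, t, alpha=5, beta=2, lam=1.01):
    """Eq. (1): label day t as a rising trend (1) if the close on day
    t+beta exceeds lam times the average close over the past alpha days
    (days t-alpha+1 .. t), otherwise 0."""
    baseline = sum(closes[t - alpha + 1 : t + 1]) / alpha
    return 1 if closes[t + beta] > lam * baseline else 0
```

With 𝛼 = 5, 𝛽 = 2, and 𝜆 = 1.01 (the values used in Section 4.1), the close two days ahead must exceed the 5-day average by 1% for the sample to be labeled positive.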
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">LLM-driven knowledge enhancement</head><p>Generally, the data used for SSE index prediction comes from two sources: SSE index trading data and other market indicators, as shown in Figure <ref type="figure" target="#fig_0">1</ref>. Trading data has a direct relationship with the SSE indexes, but the available historical data often exhibits lagging characteristics. On the other hand, market indicators refer to other economic market indicators related to the SSE indexes. These indicators can reflect more environmental factors and provide guidance for predicting the trends of SSE indexes. Traditional methods typically fuse these two types of data by simply appending market indicators to the daily trading data, thereby expanding the input feature dimensions. However, this overlooks the semantic information and connections behind the market indicators, which are crucial for improving prediction performance.</p><p>We propose a knowledge enhancement method based on Large Language Models (LLMs), leveraging LLMs to automatically embed knowledge into market indicator data. We argue that the knowledge of market indicators is reflected in the relationships between them. For example, there is a connection between "Southbound Capital (SC)" and the "Hang Seng Index (HSI)". SC reflects the capital flowing from mainland China into Hong Kong, while HSI is an index of the Hong Kong stock market influenced by SC. To characterize the relationships between market indicators, we adopt an approach that establishes relational paths formed by multiple triplet links. 
In the example above, we take SC and HSI as nodes and form a triplet link based on their relationship: [(SC, index status, Capital from Mainland), (Capital from Mainland, participate in, Hong Kong market), (HSI, index status, Hong Kong market)].</p><p>To extract relationships among numerous market indicators, we leverage an LLM to automatically uncover the connections between market indicators. The specific process is illustrated in the following steps: Step 1: Instruction Construction. We construct instructions with known nodes for input into the LLM. These instructions involve the task description, ontology constraints, output format, examples, and other relevant information. The instructions must enable the LLM to fully understand our task of extracting relationships between market indicators.</p><p>Step 2: LLM Interaction. We input the constructed instructions into the LLM and obtain the feedback results. We then validate these results and extract triplets from the successful outcomes.</p><p>Step 3: Node Update. After saving the newly generated triplets, we update the known node pool with the newly generated nodes. We repeat Step 1 until the maximum number of iterations is reached.</p><p>Step 4: Path Identification. Once the maximum number of iterations is reached, we identify paths between any two market indicator nodes as start and end points using the triplet set. Irrelevant triplets that are not on the identified paths are removed.</p><p>The iterative process generates intermediate nodes for market indicators, fully utilizing the knowledge base and reasoning capabilities of the LLM to uncover complex multi-hop and cross paths. After this process, multiple market indicators, including the SSE indexes, form a connected heterogeneous graph 𝐺 = (𝑉, 𝐸). The node attribute values reflect the specific state of the nodes at the target time. 
Differences in these states directly influence the prediction results.</p><p>The intermediate nodes 𝑉 𝑝 do not have specific numerical values but possess clear semantics. Therefore, we number all intermediate nodes and use one-hot encoding as their attributes, i.e., 𝒜(𝑛 𝑖 , 𝑡) = 𝑜𝑛𝑒ℎ𝑜𝑡(𝑛 𝑖 , |𝑉 𝑝 |). These intermediate node encodings reflect the path semantics, guiding the information aggregation between market indicators. The attributes of all nodes form the attribute set 𝐴 𝑡 . This graph, enriched with knowledge for the prediction 𝑙 𝑡 , is then input into the model.</p></div>
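The four-step interaction loop above can be sketched as follows. Here `query_llm` is a hypothetical stand-in for the instruction construction and LLM call of Steps 1-2, and the path identification of Step 4 is simplified to a connectivity check; both simplifications are our assumptions, not the paper's implementation.

```python
from collections import defaultdict, deque

def expand_knowledge(seed_nodes, query_llm, max_iters=3):
    """Steps 1-3 (sketch): iteratively query the LLM with the known node
    pool, validate returned triplets, and add new nodes to the pool."""
    nodes, triplets = set(seed_nodes), set()
    for _ in range(max_iters):
        # Steps 1-2: build an instruction from known nodes, query the LLM,
        # and keep only well-formed (head, relation, tail) triplets.
        candidates = query_llm(sorted(nodes))
        valid = {c for c in candidates if isinstance(c, tuple) and len(c) == 3}
        triplets |= valid
        # Step 3: update the known node pool with newly generated nodes.
        for h, _, t in valid:
            nodes.add(h)
            nodes.add(t)
    return nodes, triplets

def prune_to_paths(triplets, indicators):
    """Step 4 (simplified stand-in): keep only triplets connected, via the
    undirected triplet graph, to at least one market indicator node."""
    adj = defaultdict(set)
    for h, _, t in triplets:
        adj[h].add(t)
        adj[t].add(h)
    seen = set(n for n in indicators if n in adj)
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return {(h, r, t) for h, r, t in triplets if h in seen and t in seen}
```

For the SC/HSI example of Section 3.2, two iterations with a mocked LLM recover the three-triplet path, and pruning discards any triplet that is not connected to a market indicator.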
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">GNN-based securities index prediction</head><p>Based on the graph data 𝐺 = (𝑉, 𝐸, 𝐴), we then construct a GNN model 𝜑 𝑔𝑛𝑛 (𝐺) to predict the trend of the index. The feedforward computation process of this model, as illustrated in Figure <ref type="figure" target="#fig_3">3</ref>, consists of three main components: feature mapping, feature fusion, and classification output.</p><p>Feature mapping aims to map different types of nodes into a unified vector space. In the input heterogeneous graph, the node set 𝑉 contains 𝑘 types of nodes, and each type of node corresponds to a fixed set of feature attributes. We prepare a feature mapping function for each type of node. Given a node 𝑛 𝑖 of type 𝑗 and its associated feature values a 𝑖 , the computation of feature mapping is as follows: </p><formula xml:id="formula_1">x 𝑖 = 𝑓 𝑗 (a 𝑖 ) = 𝑊 𝑗 a 𝑖 + b 𝑗 ,<label>(2)</label></formula><p>where 𝑊 𝑗 and b 𝑗 represent the weights and bias, respectively. The vector x 𝑖 denotes the feature vector of node 𝑛 𝑖 .</p><p>Feature fusion is the process of using graph neural networks to aggregate information from nodes and edges. The feature mapping step outputs vector representations for all nodes in the graph, denoted as 𝑋 = {x 𝑠 , x 1 , • • • , x 𝑛−1 }. We employ a Heterogeneous Graph Transformer (HGT) model to learn the representations of nodes and the topological structure. The fusion vector representation of node 𝑛 𝑖 is denoted as h 𝑙 𝑖 , where 𝑙 indicates the layer number of HGT, and the initial fusion vector is h 0 𝑖 = x 𝑖 . 
The entire network consists of 𝑚 layers, and the feedforward process for each layer is calculated as:</p><formula xml:id="formula_2">h 𝑙+1 𝑖 = 𝜑 ℎ𝑔𝑡 (h 𝑙 𝑖 , {(h 𝑙 𝑗 , 𝑒 𝑖𝑗 )|𝑛 𝑗 ∈ 𝒩 (𝑛 𝑖 )}),<label>(3)</label></formula><p>where 𝒩 (𝑛 𝑖 ) represents the neighbor nodes of 𝑛 𝑖 , and 𝑒 𝑖𝑗 represents the edge between nodes 𝑛 𝑖 and 𝑛 𝑗 .</p><p>The HGT model, based on the Transformer architecture, can aggregate features from different types of nodes and edges. The vector representation of SSE index nodes thus includes not only the node's original features but also additional data features and the semantic knowledge underlying the data.</p><p>Classification output predicts the trend of the SSE index based on the high-level representation. Given the fusion feature representation h 𝑚 𝑠 of the SSE index node, we use a fully connected neural network and the softmax function to calculate the probability of the index rising or falling. This computation is defined as:</p><formula xml:id="formula_3">𝑝 = Softmax(𝑊 h 𝑚 𝑠 + b).<label>(4)</label></formula><p>The model output 𝑝 is a 2-dimensional vector, with each element corresponding to the probabilities of the index rising and falling, respectively. During model training, we utilize the cross-entropy function as the loss function and update all learning parameters through gradient descent.</p></div>
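A minimal NumPy sketch of the per-type feature mapping (Eq. 2) and the classification output (Eq. 4) may clarify the data flow. The HGT fusion layers of Eq. (3) are omitted here, and the input dimensions are illustrative stand-ins (only the 60-d embedding size comes from Section 4.1).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mapper(in_dim, out_dim):
    """Eq. (2): per-node-type affine map x_i = W_j a_i + b_j."""
    W = rng.standard_normal((out_dim, in_dim)) * 0.1
    b = np.zeros(out_dim)
    return lambda a: W @ a + b

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# One mapper per node type j, so heterogeneous attributes of different
# dimensionality land in a shared 60-d space; the raw attribute sizes
# (6 and 30) are illustrative assumptions.
mappers = {"sse_index": make_mapper(6, 60), "intermediate": make_mapper(30, 60)}
x = mappers["sse_index"](rng.standard_normal(6))

# Eq. (4): classification head over the fused SSE-index representation.
# In the full model, x would be replaced by h_s^m after m HGT layers.
W_out, b_out = rng.standard_normal((2, 60)) * 0.1, np.zeros(2)
p = softmax(W_out @ x + b_out)   # probabilities of rise / fall
```

The per-type mapping is what allows nodes with differently sized attribute vectors to coexist in one graph before fusion.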
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiment</head><p>In this section, we conduct model prediction and market backtesting experiments to validate our method. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experiment setting</head><p>Datasets. We collect the price and trading data of 175 SSE indexes from 2013 to 2021, along with 6 technical indicators and 12 market indicators. With the collected data, 364,314 prediction samples are formed. We split the datasets and conduct the experiments by year. In the prediction experiment, we adopt time-series cross-validation, as referenced in <ref type="bibr" target="#b44">[44]</ref>. The samples of each year are split into five parts: Jan to Apr, May to Jun, Jul to Aug, Sep to Oct, and Nov to Dec. For the 𝑖-th fold validation, we take the first 𝑖 parts as the training set and the 𝑖 + 1-th part as the validation set. In the backtesting experiment, we use the samples from Jan to Oct as the training set and those from Nov to Dec as the test set.</p><p>Evaluation Metrics. In the prediction experiment, we apply accuracy (Acc), precision (P), recall (R), and Macro-F1 score (F1) as metrics to evaluate the prediction performance of different models. In the backtesting experiment, we define the daily return (𝑅) as follows:</p><formula xml:id="formula_4">𝑅 𝑡 = 1 |𝐼 𝑡 | ∑︁ 𝑖∈𝐼 𝑡 𝑃 𝑡 𝑖 − 𝑃 𝑡−1 𝑖 𝑃 𝑡−1 𝑖 ,<label>(5)</label></formula><p>where 𝑃 𝑡 𝑖 denotes the price of index 𝑖 at time 𝑡, and 𝐼 𝑡 denotes the set of indexes in the portfolio held by the model at time 𝑡. We then use the average daily return (DR) and the Sharpe ratio (SR) to measure the profitability of different models. In the Sharpe ratio, we use the 1-year China Government Bond Yield as the reference for the risk-free rate. To align with the daily return 𝑅, the annual Bond Yield is divided by 365.</p><p>Implementation Details. For the label setting, we take 𝛼 = 5, 𝛽 = 2, and 𝜆 = 1.01 as specified in Eq. 1. The heterogeneous graph generated by the LLM consists of 30 nodes and 56 edges, with the time interval for node features set to 𝛾 = 5. 
During training, we use the Adam optimizer with a batch size of 512 and a learning rate of 0.001, and we implement early stopping to prevent overfitting. In our GNN model, the embedded vector dimension for feature mapping is set to 60, the hidden vector dimension of the HGT is 90, the number of attention heads is 4, and the number of layers is 𝑚 = 6. For the baseline settings, recurrent models have 4 layers with a hidden vector dimension of 360; graph-based models use an embedded vector dimension of 100, a hidden vector dimension of 120, 3 attention heads, and 4 layers.</p></div>
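The backtesting metrics above can be sketched as follows. The equal-weighted daily return follows Eq. (5); the default `annual_rf` value is an illustrative placeholder, not the actual 1-year China Government Bond Yield, and the annual-to-daily conversion divides by 365 as described in Section 4.1.

```python
import numpy as np

def daily_return(prices_today, prices_yesterday):
    """Eq. (5): equal-weighted daily return over the held index set I_t."""
    pt = np.asarray(prices_today, dtype=float)
    pp = np.asarray(prices_yesterday, dtype=float)
    return float(np.mean((pt - pp) / pp))

def sharpe_ratio(daily_returns, annual_rf=0.02):
    """Sharpe ratio of a daily return series, with the annual risk-free
    rate divided by 365 to align with daily returns.
    annual_rf=0.02 is a placeholder value for illustration."""
    r = np.asarray(daily_returns, dtype=float)
    rf = annual_rf / 365.0
    return float((r.mean() - rf) / r.std())
```

DR is then simply the mean of the `daily_return` series over the backtest window, and SR its risk-adjusted counterpart.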
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Prediction experiment</head><p>We compare our model, Knowledge-enhanced HGT (KHGT), with several baselines, including recurrent models such as LSTM and Bi-LSTM, and graph-based models such as GAT and HGT. Their average performance over 9 years is presented in Table <ref type="table" target="#tab_0">1</ref>. The results indicate that KHGT outperforms all baselines. Furthermore, yearly comparison results are illustrated in Figure <ref type="figure" target="#fig_4">4</ref>, showing that KHGT demonstrates strong generalization, achieving the highest Macro-F1 score and accuracy in most years from 2013 to 2021.</p><p>By comparing graph-based models and recurrent models, we observe that graph-based models generally do not outperform recurrent models. Specifically, GAT lags behind other baselines in terms of recall and F1 score, and HGT is slightly inferior to Bi-LSTM in terms of accuracy, recall, and F1 score. This indicates that the structural modeling in GAT and HGT alone does not provide a significant predictive enhancement. The superior performance of KHGT is thus attributed to the contribution of knowledge enhancement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Backtesting experiment</head><p>To assess the real-world profitability of KHGT, we backtest it over a period of 9 years. We adopt a straightforward and effective trading strategy, as referenced in <ref type="bibr" target="#b11">[12]</ref>: buy when the prediction is "rise" and sell when the prediction is "fall". The trading volume is set proportional to the prediction probability and inversely proportional to the index price. For simplicity, we assume that the index is tradable and ignore transaction fees. Table <ref type="table" target="#tab_1">2</ref> presents the average performance of each model, where "Market" refers to always holding all indexes. The results demonstrate that KHGT outperforms all baselines across both metrics. For the baselines, graph-based models generally lag behind recurrent models, consistent with the findings from the prediction experiment. Notably, the profitability of KHGT remains optimal across both metrics.</p></div>
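The position sizing implied by this strategy can be sketched as follows. The normalization to a fixed budget is our own simplifying assumption; the paper only states that volume is proportional to the rise probability and inversely proportional to price.

```python
def position_sizes(predictions, budget=1.0):
    """Sketch of the backtest strategy: hold only indexes predicted to
    rise (prob > 0.5), with weight proportional to the rise probability
    and inversely proportional to the index price.
    `predictions` maps index name -> (prob_rise, price)."""
    scores = {k: p / price for k, (p, price) in predictions.items() if p > 0.5}
    total = sum(scores.values())
    # Normalize so the allocated capital sums to the budget (assumption).
    return {k: budget * s / total for k, s in scores.items()} if total else {}
```

An index with a high rise probability and a low price receives the largest allocation; indexes predicted to fall are excluded from the portfolio.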
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Effectiveness of knowledge</head><p>Intervention experiment. To further verify the effectiveness of knowledge enhancement, we conducted an intervention experiment. We randomly deleted a portion of the knowledge paths between market indicators. A deletion proportion of 1 indicates no knowledge enhancement. For each proportion setting, we randomly selected paths and repeated the experiment three times, reporting the average results.</p><p>Table <ref type="table" target="#tab_2">3</ref> presents the model performance under different proportions. The performance degrades as the proportion of deleted knowledge paths increases. When all knowledge is eliminated, performance degrades to the level of plain HGT. This result implies that the LLM's knowledge has a strong enhancement effect. Without domain experts, our method can fully exploit the relationships between market indicators, thereby enhancing the predictive ability of the model in an economical and effective manner.</p><p>We observe that "index status of" and "participate in" play relatively significant roles in falling and rising markets, respectively. "Index status of" connects indicators to financial objects, reflecting the impact of market indicators on the objects they measure. "Participate in" connects these financial objects to markets, reflecting the convergence and interaction of these objects. The prominence of these two relations suggests that the model tends to focus on changes in specific indicators during falling markets and on the interplay of multiple indicators during rising markets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>This work proposes an LLM-driven knowledge enhancement method for the task of securities index prediction. The innovation of this method lies in leveraging the rich knowledge within the LLM, significantly reducing the cost and improving the efficiency of acquiring market index-related knowledge. By interacting with the LLM through instructions, we establish triplet paths among market indices, thereby forming graph data that embodies implicit knowledge. Utilizing the graph-structured market data, we construct a GNN-based prediction model, which significantly outperforms traditional models.</p><p>Limitations and future work. The quality of the knowledge obtained from the LLM depends on the LLM itself and on the design of the instructions. Knowledge quality is a crucial constraint affecting the model's predictive capability. Therefore, ensuring the quality of the knowledge is a critical issue that needs to be addressed in future work.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The data source of the SSE index prediction task</figDesc><graphic coords="3,128.41,243.21,338.47,168.31" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The process of automatically uncovering relational paths between market indicator nodes using a LLM</figDesc><graphic coords="4,72.00,65.61,451.28,95.44" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>Furthermore, we align data with knowledge to form the input for the SSE index prediction model. Assuming the prediction model aims to forecast the trend on day 𝑡, we organize the input data into a graph structure 𝐺 𝑡 = (𝑉, 𝐸, 𝐴 𝑡 ). The node set 𝑉 is divided into two categories based on their sources: market indicator nodes 𝑉 𝑚 , which include the SSE indexes among others, and intermediate nodes 𝑉 𝑝 , which are generated by the LLM and reflect the relationships between market indicators. The market indicator nodes have corresponding data values as predictive support, serving as node feature attributes 𝒜(𝑛 𝑖 , 𝑡), where 𝑛 𝑖 ∈ 𝑉 𝑚 . For instance, the SSE index node uses trading data from the period [𝑡 − 𝛾, 𝑡] as its attribute values 𝒜(𝑛 𝑖 , 𝑡) = 𝐷 [𝑡−𝛾,𝑡] 𝑆𝑆𝐸</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The model structure of securities index prediction</figDesc><graphic coords="5,162.25,65.61,270.77,232.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Performance comparison of models from 2013 to 2021</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Visualization of attention weights</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Results of Prediction Experiment 2</figDesc><table><row><cell>Model</cell><cell>Acc</cell><cell>P</cell><cell>R</cell><cell>F1</cell></row><row><cell>LSTM</cell><cell>0.6801</cell><cell>0.6505</cell><cell>0.6512</cell><cell>0.6366</cell></row><row><cell>Bi-LSTM</cell><cell>0.6890</cell><cell>0.6557</cell><cell>0.6611</cell><cell>0.6465</cell></row><row><cell>GAT</cell><cell>0.6836</cell><cell>0.6587</cell><cell>0.6451</cell><cell>0.6296</cell></row><row><cell>HGT</cell><cell>0.6868</cell><cell>0.6676</cell><cell>0.6576</cell><cell>0.6432</cell></row><row><cell>KHGT</cell><cell>0.7183*</cell><cell>0.6870*</cell><cell>0.6805*</cell><cell>0.6793*</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Results of backtesting experiment</figDesc><table><row><cell>Model</cell><cell>DR</cell><cell>SR</cell><cell>F1</cell></row><row><cell>Market</cell><cell>0.0009</cell><cell>0.0683</cell><cell></cell></row><row><cell>LSTM</cell><cell>0.0019</cell><cell>0.1824</cell><cell>0.6366</cell></row><row><cell>Bi-LSTM</cell><cell>0.0022</cell><cell>0.1909</cell><cell>0.6465</cell></row><row><cell>GAT</cell><cell>0.0012</cell><cell>0.1089</cell><cell>0.6296</cell></row><row><cell>HGT</cell><cell>0.0018</cell><cell>0.1747</cell><cell>0.6432</cell></row><row><cell>KHGT</cell><cell>0.0029</cell><cell>0.2660</cell><cell>0.6793</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Results of intervention experiment</figDesc><table><row><cell>Deletion</cell><cell>Acc</cell><cell></cell><cell>F1</cell><cell></cell></row><row><cell>Ratio</cell><cell>Mean</cell><cell>Std</cell><cell>Mean</cell><cell>Std</cell></row><row><cell>1.00</cell><cell>0.6959</cell><cell>0.0056</cell><cell>0.6623</cell><cell>0.0072</cell></row><row><cell>0.75</cell><cell>0.7094</cell><cell>0.0049</cell><cell>0.6673</cell><cell>0.0084</cell></row><row><cell>0.50</cell><cell>0.7031</cell><cell>0.0123</cell><cell>0.6705</cell><cell>0.0085</cell></row><row><cell>0.25</cell><cell>0.7077</cell><cell>0.0048</cell><cell>0.6746</cell><cell>0.0035</cell></row><row><cell>0</cell><cell>0.7274*</cell><cell>0.0045</cell><cell>0.6905*</cell><cell>0.0035</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">* denotes that the improvement is statistically significant in a paired t-test (𝑝 &lt; 0.05); the same applies below.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work is supported by the National Natural Science Foundation of China (No. 72071145).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Technical trading rules as a prior knowledge to a neural networks prediction system for the S&amp;P 500 index</title>
		<author>
			<persName><surname>Chenoweth</surname></persName>
		</author>
		<author>
			<persName><surname>Obradovic</surname></persName>
		</author>
		<author>
			<persName><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Technical Applications Conference and Workshops. Northcon/95</title>
				<imprint>
			<publisher>Citeseer</publisher>
			<date type="published" when="1995">1995</date>
			<biblScope unit="page">111</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Stock market prediction with backpropagation networks</title>
		<author>
			<persName><forename type="first">Bernd</forename><surname>Freisleben</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="1992">1992</date>
			<biblScope unit="page" from="451" to="460" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">An intelligent stock portfolio management system based on short-term trend prediction using dual-module neural networks</title>
		<author>
			<persName><forename type="first">Gia-Shuh</forename><surname>Jang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 1991 International Conference on Artificial Neural Networks</title>
				<meeting>of the 1991 International Conference on Artificial Neural Networks</meeting>
		<imprint>
			<date type="published" when="1991">1991</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="447" to="452" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Stock market index prediction using neural networks</title>
		<author>
			<persName><forename type="first">Darmadi</forename><surname>Komo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chein-I</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hanseok</forename><surname>Ko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Applications of Artificial Neural Networks V</title>
				<imprint>
			<publisher>SPIE</publisher>
			<date type="published" when="1994">1994</date>
			<biblScope unit="volume">2243</biblScope>
			<biblScope unit="page" from="516" to="526" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Stock market index prediction using deep neural network ensemble</title>
		<author>
			<persName><forename type="first">Bing</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zi-Jia</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wenqi</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2017 36th Chinese Control Conference (CCC)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="3882" to="3887" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Investigating the informativeness of technical indicators and news sentiment in financial market price prediction</title>
		<author>
			<persName><forename type="first">Saeede</forename><surname>Anbaee Farimani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">247</biblScope>
			<biblScope unit="page">108742</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A tensor-based information framework for predicting the stock market</title>
		<author>
			<persName><forename type="first">Qing</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Information Systems (TOIS)</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">EAN: Event attention network for stock price trend prediction based on sentimental embedding</title>
		<author>
			<persName><forename type="first">Yaowei</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th ACM conference on web science</title>
				<meeting>the 10th ACM conference on web science</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="311" to="320" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">F-HMTC: Detecting Financial Events for Investment Decisions Based on Neural Hierarchical Multi-Label Text Classification</title>
		<author>
			<persName><forename type="first">Xin</forename><surname>Liang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IJCAI 2020</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="4490" to="4496" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Essential tensor learning for multimodal information-driven stock movement prediction</title>
		<author>
			<persName><forename type="first">Jun</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">262</biblScope>
			<biblScope unit="page">110262</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A self-regulated generative adversarial network for stock price movement prediction based on the historical price and tweets</title>
		<author>
			<persName><forename type="first">Hongfeng</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Donglin</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shaozi</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">247</biblScope>
			<biblScope unit="page">108712</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Financial time series forecasting with multi-modality graph neural network</title>
		<author>
			<persName><forename type="first">Dawei</forename><surname>Cheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">121</biblScope>
			<biblScope unit="page">108218</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Exploring graph neural networks for stock market predictions with rolling window analysis</title>
		<author>
			<persName><forename type="first">Daiki</forename><surname>Matsunaga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Toyotaro</forename><surname>Suzumura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Toshihiro</forename><surname>Takahashi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.10660</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Deep learning for stock prediction using numerical and textual information</title>
		<author>
			<persName><forename type="first">Ryo</forename><surname>Akita</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding</title>
		<author>
			<persName><forename type="first">Zhige</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</title>
				<meeting>the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="894" to="902" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Knowledge-driven stock trend prediction and explanation via temporal convolutional network</title>
		<author>
			<persName><forename type="first">Shumin</forename><surname>Deng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Companion proceedings of the 2019 world wide web conference</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="678" to="685" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Stock market forecasting using machine learning algorithms</title>
		<author>
			<persName><forename type="first">Shunrong</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Haomiao</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tongda</forename><surname>Zhang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="1" to="5" />
			<pubPlace>Stanford, CA</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Department of Electrical Engineering, Stanford University</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Temporal and Heterogeneous Graph Neural Network for Financial Time Series Prediction</title>
		<author>
			<persName><forename type="first">Sheng</forename><surname>Xiang</surname></persName>
		</author>
		<idno type="DOI">10.1145/3511808.3557089</idno>
		<ptr target="http://dx.doi.org/10.1145/3511808.3557089" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st ACM International Conference on Information and Knowledge Management</title>
				<meeting>the 31st ACM International Conference on Information and Knowledge Management</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2022-10">Oct. 2022</date>
		</imprint>
	</monogr>
	<note>CIKM &apos;22</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Applications of deep learning in stock market prediction: recent progress</title>
		<author>
			<persName><forename type="first">Weiwei</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">184</biblScope>
			<biblScope unit="page">115537</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Learning representations by back-propagating errors</title>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">E</forename><surname>Rumelhart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Geoffrey</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ronald</forename><forename type="middle">J</forename><surname>Williams</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:205001834" />
	</analytic>
	<monogr>
		<title level="j">Nature</title>
		<imprint>
			<biblScope unit="volume">323</biblScope>
			<biblScope unit="page" from="533" to="536" />
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">Sepp</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jürgen</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Recurrent Neural Networks Approach to the Financial Forecast of Google Assets</title>
		<author>
			<persName><forename type="first">Luca</forename><surname>Di Persio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Oleksandr</forename><surname>Honchar</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:67797781" />
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Ensemble Application of Transfer Learning and Sample Weighting for Stock Market Prediction</title>
		<author>
			<persName><forename type="first">Simone</forename><surname>Merello</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:203606095" />
	</analytic>
	<monogr>
		<title level="m">2019 International Joint Conference on Neural Networks (IJCNN)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Learning Dynamic Multimodal Implicit and Explicit Networks for Multiple Financial Tasks</title>
		<author>
			<persName><forename type="first">Gary</forename><surname>Ang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ee-Peng</forename><surname>Lim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2022 IEEE International Conference on Big Data (Big Data)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Listening to Chaotic Whispers: A Deep Learning Framework for News-oriented Stock Trend Prediction</title>
		<author>
			<persName><forename type="first">Ziniu</forename><surname>Hu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining</title>
				<meeting>the Eleventh ACM International Conference on Web Search and Data Mining</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="261" to="269" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">The cross-sectional relationship between trading costs and lead/lag effects in stock &amp; option markets</title>
		<author>
			<persName><forename type="first">Matthew</forename><forename type="middle">L</forename><surname>O&apos;Connor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Financial Review</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="95" to="117" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Knowledge graph-based event embedding framework for financial quantitative investments</title>
		<author>
			<persName><forename type="first">Dawei</forename><surname>Cheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</title>
				<meeting>the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2221" to="2230" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Combining Enterprise Knowledge Graph and News Sentiment Analysis for Stock Price Prediction</title>
		<author>
			<persName><forename type="first">Jue</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhuocheng</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">P</forename><surname>Du</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:102352388" />
	</analytic>
	<monogr>
		<title level="m">Hawaii International Conference on System Sciences</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey</title>
		<author>
			<persName><forename type="first">Liping</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.04947</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>q-fin.ST</note>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Modeling the stock relation with graph network for overnight stock movement prediction</title>
		<author>
			<persName><forename type="first">Wei</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence</title>
				<meeting>the Twenty-Ninth International Joint Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="4541" to="4547" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Learning Embedded Representation of the Stock Correlation Matrix using Graph Machine Learning</title>
		<author>
			<persName><forename type="first">Bhaskarjit</forename><surname>Sarmah</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2207.07183</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>q-fin.CP</note>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Temporal Relational Ranking for Stock Prediction</title>
		<author>
			<persName><forename type="first">Fuli</forename><surname>Feng</surname></persName>
		</author>
		<idno type="DOI">10.1145/3309547</idno>
		<ptr target="http://dx.doi.org/10.1145/3309547" />
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Information Systems</title>
		<idno type="ISSN">1558-2868</idno>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2019-03">Mar. 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Hypergraph-Based Reinforcement Learning for Stock Portfolio Selection</title>
		<author>
			<persName><forename type="first">Xiaojie</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICASSP43922.2022.9747138</idno>
	</analytic>
	<monogr>
		<title level="m">ICASSP 2022 -2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="4028" to="4032" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Deep attentive learning for stock movement prediction from social media text and company correlations</title>
		<author>
			<persName><forename type="first">Ramit</forename><surname>Sawhney</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
				<meeting>the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="8415" to="8426" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Time-aware Graph Relational Attention Network for Stock Recommendation</title>
		<author>
			<persName><forename type="first">Xiaoting</forename><surname>Ying</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CIKM &apos;20</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2281" to="2284" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">News-driven stock prediction via noisy equity state representation</title>
		<author>
			<persName><forename type="first">Heyan</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neurocomputing</title>
		<imprint>
			<biblScope unit="volume">470</biblScope>
			<biblScope unit="page" from="66" to="75" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Stock market prediction using machine learning classifiers and social media, news</title>
		<author>
			<persName><forename type="first">Wasiat</forename><surname>Khan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Ambient Intelligence and Humanized Computing</title>
		<imprint>
			<biblScope unit="page" from="1" to="24" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Incorporating Corporation Relationship via Graph Convolutional Neural Networks for Stock Price Prediction</title>
		<author>
			<persName><forename type="first">Yingmei</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhongyu</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xuanjing</forename><surname>Huang</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:53037746" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 27th ACM International Conference on Information and Knowledge Management</title>
				<meeting>the 27th ACM International Conference on Information and Knowledge Management</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<title level="m" type="main">MDGNN: Multi-Relational Dynamic Graph Neural Network for Comprehensive and Dynamic Stock Investment Prediction</title>
		<author>
			<persName><forename type="first">Hao</forename><surname>Qian</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2402.06633</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>q-fin.ST</note>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">REST: Relational Event-driven Stock Trend Forecasting</title>
		<author>
			<persName><forename type="first">Wentao</forename><surname>Xu</surname></persName>
		</author>
		<idno type="DOI">10.1145/3442381.3450032</idno>
		<ptr target="http://dx.doi.org/10.1145/3442381.3450032" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Web Conference 2021</title>
				<meeting>the Web Conference 2021</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2021-04">Apr. 2021</date>
		</imprint>
	</monogr>
	<note>WWW &apos;21</note>
</biblStruct>

<biblStruct xml:id="b41">
	<monogr>
		<title level="m" type="main">Event Extraction as Question Generation and Answering</title>
		<author>
			<persName><forename type="first">Di</forename><surname>Lu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.05567</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>cs.CL</note>
</biblStruct>

<biblStruct xml:id="b42">
	<monogr>
		<title level="m" type="main">Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction</title>
		<author>
			<persName><forename type="first">Ji</forename><surname>Qi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.13981</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>cs.CL</note>
</biblStruct>

<biblStruct xml:id="b43">
	<monogr>
		<title level="m" type="main">Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors</title>
		<author>
			<persName><forename type="first">Kai</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bernal</forename><surname>Jiménez Gutiérrez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yu</forename><surname>Su</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.11159</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>cs.CL</note>
</biblStruct>

<biblStruct xml:id="b44">
	<analytic>
		<title level="a" type="main">Evaluating time series forecasting models: an empirical study on performance estimation methods</title>
		<author>
			<persName><forename type="first">Vitor</forename><surname>Cerqueira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Luis</forename><surname>Torgo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Igor</forename><surname>Mozetič</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10994-020-05910-7</idno>
		<ptr target="http://dx.doi.org/10.1007/s10994-020-05910-7" />
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<idno type="ISSN">1573-0565</idno>
		<imprint>
			<biblScope unit="volume">109</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="1997" to="2028" />
			<date type="published" when="2020-10">Oct. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
