<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Research on DQN Stock Trading Strategy Based on Investor's Compound Sentiment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ningjing Yang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Deyi Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yicheng Gong</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guici Chen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hubei Provincial Key Laboratory of Metallurgical Industry Process System Science, Wuhan University of Science and Technology</institution>
          ,
          <addr-line>Wuhan 430081</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Science College, Wuhan University of Science and Technology</institution>
          ,
          <addr-line>Wuhan 430065</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>26</fpage>
      <lpage>33</lpage>
      <abstract>
        <p>Both individual and institutional investor sentiment affect investors' stock trading decisions. Most existing investor sentiment analyses ignore institutional investor sentiment and treat individual investor sentiment as overall investor sentiment. Quantifying the two types of investor sentiment, combining them into a composite investor sentiment, and exploring its impact on stock investment strategies is conducive to optimizing investment decisions and investment returns. Considering both individual and institutional investors in the stock market, this paper proposes a stock composite sentiment score that measures overall investor sentiment through text mining and VADER sentiment analysis. It then constructs a DQN single stock trading model based on the composite investor sentiment score using the reinforcement learning DQN algorithm. Experimental comparison with the buy-and-hold and DQN strategies on real stock data shows that investor sentiment can effectively optimize stock trading strategies and improve investment returns, and that composite investor sentiment yields a larger improvement than individual investor sentiment alone.</p>
      </abstract>
      <kwd-group>
        <kwd>stock trading</kwd>
        <kwd>sentiment analysis</kwd>
        <kwd>VADER</kwd>
        <kwd>reinforcement learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The efficient market hypothesis holds that market participants are all rational economic agents, that
prices in the financial market already reflect all market information, and that rational economic
agents make reasonable decisions based on stock prices. However, the efficient market hypothesis
cannot explain some phenomena, such as the equity premium puzzle and the herding effect. Such
phenomena suggest that investors' behavior is often influenced by imitative learning among groups and
by emotional contagion, and that changes in investor sentiment can also affect stock prices and trading
volumes. Studying investor sentiment is therefore important for understanding the financial market.</p>
      <p>The measurement of investor sentiment in the financial market is an evolving topic. Initially,
researchers usually chose a single indicator to reflect investor sentiment. Fisher[1] and Schmeling[2] used
the consumer confidence index as an indicator of investor sentiment and found that investor sentiment
could predict stock market returns to a certain extent. Chinese scholars mostly use the CCTV watch
index to measure investor sentiment. Gao[3] found that investor sentiment is related to short-term market
returns. Subsequent studies found that a single indicator is not representative enough to reflect
investor sentiment, so scholars have tried to construct investor sentiment from multiple basic indicators.
Baker[4] constructed a sentiment index through principal component analysis of six basic indicators.
Liu[5] also used multiple market variables to build an investor sentiment index and showed empirically
that sentiment has a positive effect on the market reaction to earnings announcements. However, sentiment
indicators composed of multiple indicators result from the mutual equilibrium of multiple macro variables
other than sentiment; they express the overall market sentiment and cannot directly represent investors'
sentiment.</p>
      <p>With the development of technology, more and more financial investors are gathering in financial
forums and expressing their opinions. Through text analysis, researchers can extract investor sentiment
directly from forum posts. Smailovic[6] and Bollen[7] showed that sentiment analysis of large Twitter
text datasets could effectively predict stock market movements. Li[8][9] constructed a quantitative trader
that uses publicly available online news, social media data, and company-specific news sentiment
data to predict stock price movements. Picasso[10] combined technical analysis indicators with sentiment
extracted from news articles to build a robust stock price prediction model. These studies extracted
investor sentiment for financial markets directly from social media through sentiment analysis and
achieved good results. However, they focus only on the sentiment of individual investors and
ignore the sentiment of institutional investors.</p>
      <p>In recent years, deep reinforcement learning has been successfully applied to optimize stock trading
strategies and portfolio allocation. Xiong[11] used the deep deterministic policy gradient algorithm to
conduct stock trading, which significantly improved the trading profit. Carta[12] proposed a method of
repeatedly training DQN agents to reduce strategic risk and maximize investment return. Xu[13]
proposed a deep reinforcement learning automated trading algorithm that incorporates a CNN;
experimental results on real stock data show that the method significantly outperforms other
benchmark methods. Prahlad[14] proposed an adaptive deep reinforcement learning method to train
agents to allocate their portfolios. This method uses not only historical stock price data but also
the market sentiment of the portfolio. Experiments show that this method achieves a more robust investment
return than the existing baselines. Existing research thus shows that reinforcement learning algorithms have
been successfully applied to simulate stock trading and optimize stock trading strategies.</p>
      <p>Financial investors are mainly composed of individual and institutional investors. This paper
investigates the impact of investor sentiment on the trading strategy of a single stock based on these two
types of investors. It uses the reinforcement learning DQN algorithm to simulate the trading
behavior of a single stock, and compares and analyzes the impact of individual investor sentiment alone
and of composite investor sentiment on the stock trading strategy.</p>
    </sec>
    <sec id="sec-2">
      <title>2. DQN stock trading model based on composite investor sentiment</title>
      <p>In order to study the influence of investors' composite sentiment on stock trading strategies and
investment returns, this paper combines sentiment analysis and reinforcement learning to construct a
DQN single stock trading model (named DQN-CE) based on investors' composite sentiment.</p>
      <p>The model shown in Fig.1 can be divided into two modules, the investor sentiment quantification
module and the reinforcement learning stock trading module. The investor sentiment quantification
module measures the overall sentiment of investors. In this module, VADER sentiment analysis
is applied to the relevant texts of individual and institutional investors to calculate the individual investor
sentiment score <inline-formula><tex-math>S^{\mathrm{ind}}_{i,t}</tex-math></inline-formula> and the institutional investor sentiment score
<inline-formula><tex-math>S^{\mathrm{ins}}_{i,t}</tex-math></inline-formula>. The composite investor sentiment
score sequence <inline-formula><tex-math>\{CS_{i,t}\}</tex-math></inline-formula> is constructed from the two types of investor sentiment scores and is
fed into the reinforcement learning stock trading module together with stock price data. The reinforcement
learning stock trading module simulates stock trading behavior and learns stock trading strategies.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Investor sentiment quantification module</title>
      <p>VADER is based on a large dictionary containing sentiment intensity scores for thousands of words,
punctuation marks, and web terms. Each word in a text is scored by looking up its
intensity in the dictionary, and the sentiment score of the complete text is obtained by weighting these word scores.
A compound sentiment score is usually used to indicate the sentiment tendency of the whole text; it ranges
from -1 (most negative) to 1 (most positive).</p>
      <p>As shown in the investor sentiment quantification module in Fig.1, the composite investor sentiment
is expressed as the mean of the individual investor sentiment score and the institutional investor
sentiment score, as in equation (<xref ref-type="disp-formula" rid="eq1">1</xref>).</p>
      <disp-formula id="eq1"><label>(1)</label><tex-math>CS_{i,t} = \frac{S^{\mathrm{ind}}_{i,t} + S^{\mathrm{ins}}_{i,t}}{2}</tex-math></disp-formula>
      <p>In equation (<xref ref-type="disp-formula" rid="eq1">1</xref>), <inline-formula><tex-math>S^{\mathrm{ind}}_{i,t}</tex-math></inline-formula> and
<inline-formula><tex-math>S^{\mathrm{ins}}_{i,t}</tex-math></inline-formula> respectively represent the sentiment scores of individual investors and
institutional investors of a single stock i on day t. The composite sentiment score <inline-formula><tex-math>CS_{i,t}</tex-math></inline-formula> represents the
overall sentiment of all investors towards stock i, which is more representative than the sentiment of a
specific type of investor. The daily composite sentiment scores of a single stock constitute the sequence
<inline-formula><tex-math>\{CS_{i,t}\}</tex-math></inline-formula>.</p>
      <p>The text for individual investor sentiment analysis of a single stock is obtained from the corresponding
stock forum on the Eastern Wealth website. All posts are grouped by date to calculate the sentiment
score, and the sentiment scores of posts are weighted according to the number of reads and comments
on the posts. Equation (<xref ref-type="disp-formula" rid="eq2">2</xref>) gives the individual investor sentiment score
<inline-formula><tex-math>S^{\mathrm{ind}}_{i,t}</tex-math></inline-formula> for day t.</p>
      <disp-formula id="eq2"><label>(2)</label><tex-math>S^{\mathrm{ind}}_{i,t} = \sum_{j=1}^{n_{i,t}} \frac{\log(R_j + C_j)}{\sum_{k=1}^{n_{i,t}} \log(R_k + C_k)} \ast s^{\mathrm{post}}_j</tex-math></disp-formula>
      <p>In equation (<xref ref-type="disp-formula" rid="eq2">2</xref>), <inline-formula><tex-math>n_{i,t}</tex-math></inline-formula>
denotes the number of posts about stock i on day t, <inline-formula><tex-math>R_j</tex-math></inline-formula> and <inline-formula><tex-math>C_j</tex-math></inline-formula>
denote the number of reads and comments on post j, respectively, and <inline-formula><tex-math>s^{\mathrm{post}}_j</tex-math></inline-formula>
denotes the sentiment score of the stock comment. The individual investor sentiment score
<inline-formula><tex-math>S^{\mathrm{ind}}_{i,t}</tex-math></inline-formula> represents the overall sentiment tendency of
individual investors towards stock i on day t.</p>
      <p>The text for institutional investor sentiment analysis is derived from individual stock research reports
on stock reporting websites. The complete research report is too lengthy to facilitate analysis. Therefore,
only the report titles are used for sentiment analysis to calculate the institutional sentiment score
<inline-formula><tex-math>S^{\mathrm{ins}}_{i,t}</tex-math></inline-formula>, as shown in equation (<xref ref-type="disp-formula" rid="eq3">3</xref>).</p>
      <disp-formula id="eq3"><label>(3)</label><tex-math>S^{\mathrm{ins}}_{i,t} = \frac{1}{m_{i,t}} \ast \sum_{k=1}^{m_{i,t}} s^{\mathrm{rep}}_k</tex-math></disp-formula>
      <p>In equation (<xref ref-type="disp-formula" rid="eq3">3</xref>), <inline-formula><tex-math>m_{i,t}</tex-math></inline-formula>
denotes the number of stock research reports issued for single stock i on day t, and
<inline-formula><tex-math>s^{\mathrm{rep}}_k</tex-math></inline-formula> denotes the sentiment score of research report k. The institutional investor sentiment
score indicates the sentiment tendency of all institutional investors towards stock i on day t.</p>
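      <p>The sentiment quantification described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the per-post weight log(reads + comments) is one reading of the weighting rule, and the handling of posts with almost no engagement is an added assumption. Post and report-title scores are assumed to be VADER compound scores in [-1, 1] computed beforehand.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def individual_sentiment(posts):
    """Engagement-weighted mean of post scores for one stock on one day.

    posts: list of (score, reads, comments) tuples.
    Assumption: a post's weight is log(reads + comments); posts with
    reads + comments <= 1 get zero weight in this sketch.
    """
    if not posts:
        return 0.0  # assumption: neutral score on days without posts
    weights = [math.log(reads + comments) if reads + comments > 1 else 0.0
               for _, reads, comments in posts]
    total = sum(weights)
    if total == 0.0:
        # Assumption: fall back to a plain mean when no post has usable engagement.
        return sum(score for score, _, _ in posts) / len(posts)
    return sum(w * score for w, (score, _, _) in zip(weights, posts)) / total


def institutional_sentiment(title_scores):
    """Plain mean of the research-report title scores issued on one day."""
    if not title_scores:
        return 0.0  # assumption: neutral score on days without reports
    return sum(title_scores) / len(title_scores)


def composite_sentiment(individual, institutional):
    """Mean of the individual and institutional sentiment scores."""
    return (individual + institutional) / 2.0


if __name__ == "__main__":
    posts = [(0.6, 1200, 35), (-0.2, 90, 4)]  # made-up illustrative values
    ind = individual_sentiment(posts)
    ins = institutional_sentiment([0.4, 0.1])
    print(composite_sentiment(ind, ins))
```
]]></preformat>
      <p>The logarithm damps the influence of a few very popular posts; how days without posts or reports should be scored is not specified in the text, so the neutral value used here is only a placeholder.</p>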
    </sec>
    <sec id="sec-4">
      <title>2.2 Reinforcement learning based stock trading module</title>
      <p>In the reinforcement learning stock trading module in Fig.1, the DQN network receives information
about the stock and takes trading actions according to the trading rules. The environment and trading
rules of the real stock market are too complex for an agent to grasp the whole picture. To help
the agent learn stock trading strategies quickly, we make three assumptions that simplify
the stock trading environment.</p>
      <p>Assumption (1): The amount of funds and stocks owned by the agent is limited, which is not enough
to affect the stock trading environment.</p>
      <p>Assumption (2): The agent can trade once a day, choosing to buy, hold, or sell the current
stock, and the trade is executed at the closing price of the day.</p>
      <p>Assumption (3): The number of shares per trade is the number of all shares held by the agent.</p>
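      <p>Given these assumptions about the trading environment and trading rules of a single stock, we
formalize the stock trading process as a Markov decision process, expressed as (s, a, p, r, γ), where γ
is the discount factor.</p>
      <p>Each state s in the state set S is represented as a tuple
<inline-formula><tex-math>(p_{i,t}, h_{i,t}, b_t, CS_{i,t})</tex-math></inline-formula>.
<inline-formula><tex-math>p_{i,t} \in \mathbb{R}_{+}</tex-math></inline-formula> denotes the closing price of
stock i on day t; <inline-formula><tex-math>h_{i,t} \in \mathbb{Z}_{+}</tex-math></inline-formula> denotes the number of shares of stock i held by the agent on day t;
<inline-formula><tex-math>b_t \in \mathbb{R}_{+}</tex-math></inline-formula> denotes the agent's remaining cash balance; and
<inline-formula><tex-math>CS_{i,t}</tex-math></inline-formula> is the composite sentiment score of stock i
calculated by the investor sentiment quantification module.</p>
      <p>A is the action space of the agent, with <inline-formula><tex-math>a_t \in \{-1, 0, 1\}</tex-math></inline-formula>, where</p>
      <disp-formula id="eq4"><label>(4)</label><tex-math>a_t = \begin{cases} -1, \ \text{sell} \\ 0, \ \text{hold} \\ 1, \ \text{buy} \end{cases}</tex-math></disp-formula>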
      <p>The trading action is chosen by an ε-greedy rule, and the model initially sets a greedy value
ε. The program draws a random number from the uniform distribution on (0, 1) and compares it with ε.
If the random number is less than the greedy value, the action with the largest
Q value in the current state among the three actions is selected as the trading action;
otherwise, one of {-1, 0, 1} is selected at random as the trading action.</p>
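      <p>The action selection rule above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function name and the list-of-three-Q-values interface are hypothetical. Note that, following the description in the text, ε here is the probability of acting greedily, which is the reverse of the more common convention in which ε is the exploration probability.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random

ACTIONS = (-1, 0, 1)  # sell, hold, buy


def select_action(q_values, epsilon, rng=random):
    """Pick a trading action from three Q values aligned with ACTIONS.

    A uniform random number is compared with epsilon: below epsilon the
    action with the largest Q value is taken, otherwise an action is drawn
    uniformly at random (convention taken from the text).
    """
    if rng.random() < epsilon:
        best = max(range(len(ACTIONS)), key=lambda k: q_values[k])
        return ACTIONS[best]
    return rng.choice(ACTIONS)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator for a repeatable demo
    print(select_action([0.1, 0.9, 0.2], epsilon=0.9, rng=rng))
```
]]></preformat>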
      <p>The reward is the immediate reward received by the agent after executing a trading action. The
agent adjusts its subsequent trading actions according to the received reward value, and the
accumulated reward over all trading dates measures the quality of the
trading strategy. Considering the impact of both stock price changes and investor sentiment on stock trading
strategies, the reward of the model is defined as the sum of two parts, as shown in equation (<xref ref-type="disp-formula" rid="eq5">5</xref>).</p>
      <disp-formula id="eq5"><label>(5)</label><tex-math>r_t = \frac{p_{i,t+1} - p_{i,t}}{p_{i,t}} + 0.1 \ast CS_{i,t}</tex-math></disp-formula>
      <p>In equation (<xref ref-type="disp-formula" rid="eq5">5</xref>),
<inline-formula><tex-math>(p_{i,t+1} - p_{i,t}) / p_{i,t}</tex-math></inline-formula> is the return reward part of stock i, where
<inline-formula><tex-math>p_{i,t}</tex-math></inline-formula> is the closing price of stock i on day t and
<inline-formula><tex-math>p_{i,t+1}</tex-math></inline-formula> is the closing price on day t + 1.
<inline-formula><tex-math>0.1 \ast CS_{i,t}</tex-math></inline-formula> is the sentiment reward part of stock i, which is
scaled by a factor of 0.1 so that the sentiment reward part and the return reward part
have a comparable size. The agent can thus also adjust its strategy through the sentiment
component of the reward.</p>
      <p>The action value function <inline-formula><tex-math>Q^{\pi}(s_t, a_t)</tex-math></inline-formula> represents the cumulative expected value obtained by selecting
action <inline-formula><tex-math>a_t</tex-math></inline-formula> following the policy π in the state <inline-formula><tex-math>s_t</tex-math></inline-formula>, which is expressed as equation (<xref ref-type="disp-formula" rid="eq6">6</xref>).</p>
      <disp-formula id="eq6"><label>(6)</label><tex-math>Q^{\pi}(s_t, a_t) = \mathbb{E}_{s_{t+1}}\left[ r(s_t, a_t, s_{t+1}) + \gamma\, Q^{\pi}(s_{t+1}, a_{t+1}) \right]</tex-math></disp-formula>
      <p>The goal of Q-learning is to maximize <inline-formula><tex-math>Q(s_t, a_t)</tex-math></inline-formula>,
i.e., to maximize the expected value
of the future return of a given state-action pair, as shown in equation (7).</p>
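      <p>As a sketch of the two-part reward, the function below adds a relative price change to a scaled sentiment term. It assumes that the return part is the relative change of the closing price from day t to day t + 1 and that the sentiment part is 0.1 times the composite sentiment score of day t; the function and parameter names are hypothetical and the example values are made up.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def reward(close_today, close_next, composite_sentiment, sentiment_weight=0.1):
    """Return reward plus scaled sentiment reward for one trading day.

    close_today, close_next: closing prices on day t and day t + 1.
    composite_sentiment: composite sentiment score in [-1, 1].
    sentiment_weight: scaling factor for the sentiment part (0.1 in the text).
    """
    if close_today <= 0:
        raise ValueError("closing price must be positive")
    price_return = (close_next - close_today) / close_today
    return price_return + sentiment_weight * composite_sentiment


if __name__ == "__main__":
    # A 10% price rise with mildly positive sentiment (illustrative values).
    print(reward(10.0, 11.0, 0.5))
```
]]></preformat>
      <p>With daily relative price changes typically within a few percent and sentiment scores in [-1, 1], the 0.1 factor keeps the two parts on a similar scale, which is the motivation given in the text.</p>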
      <p>As shown in Fig.1, the agent, composed of two networks, continuously interacts with the environment
to generate experience data <inline-formula><tex-math>(s_t, a_t, r_t, s_{t+1})</tex-math></inline-formula> and stores it in the experience replay buffer. When a sufficient amount
of experience data has been stored, a mini-batch is randomly sampled from
the buffer to train the neural network. <inline-formula><tex-math>Q(s_t, a_t; \theta)</tex-math></inline-formula> is the output of the main Q-network, which
evaluates the value function of the current state-action pair, and its parameter θ
is updated in real time; <inline-formula><tex-math>Q(s_{t+1}, a_{t+1}; \theta^{-})</tex-math></inline-formula> is the output of the target Q-network, which is used to
calculate the target value <inline-formula><tex-math>y_t</tex-math></inline-formula>, as shown in equation (<xref ref-type="disp-formula" rid="eq8">8</xref>).</p>
      <disp-formula id="eq8"><label>(8)</label><tex-math>y_t = r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta^{-})</tex-math></disp-formula>
      <p>Thus, when the agent takes action <inline-formula><tex-math>a_t</tex-math></inline-formula> in the environment, the loss function L(θ) can be calculated, as
shown in equation (<xref ref-type="disp-formula" rid="eq9">9</xref>).</p>
      <disp-formula id="eq9"><label>(9)</label><tex-math>L(\theta) = \mathbb{E}\left[ \left( r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta^{-}) - Q(s_t, a_t; \theta) \right)^{2} \right]</tex-math></disp-formula>
      <p>The gradient of the loss function L(θ) in equation (<xref ref-type="disp-formula" rid="eq9">9</xref>) is used to
update the parameters θ of the main Q-network. After a certain number of iterations, the main Q-network copies its parameters θ to the
target Q-network parameters <inline-formula><tex-math>\theta^{-}</tex-math></inline-formula>, which completes one round of the learning process.</p>
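      <p>The target value, loss, and periodic parameter copy described above can be illustrated with plain Python lists and dictionaries. This is a simplified sketch under stated assumptions, not the PyTorch implementation used in the experiments: the function names are hypothetical, Q values are passed in as precomputed numbers instead of network outputs, terminal states are ignored, and the loss is taken to be the mean squared difference between target and prediction over the mini-batch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def td_targets(rewards, next_q_target, gamma):
    """Target y = r + gamma * max over next actions of the target-network Q.

    rewards: list of immediate rewards, one per sampled transition.
    next_q_target: list of [Q(s', -1), Q(s', 0), Q(s', 1)] from the target network.
    """
    return [r + gamma * max(q_next) for r, q_next in zip(rewards, next_q_target)]


def dqn_loss(q_taken, targets):
    """Mean squared error between targets and main-network Q of the taken actions."""
    errors = [(y - q) ** 2 for y, q in zip(targets, q_taken)]
    return sum(errors) / len(errors)


def sync_target(main_params, target_params, step, sync_every):
    """Copy main-network parameters into the target network every sync_every steps."""
    if step % sync_every == 0:
        target_params.clear()
        target_params.update(main_params)
    return target_params


if __name__ == "__main__":
    targets = td_targets([1.0, 0.0], [[0.0, 2.0, 1.0], [0.5, 0.5, 0.5]], gamma=0.5)
    print(targets, dqn_loss([1.0, 0.25], targets))
```
]]></preformat>
      <p>Keeping the target parameters fixed between copies means the target values do not shift with every gradient step, which is the usual reason for using two networks in DQN.</p>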
    </sec>
    <sec id="sec-5">
      <title>3. Experiments and results analysis</title>
    </sec>
    <sec id="sec-6">
      <title>3.1 Experimental setup</title>
      <p>The experiments were implemented in Python 3.7. PyTorch was used
to build and train the deep learning network and the reinforcement learning DQN agent, the Python matplotlib
3.1 library to visualize the data, and the VADER module in the NLTK library to calculate the
sentiment score. The transaction cost is set at 0.05% of the transaction amount, the mini-batch size is
512, and the initial learning rate is 0.01.</p>
      <p>The experimental data come from Yahoo Finance and cover four stocks that have attracted much
attention in the A-share market: Sany, Gree Electric, China Merchants Bank, and UNIS. The
data of each stock are divided into two parts: the data from January 2018 to December 2020 are used as
the training dataset for model training, and the data from January 2021 to January 2022
are used as the test dataset for model back-testing.</p>
      <p>To explore the effect of investor sentiment on stock trading strategies and investment returns, a DQN
stock trading model based on individual investor sentiment (denoted as DQN-SE) and a DQN stock
trading model based on the composite investor sentiment score (denoted as DQN-CE) are constructed.
To assess their effectiveness, the two models are compared with two benchmark strategies,
the buy-and-hold strategy (denoted as B&amp;H) and the DQN algorithmic trading strategy without sentiment (denoted as
DQN), and all strategies are evaluated using cumulative stock returns and Sharpe ratios.</p>
    </sec>
    <sec id="sec-7">
      <title>3.2 Experimental results and analysis</title>
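      <p>In the DQN-SE model, only the influence of individual investor sentiment on stock trading strategies
and returns is considered. The individual investor sentiment score replaces the composite investor
sentiment score as a state component and in the reward of equation (5), so that the reward becomes
<inline-formula><tex-math>r_t = (p_{i,t+1} - p_{i,t}) / p_{i,t} + 0.1 \ast S^{\mathrm{ind}}_{i,t}</tex-math></inline-formula>,
where <inline-formula><tex-math>p_{i,t}</tex-math></inline-formula> is the closing price of stock i on day t and
<inline-formula><tex-math>S^{\mathrm{ind}}_{i,t}</tex-math></inline-formula> is the individual investor sentiment score. The
models DQN-SE, DQN-CE, and DQN were back-tested on the test set after 1000 rounds of training on
the training set.</p>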
      <p>As can be seen in Fig.2 and Fig.3, the stock prices of Sany and Gree Electric showed a clear downward
trend, and all four trading strategies showed losses with negative returns. Among them, the return curve
of the DQN-CE trading strategy is higher than those of the other strategies: DQN-CE has the smallest loss
and B&amp;H the largest.</p>
      <p>As shown in Fig.4, the overall trend of the UNIS stock price is upward, and all four trading strategies show
profits. The return curve of the DQN-CE strategy is the highest, with a return of over 30%;
the DQN-SE strategy also returns over 20%, while the DQN and B&amp;H strategies both
return around 10%.</p>
      <p>As shown in Fig.5, the return curves of the four trading strategies on China Merchants Bank stocks
vary widely, with the DQN-CE strategy, DQN strategy, and B&amp;H strategy showing positive returns and
the B&amp;H strategy gaining the most; trading with the DQN-SE strategy shows negative returns.</p>
      <p>Compared to the B&amp;H strategy, the three trading strategies DQN, DQN-SE, and DQN-CE achieved higher
returns on three stocks, Sany, Gree Electric, and UNIS, but the B&amp;H strategy had the highest return
curve on the China Merchants Bank stock. China Merchants Bank is a stable, high market capitalization
stock whose price showed a fairly steady upward trend during the training period but
larger, cyclical fluctuations during the testing period. In addition, there is an inevitable lag
between individual investor sentiment and the trend of stock price movement, so profiting from
sentiment signals is not easy.</p>
      <p>Compared to the DQN strategy model without investor sentiment analysis, the DQN-SE and
DQN-CE models with investor sentiment analysis have higher return curves on all four stocks, with the
DQN-CE model having the highest return curve. The results suggest that investor sentiment has a
positive effect on stock investment returns and that composite investor sentiment has a
more pronounced effect than individual investor sentiment.</p>
      <p>After analyzing the model's investment return performance, the performance of different trading
strategies is measured using the Sharpe ratio, which combines investment return and investment risk
and describes the excess return that an investor can earn per unit of risk taken.</p>
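      <p>For reference, a Sharpe ratio of this kind can be computed from a series of periodic strategy returns as sketched below. The text does not state the risk-free rate or whether the ratio is annualized, so both are exposed as hypothetical parameters, and the sample standard deviation is used as the risk measure by assumption.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=None):
    """Mean excess return divided by the sample standard deviation of excess returns.

    returns: periodic (for example daily) strategy returns, at least two values.
    risk_free_rate: per-period risk-free rate (assumed 0 when not given).
    periods_per_year: if set, the ratio is annualized by sqrt(periods_per_year).
    """
    excess = [r - risk_free_rate for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero variance")
    ratio = statistics.mean(excess) / spread
    if periods_per_year is not None:
        ratio *= math.sqrt(periods_per_year)
    return ratio


if __name__ == "__main__":
    daily_returns = [0.010, -0.004, 0.006, 0.002, -0.001]  # made-up values
    print(sharpe_ratio(daily_returns, periods_per_year=252))
```
]]></preformat>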
      <p>As can be seen from Tab.1, for Sany and Gree Electric, whose share prices keep falling
and whose trades are loss-making, the DQN-CE strategy reduces losses while reducing trading risk. For
UNIS, whose stock price is in an uptrend, the trading return using the DQN-CE strategy is 35.52%, and
the trading risk is also the lowest. As for China Merchants Bank, which has a volatile stock price, using
investor sentiment neither optimizes the trading strategy nor improves investment returns; the
traditional B&amp;H strategy would have generated higher investment returns. Compared to the DQN-SE
strategy, the DQN-CE strategy performed better on all four stocks, with an average
improvement in returns of about 8.16%.</p>
      <p>From the experimental results, the DQN-CE strategy performs better compared to DQN-SE. The
DQN-CE strategy has better performance on stocks with a clear trend of stock price changes, so the
DQN-CE strategy is more suitable for stocks with a clear trend of movements. The DQN-CE strategy
is less applicable to stocks whose stock prices fluctuate in a range with no obvious trend.</p>
    </sec>
    <sec id="sec-8">
      <title>4. Conclusion</title>
      <p>This paper combines two types of investor sentiment into a composite investor sentiment score to
measure investor sentiment in the stock market, simulates stock trading through the reinforcement
learning DQN algorithm, and conducts an empirical analysis using real data from four stocks. The results
show that: (1) for stocks with a more obvious stock price trend, investor sentiment analysis is
helpful in optimizing stock trading strategies and reducing investment risks; (2) compared to
individual investor sentiment alone, composite investor sentiment is more conducive to optimizing trading
strategies and improving investment returns.</p>
    </sec>
    <sec id="sec-9">
      <title>5. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Fisher</surname>
            <given-names>Kenneth L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Statman</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>Consumer Confidence and Stock Returns</article-title>
          [J].
          <source>The Journal of Portfolio Management</source>
          ,
          <year>2003</year>
          ,
          <volume>30</volume>
          (
          <issue>1</issue>
          ):
          <fpage>115</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Schmeling</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>Investor Sentiment and Stock Returns: Some International Evidence</article-title>
          [J].
          <source>Journal of Empirical Finance</source>
          ,
          <year>2009</year>
          ,
          <volume>16</volume>
          (
          <issue>3</issue>
          ):
          <fpage>394</fpage>
          -
          <lpage>408</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Gao</surname>
            <given-names>L</given-names>
          </string-name>
          .
          <article-title>Equity Transfers and Market Reactions</article-title>
          [J].
          <source>Journal of Emerging Market Finance</source>
          ,
          <year>2008</year>
          ,
          <volume>7</volume>
          (
          <issue>3</issue>
          ):
          <fpage>293</fpage>
          -
          <lpage>308</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Baker</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wurgler</surname>
            <given-names>J</given-names>
          </string-name>
          .
          <article-title>Investor Sentiment and the Cross-Section of Stock Returns</article-title>
          [J].
          <source>The Journal of Finance</source>
          ,
          <year>2006</year>
          ,
          <volume>61</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1645</fpage>
          -
          <lpage>1680</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Liu</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wen</surname>
            <given-names>YL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            <given-names>JL</given-names>
          </string-name>
          .
          <article-title>Corporate governance, investor sentiment and surplus response</article-title>
          [J].
          <source>Statistics and Decision Making</source>
          ,
          <year>2020</year>
          ,
          <volume>36</volume>
          (
          <issue>07</issue>
          ):
          <fpage>154</fpage>
          -
          <lpage>158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Smailović</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grčar</surname>
            <given-names>M</given-names>
          </string-name>
          , et al.
          <article-title>Stream-based Active Learning for Sentiment Analysis in the Financial Domain</article-title>
          [J].
          <source>Information Sciences</source>
          ,
          <year>2014</year>
          ,
          <volume>285</volume>
          :
          <fpage>181</fpage>
          -
          <lpage>203</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Bollen</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mao</surname>
            <given-names>H</given-names>
          </string-name>
          , et al.
          <article-title>Twitter Mood Predicts the Stock Market</article-title>
          [J].
          <source>Journal of Computational Science</source>
          ,
          <year>2011</year>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Li</surname>
            <given-names>Q</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            <given-names>Q</given-names>
          </string-name>
          , et al.
          <article-title>Media-aware Quantitative Trading Based on Public Web information</article-title>
          [J].
          <source>Decision Support Systems</source>
          ,
          <year>2014</year>
          ,
          <volume>61</volume>
          :
          <fpage>93</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Li</surname>
            <given-names>Q</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>TJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>P</given-names>
          </string-name>
          , et al.
          <article-title>The Effect of News and Public Mood on Stock Movements</article-title>
          [J].
          <source>Information Sciences</source>
          ,
          <year>2014</year>
          ,
          <volume>278</volume>
          :
          <fpage>826</fpage>
          -
          <lpage>840</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Picasso</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Merello</surname>
            <given-names>S</given-names>
          </string-name>
          , et al.
          <article-title>Technical Analysis and Sentiment Embeddings for Market Trend Prediction</article-title>
          [J].
          <source>Expert Systems with Applications</source>
          ,
          <year>2019</year>
          ,
          <volume>135</volume>
          :
          <fpage>60</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Xiong</surname>
            <given-names>ZR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            <given-names>XY</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhong</surname>
            <given-names>S</given-names>
          </string-name>
          , et al.
          <article-title>Practical Deep Reinforcement Learning Approach for Stock Trading</article-title>
          [J].
          <source>CoRR</source>
          ,
          <year>2018</year>
          , abs/1811.07522.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Carta</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferreira</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Podda</surname>
            <given-names>AS</given-names>
          </string-name>
          , et al.
          <article-title>Multi-DQN: An Ensemble of Deep Q-learning Agents for Stock Market Forecasting</article-title>
          [J].
          <source>Expert Systems with Applications</source>
          ,
          <year>2021</year>
          ,
          <volume>164</volume>
          :
          <fpage>113820</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Xu</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            <given-names>YK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xing</surname>
            <given-names>CH</given-names>
          </string-name>
          .
          <article-title>Research on Financial Trading Algorithms Based on Deep Reinforcement Learning</article-title>
          [J].
          <source>Computer Engineering and Applications</source>
          ,
          <year>2022</year>
          ,
          <volume>58</volume>
          (
          <issue>07</issue>
          ):
          <fpage>276</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Koratamaddi</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wadhwani</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            <given-names>M</given-names>
          </string-name>
          , et al.
          <article-title>Market Sentiment-aware Deep Reinforcement Learning Approach for Stock Portfolio Allocation</article-title>
          [J].
          <source>Engineering Science and Technology, an International Journal</source>
          ,
          <year>2021</year>
          ,
          <volume>24</volume>
          (
          <issue>4</issue>
          ):
          <fpage>848</fpage>
          -
          <lpage>859</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>