=Paper=
{{Paper
|id=Vol-2621/CIRCLE20_37
|storemode=property
|title=Event Detection and Time Series Alignment to Improve Stock Market Forecasting
|pdfUrl=https://ceur-ws.org/Vol-2621/CIRCLE20_37.pdf
|volume=Vol-2621
|authors=Elliot Maitre, Zakaria Chemli,Max Chevalier,Bernard Dousset,Jean-Philippe Gitto,Olivier Teste
|dblpUrl=https://dblp.org/rec/conf/circle/MaitreCCDGT20
}}
==Event Detection and Time Series Alignment to Improve Stock Market Forecasting==
Elliot Maître, Institut de Recherche en Informatique de Toulouse / Scalian, Toulouse, France, elliot.maitre@irit.fr
Zakaria Chemli, Scalian, Paris, France, zakaria.chemli@scalian.com
Max Chevalier, Institut de Recherche en Informatique de Toulouse, Toulouse, France, max.chevalier@irit.fr
Bernard Dousset, Institut de Recherche en Informatique de Toulouse, Toulouse, France, bernard.dousset@irit.fr
Jean-Philippe Gitto, Scalian, Blagnac, France, jean-philippe.gitto@scalian.com
Olivier Teste, Institut de Recherche en Informatique de Toulouse, Toulouse, France, olivier.teste@irit.fr

ABSTRACT
Buying commodities is a critical issue for multiple industries because the variations of stock prices are induced not only by multiple economic parameters but also by external events. Raw material buyers must keep track of information in numerous fields, which constitutes a major challenge considering the exponential growth of online data. To tackle this issue, we propose an event detection approach to assist them in their anticipation process. Indeed, a lot of contextual information is contained in text, and exploiting it can improve one's anticipation ability. Thus, we develop a framework of event detection and qualification, then we quantify the impact of these events on the stock market to help buyers in their anticipation process. In this paper, we first introduce our context, then explain the scope of our work and our goals. After detailing the related work, we present our proposition, conclude and propose some future work possibilities.

CCS CONCEPTS
• Information systems → Data management systems; Information retrieval; • Computing methodologies → Natural language processing.

KEYWORDS
Event detection, text analysis, NLP, neural networks, time series, commodities

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Time series play a major role in several industrial fields, such as energy [1], transport [29], economy [11] or finance [28]. Being able to accurately forecast time series is a major asset for companies that need to anticipate the modeled phenomenon. In commodities buying, the stock market is described by time series and is particularly volatile, making its forecasting both a strategic and a challenging task [7], [10]. Classic stock forecasting methods like [15] or [24] are usually based on economic data, such as currencies, indices or futures, but most of them do not take into account textual data, which can contain valuable information. Improving time series forecasting using textual information is a challenging research issue [30].

In order to extract text data, multiple sources can be considered. An important one is micro-blogging. Several studies showed the predictive power of such media [23]. Sentiment analysis on Twitter can be helpful [2], the activity on social networks can be correlated with variations of the stock [26], and Twitter data can be used to forecast polls that are then used to interpret stock variations [22]. Specialized financial websites, such as Seeking Alpha, where communities of traders share their insights about the stock market, also contain meaningful information for stock market forecasting [5]. Thus, multiple sources of information like micro-blogging and specialized community websites can be combined to improve stock market forecasting.

Leveraging the expertise of several buyers via multiple interviews, we observed that they base their decisions on events happening in the real world, as reported by newspapers and social networks. Hence, given that the stock market reacts to news and events [8], we will particularly focus on event detection in text. Indeed, some periods are more intense than others [4] and are considered more important. These periods, characterized by some events, carry more information than other periods. Being able to detect these events and quantify their impact constitutes a major asset for buyers and traders. It is a difficult task, as illustrated by the impact on the stock market of the Covid-19 outbreak, which was widely discussed but largely underestimated. With adapted tools, one could have anticipated this crisis and behaved accordingly in order to mitigate the impact.

Our research aims at providing a tool leveraging information contained in text data, especially events, in order to assist people in their time series anticipation process, i.e. commodities buyers in our context. In this paper we focus on the event detection step. We first introduce our general work, then we focus on the related work about event detection in text. Afterwards, we develop our proposal.

2 OVERVIEW OF OUR PROPOSAL
The task of commodities price forecasting is particularly complex due to the tremendous number of parameters that influence the
variations of the stock. To bring more contextual information to the buyers and to our model, we want to combine time series with text information. This is not a straightforward process and it needs to be broken down into sub-tasks. Hence, our work is articulated around three major steps, as illustrated by Figure 1:
(1) Time series analysis to find coherent temporal areas,
(2) Temporal event extraction,
(3) Events and time-series alignment.

Figure 1: General approach

While these steps are mutually dependent, it is also possible to treat them separately. Each of them constitutes a scientific challenge and thus will be developed separately [20], [9], [14], [25], [24], [15]. In the rest of this paper, we particularly focus on part (2), which is the part we are currently working on, and give insights about (3), which is the next step of our work. Part (1) is currently not in the scope of this work; we plan to use existing approaches to tackle this issue.

3 RELATED WORK
There are different approaches to perform event detection in text. The two principal ones are topic modeling and event trigger detection. The former is a statistical approach while the latter is based on word classification.

3.1 Event trigger based approaches
The event trigger based approach is a classification method which consists in classifying words into event categories. Some words, named trigger-words, are supposed to trigger the event in the sentence and carry its meaning. Detecting and classifying those words hence allows one to understand whether a sentence depicts an event. ACE 2005 [12] is the reference dataset for this task and has been studied multiple times [20], [9], [14]. According to the ACE 2005 annotation guideline, in the sentence "A police officer was killed in New Jersey today", an event detection system should be able to recognize the word "killed" as a trigger for the event "Die". Currently, the state of the art for this task is achieved by using neural networks, and several approaches have been proposed on this basis. Nguyen and Grishman introduced in [20] a CNN-based approach to detect these triggers. In [9], the authors improve this work by adding a Bi-LSTM to the CNN in order to include sentence context in the detection. The authors of [14] propose a self-regulating GAN to perform the detection. In [18], the authors include even more context through a document-scale approach.

3.2 Topic modeling approaches
While the former approach is mostly based on semantic and syntactic properties, topic modeling approaches are statistical approaches. The authors of [27] propose to use Twitter users as human sensors to detect earthquake occurrences in real time. They use keywords to detect these target events and probabilistic models to detect the location of the events. Weng and Lee [31] analyze the wavelet signal of words in tweets in order to filter trivial words, and cluster words to detect events. In [17], the authors analyze daily topics on Twitter via Latent Dirichlet Allocation (LDA) and then determine the similarity between daily topics. They detect bumps in word usage and then cluster topics into "eventy topics". The authors of [21] propose a sub-event detection technique using topic modeling. This technique detects sub-events linked to an event and assigns a label to these sub-events. In [13], the authors propose a real-time framework to detect minor and major events on Twitter. The first module of the framework detects events and then the second module clusters these events.

Thus, event trigger based approaches tend to exploit the power of deep neural networks while topic modeling approaches are based on the frequency of words and on what is discussed on social networks. We argue that combining the assets of each technique could be
an interesting objective. The power of representation brought by neural networks is complementary to the detection approach of topic modeling.

4 OUR PROPOSAL: EVENT DETECTION COMBINING TOPIC MODELING AND NEURAL MODELS
Several constraints, such as the influence of possibly unknown parameters and the real-time nature, arise from the definition of the stock market. To predict future stock, one must exploit historical data but also real-time data. Hence, our framework must be applicable to data streams such as the Twitter stream. Moreover, some events may not be comparable to past events, so the classification must be able to handle and assign labels to unknown classes. However, we do not aim at real-time commodities trading; we want to assist buyers in their daily buying decisions. We only want our solution to be applicable in a real-time context, i.e. with a granularity sufficient to help buyers in their daily transactions.

4.1 Motivations
Topic-modeling approaches correspond to our prerequisites, but some of them are not adapted to data streams or do not work with unknown classes. Recent work which satisfies our constraints fails to exploit the properties of the language and is only based on a probabilistic approach linked with word occurrences.
Neural based approaches, such as the methods used in the trigger-based approaches, are powerful for exploiting patterns discovered in past data. Moreover, they bring more information by leveraging semantic and syntactic information, with methods such as word and sentence embeddings.
Our goal is to exploit this information to improve the quality of event representation. We think that these approaches are complementary and we assert that combining them will allow us to leverage the time and frequency aspects derived from topic modeling and the representation power of neural networks, in order to optimize event classification.

4.2 Our method
To do so, we propose a novel approach based on word and sentence embeddings. The idea behind this method is to leverage the geometric power of these methods. Using the representation obtained, similar documents should have similar representations in the embedding space. By comparing the distance between documents, we will be able to create clusters of documents. Each cluster corresponds to an event. Some events may be related, and clusters of similar events might be regrouped in an event cluster. This event cluster represents a class of events, such as sports events or geopolitical events. Hence, unknown events can be assimilated to events in the same event cluster. We will order documents by their time of appearance, so we can adapt to the real-world context we want to apply this method to, i.e. commodities stock estimation using event detection in a text data stream.

Our proposition is articulated as follows:
(1) Text data is extracted from sources previously selected by buyers, such as trusted Twitter users, in order to gather text written in regular English and focused on sharing important information. Indeed, most of the content on the internet is created by a few users.
(2) In order to have an exploitable event representation, we embed the content, using word embedding and sentence embedding.
(3) The embedded content is clustered, leveraging the amount of information the embeddings bring. This can be done by placing the embedded content on vertices of a graph and creating an edge between each pair of vertices, weighted by the distance between the two embeddings. If the distance is above a certain threshold, the edge is removed in order to create clusters of related contents.
(4) The clusters are labeled by determining a representative document. An example of a representative document is the document with the minimum average distance to the other tweets of the cluster.
Thus, the clusters obtained are expected to be of great quality thanks to a better representation, allowing a better identification and classification of events. These detected events will have two usages: they will be used in the next steps in order to estimate the variations of the time series, and they will also be given to the buyers in order to help them make their decisions, alongside our time series estimation. Since the tweets are extracted from the Twitter stream, we will keep them in their order of appearance, which allows us to take time into account and adapt to the type of application we want.

4.3 Pros and cons
This method brings more information than a regular topic modeling approach, leveraging the representation power of neural based approaches. It allows us to consider the documents in a time-ordered manner, which is not the case in most classification problems. This makes it suitable for time-based applications such as ours.
However, the efficiency of such a model for unknown events is not certain. Indeed, neural networks sometimes fail to generalize correctly. Handling an event containing too much novelty might be misleading for some models. The time aspect may also have some impact on the efficiency of the model.
Moreover, neural based approaches require annotated data, which is not always available, especially in contexts such as Twitter where the amount of data is huge. This problem has been considered in recent work, notably in [19] where the authors propose a weakly-supervised approach to limit annotation time. The problem of unknown classes is not appropriately handled by these approaches. Detecting novelty without labeling it could be an insight in order to detect change in the time series, but in the meantime, we want to focus on a method allowing us to label unknown events.

Thus, this method helps us bring more information in order to fulfill our classification objective and to adapt to our time-dependent context; however, it may raise several issues that we have not addressed yet.
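
The four steps of Section 4.2 can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors stand in for real word or sentence embeddings, the names `detect_events`, `threshold_clusters` and `representative` are hypothetical, and the sketch assumes that edges between distant documents are the ones dropped, so that each connected component of the remaining graph is one event.

```python
import math

def pairwise_distances(embeddings):
    """Euclidean distance between every pair of embedding vectors."""
    return [[math.dist(a, b) for b in embeddings] for a in embeddings]

def threshold_clusters(dist, threshold):
    """Connected components of the graph keeping only edges <= threshold."""
    n = len(dist)
    labels = [None] * n
    next_label = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first walk over the kept edges
            i = stack.pop()
            for j in range(n):
                if labels[j] is None and dist[i][j] <= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

def representative(dist, members):
    """Member with the smallest average distance to the other members."""
    if len(members) == 1:
        return members[0]

    def average_distance(i):
        return sum(dist[i][j] for j in members if j != i) / (len(members) - 1)

    return min(members, key=average_distance)

def detect_events(documents, threshold):
    """documents: list of (timestamp, text, embedding) tuples."""
    if not documents:
        return []
    # Keep the stream in order of appearance.
    ordered = sorted(documents, key=lambda doc: doc[0])
    dist = pairwise_distances([doc[2] for doc in ordered])
    labels = threshold_clusters(dist, threshold)
    events = []
    for label in range(max(labels) + 1):
        members = [i for i, l in enumerate(labels) if l == label]
        rep = representative(dist, members)
        events.append({
            "documents": [ordered[i][1] for i in members],
            "representative": ordered[rep][1],
        })
    return events

# Toy stream: hand-made 2-D "embeddings" (assumption, for illustration only).
toy_stream = [
    (3, "copper strike continues", (0.0, 0.0)),
    (1, "copper mine strike", (0.1, 0.0)),
    (2, "strike at copper mine", (0.2, 0.0)),
    (5, "port closed by storm", (5.0, 5.0)),
    (4, "storm closes port", (5.1, 5.0)),
]
events = detect_events(toy_stream, threshold=0.5)
```

On this toy stream the sketch yields two events, one about the strike and one about the storm. The threshold value and the distance function are free parameters here; the paper does not fix them.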

5 LINKING EVENTS AND TIME SERIES VARIATIONS TO ESTIMATE FUTURE TIME SERIES VARIATIONS
Following the idea of combining time series and text, the detected events will be fed to a generative adversarial network (GAN) along with time series data, to predict expected variations of the stock prices. Figure 2 illustrates the process we will describe. Our intuition is that the GAN will be able to link detected events and variations in the time series. A GAN is composed of two major parts: the generator and the discriminator. The generator tries to mimic the actual data and the discriminator tries to identify fake data produced by the generator. We want to produce time series estimations, so our solution is articulated as follows: the generator part of the GAN will produce time series estimations taking events as input. The discriminator will be fed with two inputs, the actual time series and the fake time series, which is generated by the generator. The objective for the generator is to be able to produce time series estimations that are close enough to reality to fool the discriminator. The discriminator's objective is to maximize its accuracy in differentiating fake and real input. Since the final output we want is a time series estimation, our general objective is to have a generator as optimized as possible. The discriminator is only used in the training loop, in order to give feedback to the generator and train it to produce valuable output. In order to give hints about the future time series variations, the generator will take as input the events we have previously detected, which are supposed to carry information that influences these variations. By training it properly, the generator will be able to extract information from the events and from the feedback of the discriminator. The feedback from the discriminator contains information about the time series, which is not directly available to the generator. Indeed, the final objective is to have a generator which is able to predict time series variations by only exploiting the events we detect.
To summarize, the GAN corresponds to the event-quantifying step and the event-time series alignment step. Its objective is to automatically extract information from the detected events it takes as input, and link it with the variations in the historical time series data.

Figure 2: GAN example

6 CONCLUSION
Considering the constraints induced by our context, namely detecting possibly unknown events in order to help buyers in their daily buying decisions, we deduced that a combination of topic-modeling approaches and neural based models is a promising method to complete our task. We propose to embed content using recent models, i.e. word and sentence embeddings, in order to produce a better clustering leveraging the representation power of these models and therefore have a better event classification.

7 FUTURE WORK
In [3], the authors temporalize word2vec to detect the most discussed topics during certain phases of the bitcoin time series. We would like to transpose this idea to our context, by detecting which events are activated during special phases of the commodities stock. Using time stamps of the documents, the idea is to determine which clusters of events are activated during a certain period of time and link them with stock variations. While using timestamps to order documents is not difficult, determining when an event is activated brings many more difficulties, such as tracking event evolution and detecting the end of an event. Another goal is to be able to directly link time series and events, with a method similar to [25]. Finally, Transformer architectures are currently revolutionising the NLP domain. We would like to be able to better represent events, leveraging the power of Transformer-based models such as BERT [6]. Wu et al. did something similar with news representation in [32]. Indeed, transformers are able to produce quality embeddings for both words and sentences and have proved their quality by outperforming static embedding techniques. A major drawback of transformer-based methods is their computation cost. Thus, the usage of distilled models such as TinyBERT [16] could be a solution.
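
The adversarial setup described in Section 5 can be written down under some assumptions. The paper does not state a loss function, so the formulation below simply adapts the standard GAN minimax objective: $e$ denotes the representation of the detected events, $x$ an observed time series window, $z$ an optional noise vector, $G$ the generator and $D$ the discriminator. All of these symbols are introduced here for illustration only.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{e \sim p_{\text{events}},\, z \sim p_{z}}
    \bigl[ \log \bigl( 1 - D\bigl( G(e, z) \bigr) \bigr) \bigr]
```

In this reading, the discriminator only ever sees time series, real ($x$) or generated ($G(e, z)$), while the generator only ever sees events, which matches the description that information about the series reaches the generator solely through the discriminator's feedback.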

REFERENCES
[1] John Asafu-Adjaye. 2000. The Relationship between Energy Consumption, Energy Prices and Economic Growth: Time Series Evidence from Asian Developing Countries. Energy Economics 22 (12 2000), 615–625. https://doi.org/10.1016/S0140-9883(00)00050-5
[2] Johan Bollen, Huina Mao, and Xiao-Jun Zeng. 2010. Twitter mood predicts the stock market. CoRR abs/1010.3003 (2010). arXiv:1010.3003 http://arxiv.org/abs/1010.3003
[3] Andrew Burnie and Emine Yilmaz. 2019. An Analysis of the Change in Discussions on Social Media with Bitcoin Price. 889–892. https://doi.org/10.1145/3331184.3331304
[4] Patrick Champagne. 2000. L’événement comme enjeu. (2000). https://doi.org/10.3406/reso.2000.2231
[5] Hailiang Chen, Prabuddha De, Yu Hu, and Byoung-Hyoun Hwang. 2013. Wisdom of Crowds: The Value of Stock Opinions Transmitted Through Social Media. Review of Financial Studies (12 2013). https://doi.org/10.2139/ssrn.1807265
[6] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171–4186. https://doi.org/10.18653/v1/N19-1423
[7] Claude B. Erb and Campbell R. Harvey. 2006. The Strategic and Tactical Value of Commodity Futures. Financial Analysts Journal 62, 2 (2006), 69–97. https://doi.org/10.2469/faj.v62.n2.4084
[8] Eugene F. Fama. 1965. The Behavior of Stock-Market Prices. The Journal of Business 38, 1 (1965), 34–105. http://www.jstor.org/stable/2350752
[9] Xiaocheng Feng, Lifu Huang, Duyu Tang, Heng Ji, Bing Qin, and Ting Liu. 2016. A Language-Independent Neural Network for Event Detection. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Berlin, Germany, 66–71. https://doi.org/10.18653/v1/P16-2011
[10] Gary Gereffi. 1999. International trade and industrial upgrading in the apparel commodity chain. Journal of International Economics 48, 1 (June 1999), 37–70. https://ideas.repec.org/a/eee/inecon/v48y1999i1p37-70.html
[11] Clive Granger and Paul Newbold. 1986. Forecasting Economic Time Series (2 ed.). Elsevier. https://EconPapers.repec.org/RePEc:eee:monogr:9780122951831
[12] Ralph Grishman, David Westbrook, and Adam Meyers. 2005. NYU’s English ACE 2005 system description. Proceedings of ACE 2005 Evaluation Workshop. Journal on Satisfiability 51 (01 2005).
[13] Mahmud Hasan, Mehmet A. Orgun, and Rolf Schwitter. 2019. Real-time event detection from the Twitter data stream using the TwitterNews+ framework. Information Processing and Management 56, 3 (5 2019), 1146–1165. https://doi.org/10.1016/j.ipm.2018.03.001
[14] Yu Hong, Wenxuan Zhou, Jingli Zhang, Guodong Zhou, and Qiaoming Zhu. 2018. Self-regulation: Employing a Generative Adversarial Network to Improve Event Detection. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Melbourne, Australia, 515–526. https://doi.org/10.18653/v1/P18-1048
[15] Wei Huang, Yoshiteru Nakamori, and Shou-Yang Wang. 2005. Forecasting Stock Market Movement Direction with Support Vector Machine. Comput. Oper. Res. 32, 10 (Oct. 2005), 2513–2522. https://doi.org/10.1016/j.cor.2004.03.016
[16] Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, and Qun Liu. 2020. TinyBERT: Distilling BERT for Natural Language Understanding. https://openreview.net/forum?id=rJx0Q6EFPB
[17] Nathan Keane, Connie Yee, and Liang Zhou. 2015. Using Topic Modeling and Similarity Thresholds to Detect Events. In Proceedings of the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation. Association for Computational Linguistics, Denver, Colorado, 34–42. https://doi.org/10.3115/v1/W15-0805
[18] Dorian Kodelja, Romaric Besançon, and Olivier Ferret. 2019. Exploiting a More Global Context for Event Detection Through Bootstrapping. 763–770. https://doi.org/10.1007/978-3-030-15712-8_51
[19] Shulin Liu, Yang Li, Feng Zhang, Tao Yang, and Xinpeng Zhou. 2019. Event Detection without Triggers. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 735–744. https://doi.org/10.18653/v1/N19-1080
[20] Thien Huu Nguyen and Ralph Grishman. 2015. Event Detection and Domain Adaptation with Convolutional Neural Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Association for Computational Linguistics, Beijing, China, 365–371. https://doi.org/10.3115/v1/P15-2060
[21] Diogo Nolasco and Jonice Oliveira. 2019. Subevents detection through topic modeling in social media posts. Future Generation Comp. Syst. 93 (2019), 290–303.
[22] Brendan O’Connor, Ramnath Balasubramanyan, Bryan Routledge, and Noah Smith. 2010. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. International AAAI Conference on Weblogs and Social Media 11.
[23] Nuno Oliveira, Paulo Cortez, and Nelson Areal. 2016. The impact of microblogging data for stock market prediction: Using Twitter to predict returns, volatility, trading volume and survey sentiment indices. Expert Systems with Applications 73 (12 2016). https://doi.org/10.1016/j.eswa.2016.12.036
[24] Ping-Feng Pai and Chih-Sheng Lin. 2005. A hybrid ARIMA and support vector machines model in stock price forecasting. Omega 33 (12 2005), 497–505. https://doi.org/10.1016/j.omega.2004.07.024
[25] Filipe Rodrigues, Ioulia Markou, and Francisco Pereira. 2018. Combining time-series and textual data for taxi demand prediction in event areas: A deep learning approach. Information Fusion 49 (07 2018). https://doi.org/10.1016/j.inffus.2018.07.007
[26] Eduardo Ruiz, Vagelis Hristidis, Carlos Castillo, Aristides Gionis, and Alejandro Jaimes. 2012. Correlating Financial Time Series with Micro-Blogging Activity. WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining, 513–522. https://doi.org/10.1145/2124295.2124358
[27] Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. 2010. Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors. Proceedings of the 19th International Conference on World Wide Web, WWW ’10, 851–860. https://doi.org/10.1145/1772690.1772777
[28] Ruey S. Tsay. 2005. Analysis of financial time series (2nd ed.). Wiley-Interscience, Hoboken, NJ. http://gso.gbv.de/DB=2.1/CMD?ACT=SRCHA&SRT=YOP&IKT=1016&TRM=ppn+483463442&sourceid=fbw_bibsonomy
[29] Mascha C. van der Voort, Mark Dougherty, M.S. Dougherty, and Susan Watson. 1996. Combining Kohonen maps with Arima time series models to forecast traffic flow. Transportation research. Part C: Emerging technologies 4, 5 (1996), 307–318. https://doi.org/10.1016/S0968-090X(97)82903-8
[30] Baohua Wang, Hejiao Huang, and Xiaolong Wang. 2012. A novel text mining approach to financial time series forecasting. Neurocomputing 83 (04 2012), 136–145. https://doi.org/10.1016/j.neucom.2011.12.013
[31] Jianshu Weng and Bu-Sung Lee. 2011. Event Detection in Twitter. https://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2767
[32] Chuhan Wu, Fangzhao Wu, Mingxiao An, Yongfeng Huang, and Xing Xie. 2019. Neural News Recommendation with Topic-Aware News Representation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 1154–1159. https://doi.org/10.18653/v1/P19-1110