=Paper=
{{Paper
|id=Vol-2145/p21
|storemode=property
|title=CPU and GPU Implementations for High Frequency Trading in Algorithmic Finance
|pdfUrl=https://ceur-ws.org/Vol-2145/p21.pdf
|volume=Vol-2145
|authors=Mantas Vaitonis,Saulius Masteika
}}
==CPU and GPU Implementations for High Frequency Trading in Algorithmic Finance==
Mantas Vaitonis, Vilnius University Kaunas Faculty, Muitinės street 8, LT-44280 Kaunas, Lithuania, mantas.vaitonis@knf.vu.lt

Saulius Masteika, Vilnius University Kaunas Faculty, Muitinės street 8, LT-44280 Kaunas, Lithuania, Saulius.masteika@knf.vu.lt
Abstract: Today, algorithmic trading and High Frequency Trading (HFT) account for a dominant part of overall trading volume in financial markets. Trade execution times have shrunk from days to microseconds and nanoseconds. A modern GPU allows hundreds of operations to be performed in parallel, leaving the CPU free to execute other jobs. The main objective of this research was to test whether, and quantify by how much, GPUs can speed up the calculations of HFT statistical arbitrage algorithms. MATLAB software was used for the GPU application and computations. The statistical arbitrage pair trading algorithm was parallelized in order to adapt it to the GPU. Effectiveness was measured by the time the CPU and the GPU spent processing historical data with the pair trading strategy. The final results of the research are presented and discussed in the paper. The results show an increase of up to 30% in computational speed when the statistical arbitrage algorithm is applied in HFT.

Keywords: high frequency trading; statistical arbitrage; GPU; high performance computing; parallel computing.

Copyright held by the author(s).

I. INTRODUCTION

Computational power requirements have continuously increased in fields such as computational physics and quantitative finance. One example is high-frequency trading (HFT), which focuses on automated trading decisions. All decisions to buy or sell a financial instrument are made by computer algorithms without human interaction. These algorithms analyze the incoming information received from the exchange system. This information may include new transactions with their prices and volumes and, in some systems, also order submission, order modification and order deletion events of other exchange members. If a trading algorithm decides to submit a buy or sell order to the exchange system, then within a few milliseconds this information is sent from the exchange member’s system to the central exchange server, which is responsible for matching offer and demand. The exchange server responds with a confirmation message. [6]

Trade execution times have shrunk from days to microseconds and even nanoseconds. This increase in speed brings with it a huge number of orders and order cancellations. Profit opportunities for high frequency traders are very time sensitive, and low latency in trade execution is of the utmost importance. Thus, HFT firms invest in hardware and high-speed connections and place their trading platforms close to stock market servers via co-location. One such hardware investment is the GPU. GPU architectures are a cost-effective alternative to traditional parallel processing machines. This change ushers in a new era in computing, which allows any modern personal computer to take advantage of parallel processing capabilities previously available only in specialized systems. [20]

Nowadays, standard computers come with sequential CPUs or with multicore CPUs, which allow a limited number of processes to be executed in parallel. On the other hand, the importance of graphics in most application domains pushed the industry into producing ad-hoc Graphical Processing Units (GPUs) to relieve the main CPU of the calculations required for graphics. What is important here is that this hardware is strongly parallel and may operate independently of the main CPU. A modern GPU, like those equipping most computers today, allows hundreds of operations to be performed in parallel, leaving the CPU free to execute other jobs. In particular, GPUs offer hundreds of processing cores, but these can be used simultaneously only to perform data-parallel computations. Moreover, GPUs usually have no direct access to the main memory and do not offer hardware-managed caches, two aspects that make memory management a critical factor to be carefully considered. [7]

With the increasing pervasiveness of parallel architectures such as multi-/many-core CPUs and GPUs, parallel programming has become not an alternative but rather a necessity for increasing software performance. [2]

Graphics processing units (GPUs) offer a new possibility for speeding up large-scale simulation of long-range interacting systems without sacrificing accuracy. A GPU is a powerful device that can process thousands of threads simultaneously with high memory bandwidth. Compared to a CPU, a GPU is designed with more transistors devoted to data processing rather than to data caching and flow control. It is suitable for the computation-intensive, data-parallel computations needed by time-sensitive high frequency traders. [5]

Although multi-threaded parallel CPU implementations are expected to run faster than their single-threaded counterparts, the overhead of creating, destroying, and synchronizing threads may be very
high. An alternative parallel computing platform is the GPU. Originally, it was developed for graphics applications. Due to their massive parallel processing capabilities, state-of-the-art GPUs are the leading computing devices for the most parallel and computationally intensive applications, such as high frequency trading algorithms. [3]

Our study demonstrates how the use of GPUs can bring impressive speedups to a statistical arbitrage trading algorithm, leaving the main CPU free to focus on the remaining aspects of the trading strategy. Several vendors have recently started offering toolkits to leverage the power of GPUs for general purpose programming. Unfortunately, they introduce a totally new model of computation, which requires algorithms to be fully redesigned. In this research MATLAB was used for GPU computing, which makes it easier to accelerate an application with GPUs than using C or Fortran. With the MATLAB language it is possible to take advantage of the CUDA GPU computing technology without having to learn the intricacies of GPU architectures or low-level GPU computing libraries.

In this paper, we investigate CPU and GPU implementations of the parallel pair trading algorithm. The main aim of this research is to explain the improved designs in detail and to report a performance comparison between the CPU and GPU implementations in terms of speed. The improvements suggested in the paper for the CPU and GPU implementations are, respectively, faster speed due to new memory access patterns and more flexibility due to a more efficient use of processors.

In order to take advantage of the CPU and the GPU it is necessary to parallelize the calculations. Effectiveness was measured by the time the CPU and the GPU spent processing historical data with the pair trading strategy. The strategy used was first studied by D. Herlemont in his paper about pairs trading [19]. This trading strategy was applied to high frequency data in previous research [24][32]; however, it was not used with a GPU. A number of functions of this trading algorithm can be parallelized, such as pair selection, trading signal detection, trading, and the profit/loss calculation for each trade. Thus, the algorithm had to be modified and parallelized in order to take advantage of the GPU. Importantly, not only the pairs trading strategy but also the method of pairs selection is introduced in this research.

The cointegration method was used for trading pairs selection. The pairs selection algorithm is based on the Augmented Dickey Fuller test, the Engle and Granger 2-step approach and the Johansen test. [12] Finally, a comparison of the statistical arbitrage trading strategy running on the CPU and later on the GPU is given.

The rest of the paper is organized as follows: theory and the problem statement are presented in Sections 1 and 2; the methodology, including the pairs trading strategy, the pairs selection algorithm and the speedup of the trading algorithm, is presented in Sections 3 and 4. The results and the summary of the research, followed by conclusions, are given in Section 5.

II. TRADING USING HARDWARE ACCELERATION

Hardware acceleration is achieved by utilizing specific hardware to gain higher computational performance than that provided by a general purpose CPU. Most devices intended for intense calculations include the Field-Programmable Gate Array (FPGA), IBM’s Cell Broadband Engine Architecture (Cell BE or, simply, Cell) and Graphics Processing Units (GPUs). Until recently the GPU remained on the fringes of HPC (high performance computing), mostly because of the steep learning curve caused by the fact that low-level graphics languages were the only way to program GPUs. Now, however, NVIDIA has come out with a new line of graphics cards, Tesla. [6]

One of the main features of NVIDIA GPUs is the ease of programmability made possible by CUDA (Compute Unified Device Architecture). CUDA provides the means to compile and run code for NVIDIA’s GPUs. With a gentle learning curve, CUDA allows developers to tap into the enormous computing power of GPUs, yielding high performance benefits. [8] As mentioned in the introduction, we use CUDA, which allows algorithms to be implemented in MATLAB with CUDA-specific extensions. CUDA issues and manages computations on a GPU as a data-parallel computing device. The graphics card architecture used in recent GPU generations is built around a scalable array of streaming multiprocessors. [8] When a program using CUDA extensions and running on the CPU invokes a GPU kernel, which is a synonym for a GPU function, many copies of this kernel, known as threads, are enumerated and distributed to the available multiprocessors, where their execution starts. [6]

Fig. 1. Visualization of a GPU multiprocessor with on-chip shared memory.

As shown in Fig. 1, each multiprocessor of the GPU device contains several local registers per processor and memory that is shared by all scalar processor cores in the multiprocessor. In order to reduce the number of multiprocessors involved, the slower global memory can be used, which is shared among all multiprocessors and is also accessible by the function running on the CPU. Note that the GPU’s global memory is still roughly 10 times faster than the current main memory of personal computers. However, each multiprocessor features only one double-precision processing core, so the theoretical peak performance is significantly reduced for double-precision operations. [8]
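The kernel-and-threads model described in this section can be mimicked on a CPU to make the idea concrete. The following is a minimal Python sketch, not CUDA code and not the implementation used in this research: the names `kernel` and `launch` are hypothetical, and a standard-library thread pool merely stands in for the GPU's multiprocessors. Each call of `kernel` handles exactly one array index, in the same way that each GPU thread handles one data element.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(thread_id, prices, out):
    """Body executed by one 'thread': it touches only its own element."""
    out[thread_id] = prices[thread_id] * 2.0


def launch(kernel_fn, n_threads, *args, workers=4):
    """Hypothetical launcher: enumerate thread ids and hand them to workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all calls and re-raises any worker exception.
        list(pool.map(lambda tid: kernel_fn(tid, *args), range(n_threads)))


prices = [3227.0, 3221.0, 3226.0, 3219.0]
doubled = [0.0] * len(prices)
launch(kernel, len(prices), prices, doubled)
```

Because no call reads or writes another call's element, the result does not depend on the order in which the workers run, which is the property that makes a computation data-parallel.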
III. STATISTICAL ARBITRAGE

Correlation is a statistical term that comes from linear regression analysis. It describes the strength of a relationship between two variables. The main idea of statistical arbitrage, or pairs trading, is to find a pair of financial instruments that are highly correlated. When a pair is found, a trader must look for changes in correlation followed by mean reversion to the trend of the pair, which creates a profit opportunity. This type of trading needs to identify a relationship between two financial instruments, figure out the direction of their relationship, and execute long and short positions based on the statistical data presented. Selecting a good pair for trading is the most important stage of a mean-reverting, market-neutral statistical arbitrage strategy. [26][34]

A. Pairs Trading Using Cointegration

The cointegration method uses a mathematical model developed by Engle and Granger [17], which has attracted considerable interest from economists over the last two decades. Cointegration states that, in some instances, although two given time series are non-stationary, a specific linear combination of the two is actually stationary. The two time series move together in a lockstep fashion. Cointegration can be described as follows: let <math>x_t</math> and <math>y_t</math> be two non-stationary time series. If there is a parameter <math>\beta</math> such that

<math>z_t = y_t - \beta x_t \qquad (1)</math>

is a stationary process, then <math>x_t</math> and <math>y_t</math> are cointegrated. This path-breaking approach emerged as a powerful tool for investigating common asset trends in multivariate time series. [25]

B. Data

The microsecond data for this research was provided by the Nanotick company. The futures contract data is from CME group, which consists of NYMEX, COMEX and CBOT. Nanotick provided five different commodity futures contracts: NG (natural gas), BZ (Brent crude oil), CL (crude oil), HO (NY Harbor ULSD) and RB (RBOB Gasoline). The data covers the period from 01-08-2015 to 31-08-2015.

After normalization, the microsecond commodity futures data consisted of 24957994 records. Once prepared, the data was applied to the statistical arbitrage trading strategy.

{| class="wikitable"
|+ TABLE I. MICROSECOND DATA EXAMPLE FOR NGF6 CONTRACT
! Receiving Date !! Receiving Time !! Symbol !! Asset !! Entry Type !! Entry Price
|-
| 20150809 || 17:00:00.869053009 || NGF6 || NG || A || 3227
|-
| 20150809 || 17:00:00.869053009 || NGF6 || NG || B || 3221
|-
| 20150809 || 17:00:00.930168164 || NGF6 || NG || A || 3226
|-
| 20150809 || 17:00:00.930168164 || NGF6 || NG || B || 3221
|-
| 20150809 || 17:00:01.017456320 || NGF6 || NG || A || 3226
|-
| 20150809 || 17:00:01.017456320 || NGF6 || NG || B || 3219
|-
| 20150809 || 17:00:01.059840559 || NGF6 || NG || A || 3227
|-
| 20150809 || 17:00:01.059840559 || NGF6 || NG || B || 3219
|-
| 20150809 || 17:00:01.156791713 || NGF6 || NG || A || 3238
|-
| 20150809 || 17:00:01.156791713 || NGF6 || NG || B || 3216
|-
| 20150809 || 17:00:01.204683812 || NGF6 || NG || A || 3238
|-
| 20150809 || 17:00:01.204683812 || NGF6 || NG || B || 3216
|-
| 20150809 || 17:00:01.205605232 || NGF6 || NG || A || 3238
|-
| 20150809 || 17:00:01.205605232 || NGF6 || NG || B || 3215
|-
| 20150809 || 17:00:01.206755867 || NGF6 || NG || A || 3238
|-
| 20150809 || 17:00:01.206755867 || NGF6 || NG || B || 3215
|-
| 20150809 || 17:00:01.207350519 || NGF6 || NG || A || 3231
|-
| 20150809 || 17:00:01.207350519 || NGF6 || NG || B || 3215
|-
| 20150809 || 17:00:01.208805474 || NGF6 || NG || A || 3231
|-
| 20150809 || 17:00:01.208805474 || NGF6 || NG || B || 3217
|-
| 20150809 || 17:00:01.224604710 || NGF6 || NG || A || 3233
|-
| 20150809 || 17:00:01.224604710 || NGF6 || NG || B || 3217
|}

IV. METHODOLOGY

The main purpose of pairs trading is to find two financial instruments that move together. Once such a pair is found, the strategy has to decide when to take long and short positions based on the trading rules. Following previous research, six main steps of the pairs trading strategy were identified:

1. Selection of the size of the trading and data normalization window;

2. Data normalization;

3. Selection of the correlated pair;

4. Definition of the trading rules;

5. Trading;

6. Assessment of the pairs trading strategy. [16][24][32]

Before the trading and data normalization window is selected, the strategy has to be trained. Thus, before trading starts, some data must be used for training. This data may be called out of sample data. All microsecond commodity futures data had to be divided into training and testing datasets. Dividing data into training and testing periods is referred to as the holdout method in statistical classification. [26] When selecting the training, or out of sample, period, it is important to choose the right window size: if the window is too big, the strategy may overtrain, and if it is too small, the strategy will not be able to notice abnormal behaviour. [30]
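The holdout split described above can be sketched as a sequence of index ranges. The following Python fragment is an illustration under stated assumptions rather than the code used in this research (which was written in MATLAB): the function name `holdout_windows` is hypothetical, timestamps are assumed to be sorted numbers in seconds, and the window lengths are free parameters.

```python
from bisect import bisect_left


def holdout_windows(timestamps, train_len, test_len):
    """Yield (train, test) index ranges over sorted timestamps (in seconds).

    Each range is a half-open (start, stop) pair of list indices. The testing
    window begins at the instant the training window ends, and the whole pair
    then slides forward by one testing window.
    """
    if not timestamps:
        return
    start = timestamps[0]
    end = timestamps[-1]
    while start + train_len + test_len <= end:
        split = start + train_len
        stop = split + test_len
        train = (bisect_left(timestamps, start), bisect_left(timestamps, split))
        test = (bisect_left(timestamps, split), bisect_left(timestamps, stop))
        yield train, test
        start += test_len


# Toy data: one observation per second for 10 seconds.
ticks = list(range(10))
windows = list(holdout_windows(ticks, train_len=4, test_len=2))
```

If the 5-minute out of sample period and the 20-second trading window reported in the data normalization subsection are taken as the training and testing lengths, the call would use `train_len=300` and `test_len=20`.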
Finally, the testing period follows immediately after the training period.

A. Data Normalization

After the microsecond data for the commodity futures contracts was received, the next step was to normalize it so that it could be used in our test environment. The first task was to bring the time stamp data together. For example, if one contract has a time stamp of 17:00:00.869053009 and another futures contract has a time stamp of 17:00:00.825207610, both time stamps have to appear in both contracts. In our case, every distinct time stamp had to appear in all five futures contracts.

If a contract is filled with a new time stamp, the price for that futures contract is set to the price at its last time stamp. It is assumed that the price did not change during that time. In this way, all time stamps of the futures contracts are normalized for nanosecond and microsecond data. [24][32]

Once the time stamps for all futures contracts were obtained, the out of sample, normalization and trading periods were defined. All parameters were kept the same throughout: the out of sample period was 5 minutes, and the normalization and trading periods were equal, i.e., 20 seconds for each trading window. One more period was selected for closing the positions, which was 20 seconds as well.

After these parameters of the trading strategy are set, price normalization follows. To normalize each price of a commodity futures contract P(i,t), we calculate the empirical mean µ(i,t) and standard deviation σ(i,t) for the selected normalization period and then apply the following equation [30]:

<math>p(i,t) = \frac{P(i,t) - \mu(i,t)}{\sigma(i,t)} \qquad (2)</math>

The value p(i,t) is the normalized price of commodity futures contract i at time t. [30]

B. Pair Selection

One of the two main parts of this trading methodology is the pairs selection algorithm, which is essentially based on cointegration testing. The cointegration method involves the following steps:

1. Identify futures contract pairs that could potentially be cointegrated;

2. Once the potential pairs are identified, verify the hypothesis that the futures contract pairs are indeed cointegrated, based on information from historical data;

3. Examine the cointegrated pairs to determine whether they can be traded on. [33]

The objective of this phase is to identify pairs whose linear combination exhibits a significant predictable component that is uncorrelated with the underlying movements of the market as a whole. With this aim, we first test the spread of the pair prices for stationarity. In this research, this is done by checking whether the data series are integrated of the same order using the Augmented Dickey Fuller (ADF) test, which is an extended version of the Dickey Fuller test. [12] Once the ADF test is passed, cointegration tests are performed on all possible combinations of pairs. To test for cointegration we adopted the Engle and Granger 2-step approach and the Johansen test. This methodology is based on Caldeira and Moura. [12]

The Johansen test determines the number of cointegrating relations and also implements a multivariate extension of the 2-step Engle and Granger procedure. [12]

All of the procedures are implemented in MATLAB. The second part of the algorithm creates trading signals for the detected cointegrating relations based on predefined investment decision rules.

V. EXPERIMENTAL SETUP

The two main criteria for algorithmic trading are speed, that is, the speed with which the same set of computations can be performed on multiple sets of data, and programmability. By these criteria, general-purpose hardware, such as an Intel Central Processing Unit (CPU), is not suitable. The CPU is designed to execute commands in a linear fashion, whereas the task at hand benefits most from parallelization, because the same calculations have to be performed on multiple data; this is where parallelization and hardware acceleration come into play.

In our research the CPU was an Intel i5-3230M 2.6 GHz with two cores (2 MATLAB workers) and the GPU was a GeForce 710M with 96 CUDA cores. First, we applied the pair trading strategy to the CPU only. Using MATLAB's “parfor” function, which distributes loop iterations across the CPU workers, we identified the calculations that could be parallelized. During this stage we sped up the strategy to maximize its performance using the CPU only.

For the GPU we used the gpuArray and arrayfun GPU functions together with parfor, which works on the CPU. gpuArray creates an array on the GPU, and arrayfun applies a function to each element of an array. When gpuArray is used with arrayfun, the function is actually evaluated on the GPU, not on the CPU. Any required data not already on the GPU is moved to GPU memory, and the MATLAB function passed in for evaluation is compiled for the GPU and then executed on the GPU. All output arguments are returned as gpuArray objects. [10][11]

In our experiment we parallelized pair detection, the detection of buy/sell signals, trading and the profit calculation. It was possible to parallelize these functions because the strategy must perform the same calculations in every iteration. Rather than waiting for one function to finish, we can perform multiple calculations with multiple functions at the same time.

VI. EXPERIMENTAL RESULTS

The overall performance of the pair trading strategy was measured by the profit it generated. During the experiment transaction costs were set to zero, and the amount invested in each trade was kept the same, namely 10. The profit/loss was measured as the percentage change of the overall difference at the end of each trading day. More detailed information is presented in Fig. 2.
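The per-day speed improvement discussed in this section can be checked directly against the run times in Table II. The paper does not state the formula, so the sketch below assumes that the improvement is the relative reduction in run time, (CPU time - GPU time) / CPU time; the function name `time_reduction_pct` is hypothetical, and Python is used only for illustration. Three rows of Table II serve as input.

```python
def time_reduction_pct(cpu_seconds, gpu_seconds):
    """Relative run-time reduction of the GPU run versus the CPU run, in percent."""
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return 100.0 * (cpu_seconds - gpu_seconds) / cpu_seconds


# (date, CPU seconds, GPU seconds) copied from Table II.
runs = [
    ("2015-08-03", 2991.80, 2081.60),
    ("2015-08-04", 2208.10, 1400.50),
    ("2015-08-25", 3960.20, 3466.10),
]

reductions = {day: time_reduction_pct(cpu, gpu) for day, cpu, gpu in runs}
for day, pct in reductions.items():
    print(f"{day}: {pct:.1f}% less time on the GPU")
```

Under this assumption the three days give roughly 30.4%, 36.6% and 12.5%, which is consistent with the 12% to 36% range reported for Fig. 3.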
Fig. 2. Strategy performance for each day, measured by the profit generated

Figure 2 shows the daily profits of the HFT trading algorithm and is in line with the results disclosed by the High Frequency Trading market leader Virtu Financial, Inc., which had only one losing trading day out of 1237 days [14]. The chart in Figure 2 illustrates the daily results of an algorithm based on a statistical arbitrage HFT system. The less profitable days occur because of fewer trades, due to fewer trade signals, rather than because of fluctuations or a series of unproductive trades. However, the aim of our research was not to measure the profit of the strategy but to improve the speed of the algorithm by using a GPU. The same pair trading strategy was run on the CPU and later on the CPU working together with the GPU. Table II shows the number of records the pairs trading algorithm had to process and how much time this took using the CPU and the GPU.

{| class="wikitable"
|+ TABLE II. CPU AND GPU COMPARISON
! Date !! Intel i5-3230M 2.6 GHz, 2 cores (in seconds) !! GeForce 710M, 96 CUDA cores (in seconds) !! Number of records processed
|-
| 2015-08-03 || 2991.80 || 2081.60 || 6096505
|-
| 2015-08-04 || 2208.10 || 1400.50 || 4579465
|-
| 2015-08-05 || 2393.70 || 1783.10 || 5793525
|-
| 2015-08-06 || 3040.90 || 2585.3 || 5595770
|-
| 2015-08-07 || 2650.10 || 2027.1 || 5586360
|-
| 2015-08-10 || 4410.80 || 3080.70 || 5732355
|-
| 2015-08-11 || 4980.30 || 3154.50 || 6249980
|-
| 2015-08-12 || 2769.20 || 2151.20 || 6758875
|-
| 2015-08-13 || 4122.60 || 3419.00 || 5666900
|-
| 2015-08-14 || 1325.90 || 1055.80 || 4227335
|-
| 2015-08-17 || 1550.00 || 1171.10 || 4879990
|-
| 2015-08-18 || 1912.10 || 1299.50 || 4364540
|-
| 2015-08-19 || 4002.30 || 3278.70 || 5666700
|-
| 2015-08-20 || 4449.00 || 3119.43 || 5411145
|-
| 2015-08-21 || 4311.70 || 3389.10 || 5946205
|-
| 2015-08-24 || 4809.40 || 4064.00 || 7710745
|-
| 2015-08-25 || 3960.20 || 3466.10 || 5105175
|-
| 2015-08-26 || 3187.60 || 2600.40 || 5119660
|-
| 2015-08-27 || 5004.90 || 4244.20 || 7963320
|-
| 2015-08-28 || 5287.10 || 4413.10 || 7721975
|-
| 2015-08-31 || 5409.70 || 4594.10 || 8613445
|}

Table II shows how much time, in seconds, the algorithm spent on each day's trading simulation using the CPU (Intel i5-3230M 2.6 GHz, 2 cores) and the GPU (GeForce 710M, 96 CUDA cores), and how many records it had to process.

More detailed information is presented in Fig. 3, where the speedup difference is shown as a percentage.

Fig. 3. The improvement of the algorithm when using GPU

As shown in Fig. 3, when the pair trading algorithm was run on the GPU, the speed of the simulation improved considerably, with the improvement in overall speed varying from 12% to 36%. The speed differs between days because of the different number of trades made and the different number of trade signals. The more parameters can be made parallel and moved to the GPU, the bigger the achievable speedup. It is shown that the CPU, even with a multi-threaded implementation, is not a feasible option for large dense matrices. For the GPU implementation, the performance impact of the global memory access patterns on the GPU board and of memory coalescing is emphasized. In our case, the bigger the matrix of trades and pairs, the more measurable the speedup from the GPU. The results show the importance of technical advantages in HFT and how important it is to adapt the algorithm in order to make the most of the hardware it runs on. In our research, the possibility of improving the speed of daily trading on microsecond data arose when the algorithm's calculations were parallelized and run on the GPU using gpuArray and arrayfun in MATLAB, which makes it possible to exploit the GPU at hand.

VII. CONCLUSIONS

Recent technological advances have made trading in the markets fast and mostly done by computers and algorithms. Instead of humans, computers play the role of market makers, specialists or liquidity providers, but at a much higher speed. The number of derived financial instruments has increased the opportunities for profits arising from pricing inefficiencies or price move delays between securities. Trading algorithms now work not only with the CPU but also with the GPU. These factors were the driving forces for testing a system based on pair trading in HFT and seeing how its effectiveness differs on different hardware. In this paper, high frequency algorithmic pairs trading was developed on the market-neutral statistical arbitrage strategy presented by D. Herlemont. Importantly, all
five commodity futures contracts used for the proposed pairs trading strategy belong to the same CME group, which is the world's largest options and futures exchange platform. The proposed trading strategy used a pairs selection algorithm based on the Augmented Dickey Fuller test. If the commodity futures prices pass the Augmented Dickey Fuller test, cointegration tests are performed on all possible combinations of pairs. To test for cointegration, the Engle and Granger 2-step approach and the Johansen test were adopted. The trading strategy was first run on the CPU (Intel i5-3230M 2.6 GHz, 2 cores) and later on the GPU (GeForce 710M, 96 CUDA cores). All trading parameters were kept the same during the research. The purpose of this was to measure the effectiveness of the hardware and to check how much the performance of high frequency trading improves when it runs on the GPU rather than on the CPU only. At the end of the research, after all datasets had been processed by the pairs selection algorithm working with the CPU and the GPU, the results were gathered. It should be no surprise that the algorithm performed more effectively when run on the GPU. The daily improvement in speed varied from 12% to 36%. The speed differs between days because of the different number of trades made and the different number of trade signals. The more parameters can be made parallel and moved to the GPU, the bigger the achievable speedup. The increase could be even more dramatic if the algorithm were applied to more financial instruments and more trading signals were created.

ACKNOWLEDGMENT

We would also like to express our gratitude to NANOTICK for providing high frequency microsecond data for 5 commodity futures contracts.

REFERENCES

[1] Ahmed M., Chai A., Ding X., Jiang Y., Sun Y. (2009), “Statistical Arbitrage in High Frequency Trading Based on Limit Order Book Dynamics”.

[2] Danelutto M., De Matteis T., Mencagli G., Torquati M. (2015), “Parallelizing High-Frequency Trading Applications by Using C++11 Attributes”, August 2015, IEEE.

[3] Torun M. U., Yılmaz O., Akansu A. N. (2016), “FPGA, GPU, and CPU implementations of Jacobi algorithm for eigenanalysis”, Journal of Parallel and Distributed Computing, Vol. 96, pp. 172-180.

[4] Kozikowski G., Papamanousakis G., Yang J. (2015), “Potential future exposure, modelling and accelerating on GPU and FPGA”, WHPCF 2015 Proceedings of the 8th Workshop on High Performance Computational Finance, Article No. 4.

[5] Liang Y., Xing X., Li Y. (2017), “A GPU-based large-scale Monte Carlo simulation method for systems with long-range interactions”, Journal of Computational Physics, Vol. 338, pp. 252-268.

[6] Preis T. (2011), “GPU-computing in econophysics and statistical physics”, The European Physical Journal Special Topics, Vol. 194, pp. 87-119.

[7] Margara A., Cugola G. (2011), “High performance content-based matching using GPUs”, Proceedings of the 5th ACM international conference on Distributed event-based system, New York, USA.

[8] NVIDIA Corporation. (2008), NVIDIA CUDA Compute Unified Device Architecture.

[9] Napoli C. et al., “A cloud-distributed GPU architecture for pattern identification in segmented detectors big-data surveys”, The Computer Journal, Vol. 59(3), pp. 338-352.

[10] Matlab. (2016), se.mathworks.com. [ONLINE] Available at: https://se.mathworks.com/help/distcomp/gpu-computing.html.

[11] Matlab. (2015), se.mathworks.com. [ONLINE] Available at: https://se.mathworks.com/discovery/matlab-gpu.html.

[12] Caldeira J. F., Moura G. V. (2013), “Selection of a portfolio of pairs based on cointegration: A statistical arbitrage strategy”, Revista Brasileira de Financas, Vol. 11(1), pp. 49–80.

[13] Bogoev D., Karam A. (2016), “An Empirical detection of High Frequency Trading Strategies”, 6th International Conference of the Financial Engineering and Banking Society, June 10-12, 2016, Malaga.

[14] Cifu D. A. (2014), “FORM S-1, Registration Statement Under The Securities Act Of 1933”, Virtu Financial, Inc.

[15] Dickey D., Fuller W. (1979), “Distribution of the Estimator for Autoregressive Time series with a Unit Root”, Journal of the American Statistical Association, Vol. 74, pp. 427-431.

[16] Driaunys K., Masteika S., Sakalauskas V., Vaitonis M. (2014), “An algorithm-based statistical arbitrage high frequency trading system to forecast prices of natural gas futures”, Transformations in business and economics, Vol. 13(3), pp. 96–109.

[17] Engle R. F., Granger C. W. J. (1987), “Co-integration and error correction: Representation, estimation, and testing”, Econometrica, Vol. 55(2), pp. 251–276.

[18] Fox M. B., Glosten L. R., Rauterberg G. V. (2015), “The New Stock Market: Sense and Nonsense”, 65 Duke L.J. 191.

[19] Herlemont D. (2013), “Pairs Trading, Convergence Trading, Cointegration”, Quantitative Finance, Vol. 12(9).

[20] Kaya O. (2016), “High-frequency trading. Reaching the limits”, Automated trader magazine, Vol. 41, pp. 23-27.

[21] Kirchner S. (2015), “High frequency trading: Fact and fiction”, Policy: A Journal of Public Policy and Ideas, Vol. 31(4), pp. 8-20.

[22] Lau C. A., Xie W., Wu Y. (2016), “Multi-Dimensional Pairs Trading Using Copulas”, European Financial Management Association 2016 Annual Meetings, June 29-July 2, 2016, Basel, Switzerland.

[23] Madhavaram G. R. (2013), “Statistical Arbitrage Using Pairs Trading With Support Vector Machine Learning”, Saint Mary's University.

[24] Masteika S., Vaitonis M. (2015), “Quantitative Research in High Frequency Trading for Natural Gas Futures Market”, Business Information Systems Workshops, Vol. 228, pp. 29-35.

[25] Miao G. J. (2014), “High Frequency and Dynamic Pairs Trading Based on Statistical Arbitrage Using a Two-Stage Correlation and Cointegration Approach”, International Journal of Economics and Finance, Vol. 6(3), pp. 96-110.

[26] Miao G. J., Clements M. A. (2002), “Digital Signal Processing and Statistical Classification”, Artech House, ISBN 1580531350.

[27] Miller R. S., Shorter G. (2016), “High Frequency Trading: Overview of Recent Developments”, report, April 4, 2016, Washington D.C.

[28] Mushtaq R. (2011), “Augmented Dickey Fuller Test”. Available at SSRN: https://ssrn.com/abstract=1911068.

[29] O'Hara M. (2015), “High frequency market microstructure”, Journal of Financial Economics, Vol. 116(2), pp. 257–270.

[30] Perlin M. S. (2009), “Evaluation of Pairs-trading strategy at the Brazilian financial market”, Journal of Derivatives & Hedge Funds, Vol. 15(2), pp. 122–136.

[31] Vaitonis M. (2017), “Pairs Trading Using HFT in OMX Baltic Market”, Baltic J. Modern Computing, Vol. 5(1), pp. 37-49.

[32] Vaitonis M., Masteika S. (2016), “Research in High Frequency Trading and Pairs Selection Algorithm with Baltic Region Stocks”, In: Dregvaite G., Damasevicius R. Information and Software Technologies. ICIST 2016. Communications in Computer and Information Science, Vol. 639. Springer.

[33] Vidyamurthy G. (2004), “Pairs Trading – Quantitative Methods and Analysis”, New Jersey, John Wiley & Sons, Inc., p. 210.

[34] Zubulake P., Lee S. (2011), “The High frequency game changer: how automated trading strategies have revolutionized the markets”, Aite group. Wiley trading.