Mid-price movement prediction based on limit order book (LOB) data is a challenging task due to the complexity and dynamics of the LOB. So far, there have been very few attempts to extract relevant features from LOB data. In this paper, we address this problem by designing a new set of handcrafted features and performing an extensive experimental evaluation on both liquid and illiquid stocks. More specifically, we implement a new set of econometric features that capture statistical properties of the underlying securities for the task of mid-price prediction. Moreover, we develop a new experimental protocol for online learning that treats the task as a multi-objective optimization problem and predicts i) the direction of the next price movement and ii) the number of order book events that occur until the change takes place. To predict the mid-price movement, the features are fed into nine different deep learning models based on multi-layer perceptrons (MLP), convolutional neural networks (CNN), and long short-term memory (LSTM) neural networks. The performance of the proposed method is then evaluated on liquid and illiquid stocks, which are based on TotalView-ITCH US and Nordic stocks, respectively. For some stocks, the results suggest that the correct choice of feature set and model can lead to successful prediction of how long it takes for a stock price movement to occur.
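The two prediction targets described above can be illustrated with a small labeling routine. This is only a minimal sketch, not the authors' protocol: the function name `next_move_labels`, the input format (a plain list of mid-prices, one per order book event), and the exact-equality test for "no change" are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def next_move_labels(mid: List[float]) -> List[Optional[Tuple[int, int]]]:
    """Label each order book event with the next mid-price move.

    For event t the label is (direction, events_until_change):
      direction            +1 if the next change is upward, -1 if downward
      events_until_change  number of events until the mid-price first
                           differs from mid[t]
    The label is None when the price never changes again in the series.
    Assumption: one mid-price per event, compared with exact equality.
    """
    n = len(mid)
    labels: List[Optional[Tuple[int, int]]] = [None] * n
    next_change: Optional[int] = None  # first later index with a different price

    # Scan backwards so each event can reuse the answer of its successor.
    for t in range(n - 2, -1, -1):
        if mid[t + 1] != mid[t]:
            next_change = t + 1
        # Otherwise mid[t + 1] == mid[t], so the successor's answer still holds.
        if next_change is not None:
            direction = 1 if mid[next_change] > mid[t] else -1
            labels[t] = (direction, next_change - t)
    return labels


if __name__ == "__main__":
    print(next_move_labels([10.0, 10.0, 10.5, 10.5, 10.0]))
```

Both targets come from the same backward pass, which keeps the labeling linear in the number of events.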
Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and test their validity on short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean-squares, and linear discriminant analysis. We also build a new quantitative feature based on adaptive logistic regression for online learning, which is consistently selected first by the majority of the proposed feature selection methods. This study examines the best combination of features using high-frequency limit order book data from Nasdaq Nordic. Our results suggest that sorting methods and classifiers can be combined in such a way that the best performance is reached with only a few advanced hand-crafted features.
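The general idea of wrapper feature selection can be sketched as a greedy forward search: at each step, the feature whose addition gives the best classifier score is kept. The sketch below is illustrative only. It swaps in a simple nearest-centroid classifier scored on training accuracy, not the entropy, least-mean-squares, or linear discriminant analysis criteria named in the abstract, and the names `centroid_accuracy` and `forward_select` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Row = Sequence[float]


def centroid_accuracy(X: List[Row], y: List[int], cols: List[int]) -> float:
    """Training accuracy of a nearest-centroid classifier using only `cols`.

    Stand-in scoring criterion (an assumption, not the paper's method).
    """
    classes = sorted(set(y))
    centroids: Dict[int, List[float]] = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    correct = 0
    for x, label in zip(X, y):
        # Squared Euclidean distance from x to each class centroid.
        def dist(c: int) -> float:
            return sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))

        if min(classes, key=dist) == label:
            correct += 1
    return correct / len(y)


def forward_select(
    X: List[Row],
    y: List[int],
    k: int,
    score: Callable[[List[Row], List[int], List[int]], float] = centroid_accuracy,
) -> List[int]:
    """Greedily pick up to k feature indices, best-scoring addition first."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda j: score(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Feature 1 separates the classes; feature 0 is mostly noise.
    X = [[0.3, 0.0], [0.1, 0.1], [0.2, 1.0], [0.4, 0.9]]
    y = [0, 0, 1, 1]
    print(forward_select(X, y, k=2))
```

Training accuracy is an optimistic score; a held-out split or cross-validation would normally be passed in as the `score` callable instead.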
We propose a global hybrid deep learning framework to predict daily prices in the stock market. With representation learning, we derive an embedding called Stock2Vec, which gives us insight into the relationships among different stocks, while temporal convolutional layers are used to automatically capture effective temporal patterns both within and across series. Evaluated on the S&P 500, our hybrid framework combines the advantages of both components and achieves better performance on the stock price prediction task than several popular benchmark models.
Bitcoin, as one of the most popular cryptocurrencies, has recently attracted much attention from investors. Bitcoin price prediction is consequently a rising academic topic that can provide valuable insights and suggestions. Existing Bitcoin prediction work is mostly based on trivial feature engineering, in which features or factors are designed manually from multiple areas, including Bitcoin blockchain information, finance, and social media sentiment. This feature engineering not only requires much human effort, but the effectiveness of the intuitively designed features also cannot be guaranteed. In this paper, we aim to mine the abundant patterns encoded in Bitcoin transactions, and we propose the k-order transaction graph to reveal patterns at different scopes. We propose transaction-graph-based features to encode these patterns automatically. A novel prediction method is proposed that takes these features as input and makes price predictions, taking advantage of particular patterns from different historical periods. The results of comparison experiments demonstrate that the proposed method outperforms the most recent state-of-the-art methods.
The majority of studies in the field of AI-guided financial trading focus purely on applying machine learning algorithms to continuous historical price and technical analysis data. However, due to the non-stationary and highly volatile nature of the Forex market, most algorithms fail when put into practice. We develop novel event-driven features that indicate a change in trend direction. We then build deep learning models to predict a retracement point, which provides an ideal entry point for maximizing profit. We use a simple recurrent neural network (RNN) as our baseline model and compare it with long short-term memory (LSTM), bidirectional long short-term memory (BiLSTM), and gated recurrent unit (GRU) models. Our experimental results show that the proposed event-driven feature selection together with the proposed models can form a robust prediction system that supports accurate trading strategies with minimal risk. Our best model on 15-minute interval data for the EUR/GBP currency pair achieved RME 0.006x10^(-3), RMSE 2.407x10^(-3), MAE 1.708x10^(-3), and MAPE 0.194%, outperforming previous studies.
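For reference, three of the error measures reported above have standard definitions, which can be computed as follows. This is a minimal sketch using the textbook formulas; the function name `regression_errors` is hypothetical, and RME is left out because the abstract does not define it.

```python
import math
from typing import Dict, Sequence


def regression_errors(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return RMSE, MAE, and MAPE (in percent) for paired observations.

    Assumes every actual value is non-zero, since MAPE divides by it.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)

    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    mae = sum(abs(e) for e in errors) / n  # mean absolute error
    # Mean absolute percentage error, relative to the actual values.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    return {"rmse": rmse, "mae": mae, "mape": mape}


if __name__ == "__main__":
    print(regression_errors([1.0, 2.0, 4.0], [1.5, 2.0, 3.0]))
```

RMSE and MAE are expressed in price units, while MAPE is scale-free, which is why the abstract reports it as a percentage.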
This paper presents a deep learning framework based on a long short-term memory (LSTM) network that predicts the price movement of cryptocurrencies from trade-by-trade data. The main focus of this study is on predicting short-term price changes over a fixed time horizon from a look-back period. Through careful feature design and a detailed search for the best hyper-parameters, the model is trained to achieve high performance on nearly a year of trade-by-trade data. The optimal model delivers stable high performance (over 60% accuracy) on out-of-sample test periods. In a realistic trading simulation setting, the predictions made by the model could be easily monetized. Moreover, this study shows that the LSTM model can extract universal features from trade-by-trade data, as the learned parameters maintain their high performance on other cryptocurrency instruments that were not included in the training data. This study exceeds existing research in terms of the scale and precision of the data used, as well as the high prediction accuracy achieved.
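Predicting a fixed-horizon price change from a window of past observations is commonly set up as a sliding-window dataset. The sketch below shows one such construction under stated assumptions: the horizon and window length are measured in numbers of trades, not clock time, the label is a binary "price went up" flag, and the function name `make_windows` is hypothetical. The paper's actual features and labeling may differ.

```python
from typing import List, Sequence, Tuple


def make_windows(
    prices: Sequence[float], lookback: int, horizon: int
) -> List[Tuple[List[float], int]]:
    """Build (window, label) pairs from a trade price series.

    window  the `lookback` most recent prices
    label   1 if the price `horizon` trades after the end of the window is
            higher than the last price in the window, otherwise 0
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be positive")

    samples: List[Tuple[List[float], int]] = []
    # t is the index just past the end of the window.
    for t in range(lookback, len(prices) - horizon + 1):
        window = list(prices[t - lookback:t])
        last_price = prices[t - 1]
        future_price = prices[t - 1 + horizon]
        samples.append((window, 1 if future_price > last_price else 0))
    return samples


if __name__ == "__main__":
    for window, label in make_windows([1.0, 2.0, 3.0, 2.0, 1.0, 5.0], lookback=3, horizon=2):
        print(window, label)
```

Each window uses only prices that precede its label, so the construction does not leak future information into the model inputs.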