Machine learning (especially reinforcement learning) methods for trading are increasingly reliant on simulation for agent training and testing. Furthermore, simulation is important for validating hand-coded trading strategies and for testing hypotheses about market structure. A challenge, however, concerns the robustness of policies validated in simulation, because the simulations lack fidelity. In fact, researchers have shown that many market simulation approaches fail to reproduce statistics and stylized facts seen in real markets. As a step towards addressing this, we surveyed the literature to collect a set of reference metrics and applied them to real market data and simulation output. Our paper provides a comprehensive catalog of these metrics, including mathematical formulations where appropriate. Our results show that there are still significant discrepancies between simulated markets and real ones. However, this work serves as a benchmark against which we can measure future improvement.
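As an illustration of what applying a reference metric to both real and simulated data can look like, the sketch below computes two statistics that are widely used in the stylized-facts literature: excess kurtosis of log returns (heavy tails) and the autocorrelation of absolute returns (volatility clustering). Whether these two are among the metrics the paper catalogs is an assumption, and the function names are hypothetical.

```python
import math

def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def excess_kurtosis(xs):
    """Population excess kurtosis; 0 for a normal distribution, > 0 for heavy tails."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var

def volatility_clustering(prices, lag=1):
    """Autocorrelation of absolute log returns (assumed proxy for volatility clustering)."""
    abs_returns = [abs(r) for r in log_returns(prices)]
    return autocorrelation(abs_returns, lag)
```

A comparison between a real and a simulated price series could then report the difference between the same metric evaluated on each series.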
In order-driven markets, limit order book (LOB) resiliency is an important microscopic indicator of market quality when the order book is hit by a liquidity shock, and it plays an essential role in the design of optimal submission strategies for large orders. However, the evolution of LOB resiliency around liquidity shocks is not well understood empirically. Using order flow data sets of Chinese stocks, we quantify and compare the LOB dynamics, characterized by the bid-ask spread, the LOB depth, and the order intensity, surrounding effective market orders of different aggressiveness. We find that traders are more likely to submit effective market orders when the spread is relatively low, the same-side depth is high, and the opposite-side depth is low. This phenomenon is especially significant when the initial spread is 1 tick. Although the resiliency patterns differ markedly after different types of market orders, the spread and depth can return to the sample average within 20 best limit updates. Price resiliency is dominant after aggressive market orders, while price continuation is dominant after less aggressive market orders. Moreover, effective market orders produce an asymmetric stimulus to limit orders when the initial spread equals 1 tick. In this case, effective buy market orders attract more buy limit orders and effective sell market orders attract more sell limit orders. The resiliency behavior of spread and depth is linked to limit order intensity.
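The recovery horizon mentioned above (a return to the sample average within 20 best limit updates) can be measured with a very small helper. This is only a sketch: the definition of recovery as the first update at which the spread is at or below the sample average is an assumption, and `updates_to_recover` is a hypothetical name, not the paper's procedure.

```python
def updates_to_recover(spreads, sample_average):
    """Number of best-limit updates until the spread is back at or below its average.

    spreads[0] is the spread right after the market order; spreads[k] is the
    spread after the k-th subsequent best-limit update.
    Returns None if the spread never recovers within the observed window.
    """
    for k, spread in enumerate(spreads):
        if spread <= sample_average:
            return k
    return None

# Hypothetical spreads (in ticks) following an aggressive market order.
recovery = updates_to_recover([5, 4, 3, 2, 2], sample_average=2.5)
```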
We examine the dynamics of the bid and ask queues of a limit order book and their relationship with the intensity of trade arrivals. In particular, we study the probability of price movements and trade arrivals as a function of the quote imbalance at the top of the limit order book. We propose a stochastic model in an attempt to capture the joint dynamics of the top of the book queues and the trading process, and describe a semi-analytic approach to calculate the relative probability of market events. We calibrate the model using historical market data and discuss the quality of fit and practical applications of the results.
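A minimal sketch of the empirical side of this idea follows. The abstract does not define quote imbalance, so the normalized difference of the best bid and ask sizes used here is an assumption, as are the function names and the sample data. The code estimates how often the next price move is upward within each imbalance bucket; it is not the stochastic model or the semi-analytic method the paper proposes.

```python
from collections import defaultdict

def quote_imbalance(bid_size, ask_size):
    """Assumed definition: (bid - ask) / (bid + ask), a value in [-1, 1]."""
    total = bid_size + ask_size
    if total <= 0:
        raise ValueError("top-of-book sizes must sum to a positive value")
    return (bid_size - ask_size) / total

def up_move_frequency(observations, n_buckets=4):
    """Empirical frequency of an upward next price move per imbalance bucket.

    observations: iterable of (bid_size, ask_size, next_move), next_move in {+1, -1}.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [up moves, total observations]
    for bid_size, ask_size, next_move in observations:
        imbalance = quote_imbalance(bid_size, ask_size)
        # Map [-1, 1] onto bucket indices 0 .. n_buckets - 1.
        bucket = min(int((imbalance + 1.0) / 2.0 * n_buckets), n_buckets - 1)
        counts[bucket][1] += 1
        if next_move > 0:
            counts[bucket][0] += 1
    return {b: up / total for b, (up, total) in sorted(counts.items())}

# Hypothetical top-of-book snapshots with the direction of the next price move.
sample = [(300, 100, +1), (320, 80, +1), (100, 300, -1), (90, 310, +1), (200, 200, -1)]
freq = up_move_frequency(sample)
```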
It has been suggested that marked point processes might be good candidates for the modelling of financial high-frequency data. A special class of point processes, Hawkes processes, has been the subject of various investigations in the financial community. In this paper, we propose to enhance a basic zero-intelligence order book simulator with arrival times of limit and market orders following mutually (asymmetrically) exciting Hawkes processes. Modelling is based on empirical observations on time intervals between orders that we verify on several markets (equity, bond futures, index futures). We show that this simple feature enables a much more realistic treatment of the bid-ask spread of the simulated order book.
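To illustrate how order arrival times can follow mutually exciting Hawkes processes, the sketch below simulates a two-type process (type 0 for limit orders, type 1 for market orders) using Ogata's thinning algorithm. The exponential kernel with a common decay rate, the parameter values, and the name `simulate_hawkes` are all assumptions made for illustration; they are not the paper's calibration or simulator.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate a multivariate Hawkes process with exponential kernels by thinning.

    mu[i]       : baseline intensity of event type i
    alpha[i][j] : jump in the intensity of type i caused by an event of type j
    beta        : common exponential decay rate of the excitation
    Returns a list of (time, event_type) tuples on [0, horizon].
    """
    if sum(mu) <= 0:
        raise ValueError("at least one baseline intensity must be positive")
    rng = random.Random(seed)
    dim = len(mu)
    excitation = [0.0] * dim  # decaying excited part of each intensity
    t = 0.0
    events = []
    while True:
        # Intensities only decay between events, so their current sum is an upper bound.
        lam_bar = sum(mu) + sum(excitation)
        wait = rng.expovariate(lam_bar)
        t += wait
        if t > horizon:
            break
        decay = math.exp(-beta * wait)
        excitation = [e * decay for e in excitation]
        lam = [mu[i] + excitation[i] for i in range(dim)]
        # Accept the candidate with probability sum(lam) / lam_bar and, if accepted,
        # pick the event type with probability proportional to lam[i].
        u = rng.random() * lam_bar
        cumulative = 0.0
        for i in range(dim):
            cumulative += lam[i]
            if u < cumulative:
                events.append((t, i))
                for k in range(dim):
                    excitation[k] += alpha[k][i]
                break
    return events

# Illustrative asymmetric excitation: market orders (type 1) excite limit orders
# (type 0) more strongly than the reverse. These values are hypothetical.
events = simulate_hawkes(
    mu=[1.0, 0.5],
    alpha=[[0.5, 0.8], [0.2, 0.3]],
    beta=2.0,
    horizon=100.0,
)
```

The chosen `alpha` and `beta` keep the spectral radius of the branching matrix `alpha / beta` below 1, so the simulated process does not explode.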
We introduce a methodology to visualize the limit order book (LOB) using a particle physics lens. The open-source data-analysis tool ROOT, developed by CERN, is used to reconstruct and visualize futures markets. Message-based data is used, rather than snapshots, as it offers numerous visualization advantages. The visualization method can include multiple variables and markets simultaneously and is not necessarily time dependent. Stakeholders can use it to visualize high-velocity data, either to gain a better understanding of markets or to monitor them effectively. In addition, the method is easily adjustable to user specifications to examine various LOB research topics, thereby complementing existing methods.
This article presents a Hawkes process model with Markovian baseline intensities for modeling high-frequency order book data. We classify intraday order book trading events into a range of categories based on their order types and the price changes after their arrivals. To capture the stimulating effects between multiple types of order book events, we use the multivariate Hawkes process to model the self- and mutually exciting event arrivals. We also integrate a Markovian baseline intensity into the event arrival dynamics by letting the baseline intensity depend on the order book liquidity state and a time factor. A regression-based non-parametric estimation procedure is adopted to estimate the parameters of our Hawkes+Markovian model. To eliminate redundant model parameters, LASSO regularization is incorporated in the estimation procedure. In addition, a model selection method based on the Akaike information criterion is applied to evaluate the effect of each part of the proposed model. An implementation example based on real LOB data is provided. Through the example, we study the empirical shapes of the Hawkes excitation functions, the effects of the liquidity state and time factors, the LASSO variable selection, and the explanatory power of the Hawkes and Markovian elements for the dynamics of the order book.
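One generic way to write an intensity of this kind is given below. It is a sketch under assumed notation, not the article's exact specification: $D$ is the number of event types, $N_j$ the counting process of type $j$, $X_t$ a Markovian liquidity state, $\mu_i$ the state- and time-dependent baseline, and $\phi_{ij}$ the excitation function describing how events of type $j$ raise the arrival rate of type $i$.

```latex
\lambda_i(t) \;=\; \mu_i(X_t, t) \;+\; \sum_{j=1}^{D} \int_{0}^{t^-} \phi_{ij}(t - s)\, \mathrm{d}N_j(s),
\qquad i = 1, \dots, D .
```

Setting $\mu_i(X_t, t)$ to a constant recovers the standard multivariate Hawkes process, while setting every $\phi_{ij}$ to zero leaves a purely state-driven baseline; comparing such nested variants is the kind of question an information criterion can address.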