We demonstrate an application of risk-sensitive reinforcement learning to optimizing execution in limit order book markets. We model order execution decisions based on limit order book information as a Markov Decision Process, and train a trading agent in a market simulator that emulates multi-agent interaction by synthesizing the market response to our agent's execution decisions from historical data. Due to market impact, executing high-volume orders can incur significant cost. We learn trading signals from market microstructure in the presence of simulated market response and derive explainable decision-tree-based execution policies using risk-sensitive Q-learning to minimize execution cost subject to constraints on cost variance.
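As an illustration of what a risk-sensitive Q-learning update can look like, the sketch below uses one well-known variant in which positive and negative temporal-difference errors are weighted asymmetrically. The abstract does not say which formulation the authors use, so the function name, the parameter `kappa`, and the tabular setting are all assumptions made for illustration; rewards here would be negative execution costs.

```python
import numpy as np

def risk_sensitive_q_update(q, state, action, reward, next_state, done,
                            step_size=0.1, gamma=0.99, kappa=0.5):
    """One tabular Q-learning step with an asymmetric TD-error weighting.

    Assumption: kappa in (-1, 1). With kappa > 0, negative surprises are
    weighted more heavily than positive ones, which makes the learned
    values risk-averse. kappa = 0 recovers ordinary Q-learning.
    """
    # Bootstrapped value of the next state (zero at the end of an episode).
    bootstrap = 0.0 if done else gamma * np.max(q[next_state])
    td_error = reward + bootstrap - q[state, action]
    # Shrink good surprises and amplify bad ones.
    weight = (1.0 - kappa) if td_error > 0 else (1.0 + kappa)
    q[state, action] += step_size * weight * td_error
    return td_error

# Hypothetical usage: 2 states, 2 actions, all values initialised to zero.
q_table = np.zeros((2, 2))
risk_sensitive_q_update(q_table, 0, 0, reward=1.0, next_state=1, done=False)
```

The asymmetric weighting penalizes outcome variability only indirectly. An explicit variance constraint, as mentioned in the abstract, would need an additional mechanism that is not shown here.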
Optimal trade execution is an important problem faced by essentially all traders. Much research into optimal execution makes stringent model assumptions and applies continuous-time stochastic control to solve the resulting problems. Here, we instead take a model-free approach and develop a variation of Deep Q-Learning to estimate the optimal actions of a trader. The model is a fully connected neural network trained using experience replay and Double DQN, with input features given by the current state of the limit order book, other trading signals, and available execution actions, while the output is the Q-value function estimating the future rewards under an arbitrary action. We apply our model to nine different stocks and find that it outperforms the standard benchmark approach on most stocks using the measures of (i) mean and median out-performance, (ii) probability of out-performance, and (iii) gain-loss ratios.
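The Double DQN step mentioned above can be sketched as follows: the online network selects the next action and a separate target network evaluates it. This is a minimal NumPy illustration of the standard target computation, not the authors' implementation; the function name and array layout (one row per sampled transition, one column per action) are assumptions.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN regression targets for a batch of transitions.

    q_online_next and q_target_next hold the Q-values of the next states
    under the online and target networks, with shape (batch, n_actions).
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)
    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... while action evaluation uses the target network.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions do not bootstrap.
    return rewards + gamma * (1.0 - dones) * next_values

# Hypothetical batch of two transitions with two actions each.
targets = double_dqn_targets(
    rewards=[1.0, 0.5],
    dones=[0, 1],
    q_online_next=[[1.0, 2.0], [3.0, 0.0]],
    q_target_next=[[10.0, 20.0], [30.0, 40.0]],
    gamma=0.5,
)
```

Separating selection from evaluation is what distinguishes Double DQN from the plain DQN target, which takes the maximum over the target network's own values.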
As a fundamental problem in algorithmic trading, order execution aims at fulfilling a specific trading order, either liquidation or acquisition, for a given instrument. In the search for effective execution strategies, recent years have witnessed a shift from the analytical view, with its model-based market assumptions, to a model-free perspective, i.e., reinforcement learning, which is naturally suited to sequential decision optimization. However, the noisy and imperfect market information available to the policy makes it challenging to build sample-efficient reinforcement learning methods that achieve effective order execution. In this paper, we propose a novel universal trading policy optimization framework to bridge the gap between the noisy and imperfect market states and the optimal action sequences for order execution. In particular, the framework uses policy distillation: an oracle teacher with perfect information approximates the optimal trading strategy and guides the learning of the common policy towards practically optimal execution. Extensive experiments show significant improvements of our method over various strong baselines, with reasonable trading actions.
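One common way to express a policy distillation term is the negative log-likelihood of the teacher's actions under the student policy. The sketch below assumes a discrete action space and that the student outputs unnormalized logits; the function name and this particular loss form are illustrative assumptions, since the abstract does not spell out the objective.

```python
import numpy as np

def distillation_loss(student_logits, teacher_actions):
    """Mean negative log-likelihood of teacher actions under the student.

    student_logits: array of shape (batch, n_actions).
    teacher_actions: integer action indices chosen by the oracle teacher.
    """
    logits = np.asarray(student_logits, dtype=float)
    actions = np.asarray(teacher_actions, dtype=int)
    # Numerically stable log-softmax over the action dimension.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out the log-probability the student assigns to each teacher action.
    return -log_probs[np.arange(len(actions)), actions].mean()

# A student that is indifferent between two actions incurs a loss of log(2).
loss = distillation_loss([[0.0, 0.0]], [0])
```

In practice such a term would be added, with some weight, to the student's ordinary reinforcement learning loss, so the teacher guides learning without replacing the reward signal.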
Big data from autonomous vehicles has long been used for perception, prediction, planning, and control of driving. Naturally, the question increasingly arises of why this data is not also used for risk management and actuarial modeling. This article examines the emerging technical difficulties, new ideas, and methods of risk modeling under autonomous driving scenarios. Compared with the traditional risk model, the novel model is more consistent with real road traffic and driving safety performance. More importantly, it makes risk assessment and car insurance pricing technically feasible in a computer simulation environment.
We propose the Hawkes flocking model, which assesses systemic risk in high-frequency processes from two perspectives: endogeneity and interactivity. We examine the futures markets of WTI crude oil and gasoline over the past decade and perform a comparative analysis with conditional value-at-risk as a benchmark measure. In terms of high-frequency structure, we obtain the following empirical findings. The endogenous systemic risk in WTI was significantly higher than that in gasoline, and the level at which gasoline affects WTI was consistently higher than in the opposite direction. Moreover, although the relative degree of influence was asymmetric, the difference has gradually narrowed.
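For readers unfamiliar with Hawkes processes, the sketch below shows the conditional intensity of a generic univariate Hawkes process with an exponential kernel, together with its branching ratio, a standard measure of endogeneity. It is not the Hawkes flocking model itself, whose specification is not given in the abstract; the function names and the parameters `mu`, `alpha`, and `beta` are assumptions.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity at time t with kernel alpha * exp(-beta * s).

    mu is the exogenous baseline rate. Each past event raises the
    intensity by alpha, and that excitation decays at rate beta.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )
    return mu + excitation

def branching_ratio(alpha, beta):
    """Expected number of events directly triggered by one event.

    This is the integral of the kernel over s >= 0. Values closer to 1
    indicate that more of the activity is self-excited (endogenous).
    """
    return alpha / beta

# Hypothetical example: two past events at t = 0 and t = 1, evaluated at t = 2.
intensity = hawkes_intensity(2.0, [0.0, 1.0], mu=0.1, alpha=0.5, beta=1.0)
```

A multivariate extension, with cross-excitation terms between two markets, is the usual way to capture the kind of interactivity between WTI and gasoline that the abstract describes.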
The ultimate value of theories of the fundamental mechanisms underlying asset prices in financial systems will be reflected in their capacity to explain these systems. Although models of the various states of financial markets draw on substantial evidence from finance, mathematics, and even physics to explain states observed in real financial markets, previous theories that attempt to fully explain the complexities of financial markets have been inadequate. In this study, we propose an artificial double auction market as an agent-based modeling approach to study the origin of complex states in financial markets, characterizing important parameters with an investment strategy that can cover the dynamics of the financial market. After market information arrives, the investment strategy of chartist traders should reduce market stability, an effect originating in the price fluctuations of risky assets. Fundamentalist traders, however, strategically submit orders at a fundamental value and thereby stabilize the market. We construct a continuous double auction market and find that the market is controlled by the fraction of chartists, P_{c}. We show that a state mimicking real financial markets arises between approximately P_{c} = 0.40 and P_{c} = 0.85, whereas a state mimicking the efficient market hypothesis can be generated for values below P_{c} = 0.40. In particular, we observe that a state mimicking market collapse, in which a liquidity shortage occurs, arises for values greater than P_{c} = 0.85, and that the phase transition occurs at P_{c} = 0.85.