
Universal Trading for Order Execution with Oracle Policy Distillation

Published by Kan Ren
Publication date: 2021
Language: English





As a fundamental problem in algorithmic trading, order execution aims at fulfilling a specific trading order, either liquidation or acquisition, for a given instrument. In the search for effective execution strategies, recent years have witnessed a shift from the analytical view with model-based market assumptions to a model-free perspective, i.e., reinforcement learning, owing to its nature of sequential decision optimization. However, the noisy and imperfect market information available to the policy makes it challenging to build sample-efficient reinforcement learning methods that achieve effective order execution. In this paper, we propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution. In particular, this framework leverages a policy distillation method in which an oracle teacher with perfect information approximates the optimal trading strategy and guides the learning of the common policy towards practically optimal execution. Extensive experiments show significant improvements of our method over various strong baselines, with reasonable trading actions.
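The abstract does not spell out the distillation objective, so the following is only a minimal sketch of one common form of policy distillation: the student (common) policy is penalized by the negative log-likelihood of the action chosen by the oracle teacher, and that penalty is added to the student's ordinary reinforcement learning loss. The function names `distillation_loss` and `total_loss`, the weight `mu`, and the use of discrete action logits are assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_action):
    """Negative log-likelihood of the teacher's action under the student policy.

    student_logits: scores the student assigns to each discrete action,
                    computed from the imperfect (noisy) market state.
    teacher_action: index of the action the oracle teacher took with
                    perfect information (hypothetical input).
    """
    probs = softmax(student_logits)
    return -math.log(probs[teacher_action])


def total_loss(policy_loss, student_logits, teacher_action, mu=1.0):
    """Student objective: RL policy loss plus a weighted distillation term.

    mu is an assumed trade-off coefficient, not a value from the paper.
    """
    return policy_loss + mu * distillation_loss(student_logits, teacher_action)
```

In this sketch the distillation term shrinks as the student places more probability on the teacher's action, which is the sense in which the teacher "guides" the student.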




Read also

Optimal trade execution is an important problem faced by essentially all traders. Much research into optimal execution uses stringent model assumptions and applies continuous time stochastic control to solve them. Here, we instead take a model-free approach and develop a variation of Deep Q-Learning to estimate the optimal actions of a trader. The model is a fully connected Neural Network trained using Experience Replay and Double DQN with input features given by the current state of the limit order book, other trading signals, and available execution actions, while the output is the Q-value function estimating the future rewards under an arbitrary action. We apply our model to nine different stocks and find that it outperforms the standard benchmark approach on most stocks using the measures of (i) mean and median out-performance, (ii) probability of out-performance, and (iii) gain-loss ratios.
We consider optimal execution strategies for block market orders placed in a limit order book (LOB). We build on the resilience model proposed by Obizhaeva and Wang (2005) but allow for a general shape of the LOB defined via a given density function. Thus, we can allow for empirically observed LOB shapes and obtain a nonlinear price impact of market orders. We distinguish two possibilities for modeling the resilience of the LOB after a large market order: the exponential recovery of the number of limit orders, i.e., of the volume of the LOB, or the exponential recovery of the bid-ask spread. We consider both of these resilience modes and, in each case, derive explicit optimal execution strategies in discrete time. Applying our results to a block-shaped LOB, we obtain a new closed-form representation for the optimal strategy, which explicitly solves the recursive scheme given in Obizhaeva and Wang (2005). We also provide some evidence for the robustness of optimal strategies with respect to the choice of the shape function and the resilience-type.
Making a consistent profit in the stock market is difficult for both professional investors and individual traders. With the development of computer science and deep reinforcement learning, Buy&Hold (B&H) has been surpassed by many artificial intelligence trading algorithms. However, the information and processing these algorithms use are insufficient, which limits the performance of reinforcement learning algorithms. Thus, we propose a parallel-network continuous quantitative trading model with GARCH and PPO to enrich the basic deep reinforcement learning model, where the parallel deep learning network layers handle data at 3 different frequencies (including GARCH information) and the proximal policy optimization (PPO) algorithm exchanges actions and rewards with the stock trading environment. Experiments on 5 stocks from the Chinese stock market show that our method achieves more excess profit than basic reinforcement learning methods and benchmark models.
We demonstrate an application of risk-sensitive reinforcement learning to optimizing execution in limit order book markets. We represent order execution decisions based on limit order book knowledge as a Markov Decision Process and train a trading agent in a market simulator, which emulates multi-agent interaction by synthesizing the market response to our agent's execution decisions from historical data. Due to market impact, executing high volume orders can incur significant cost. We learn trading signals from market microstructure in the presence of simulated market response and derive explainable decision-tree-based execution policies using risk-sensitive Q-learning to minimize execution cost subject to constraints on cost variance.
As deep reinforcement learning (DRL) has been recognized as an effective approach in quantitative finance, getting hands-on experience is attractive to beginners. However, training a practical DRL trading agent that decides where to trade, at what price, and in what quantity involves error-prone and arduous development and debugging. In this paper, we introduce a DRL library, FinRL, that helps beginners get exposure to quantitative finance and develop their own stock trading strategies. Along with easily reproducible tutorials, the FinRL library allows users to streamline their own developments and to compare with existing schemes easily. Within FinRL, virtual environments are configured with stock market datasets, trading agents are trained with neural networks, and extensive backtesting is analyzed via trading performance. Moreover, it incorporates important trading constraints such as transaction cost, market liquidity and the investor's degree of risk-aversion. FinRL features completeness, hands-on tutorials and reproducibility that favor beginners: (i) at multiple levels of time granularity, FinRL simulates trading environments across various stock markets, including NASDAQ-100, DJIA, S&P 500, HSI, SSE 50, and CSI 300; (ii) organized in a layered architecture with modular structure, FinRL provides fine-tuned state-of-the-art DRL algorithms (DQN, DDPG, PPO, SAC, A2C, TD3, etc.), commonly used reward functions and standard evaluation baselines to alleviate the debugging workload and promote reproducibility; and (iii) being highly extendable, FinRL reserves a complete set of user-import interfaces. Furthermore, we incorporate three application demonstrations, namely single stock trading, multiple stock trading, and portfolio allocation. The FinRL library will be available on GitHub at https://github.com/AI4Finance-LLC/FinRL-Library.
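The first related abstract above mentions training with Double DQN. As a rough illustration, the standard Double DQN bootstrap target picks the next action with the online network and evaluates it with the target network. The sketch below assumes Q-values are supplied as plain lists per action; the function name `double_dqn_target` and its parameters are hypothetical, and the "variation of Deep Q-Learning" developed in that paper may differ from this textbook form.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard Double DQN target for a single transition.

    reward:        immediate reward observed after the action.
    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    gamma:         discount factor (assumed value, not from the paper).
    done:          True if the episode ended, so there is nothing to bootstrap.
    """
    if done:
        return reward
    # Select the greedy next action using the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... but evaluate that action using the target network.
    return reward + gamma * next_q_target[best_action]
```

Separating action selection from action evaluation in this way is what distinguishes Double DQN from plain DQN, which would take the maximum over `next_q_target` directly.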
