أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Jonathan Yumeng Li

WaveCorr: Correlation-savvy Deep Reinforcement Learning for Portfolio Management

222 - Saeed Marzban , Erick Delage , Jonathan Yumeng Li 2021

The problem of portfolio management represents an important and challenging class of dynamic decision making problems, where rebalancing decisions need to be made over time with the consideration of many factors such as investors preferences, trading environments, and market conditions. In this paper, we present a new portfolio policy network architecture for deep reinforcement learning (DRL)that can exploit more effectively cross-asset dependency information and achieve better performance than state-of-the-art architectures. In particular, we introduce a new property, referred to as textit{asset permutation invariance}, for portfolio policy networks that exploit multi-asset time series data, and design the first portfolio policy network, named WaveCorr, that preserves this invariance property when treating asset correlation information. At the core of our design is an innovative permutation invariant correlation processing layer. An extensive set of experiments are conducted using data from both Canadian (TSX) and American stock markets (S&P 500), and WaveCorr consistently outperforms other architectures with an impressive 3%-25% absolute improvement in terms of average annual return, and up to more than 200% relative improvement in average Sharpe ratio. We also measured an improvement of a factor of up to 5 in the stability of performance under random choices of initial asset ordering and weights. The stability of the network has been found as particularly valuable by our industrial partner.

إدارة المحافظ التعلم الآلي

Deep Reinforcement Learning for Equal Risk Pricing and Hedging under Dynamic Expectile Risk Measures

196 - Saeed Marzban , Erick Delage , Jonathan Yumeng Li 2021

Recently equal risk pricing, a framework for fair derivative pricing, was extended to consider dynamic risk measures. However, all current implementations either employ a static risk measure that violates time consistency, or are based on traditional dynamic programming solution schemes that are impracticable in problems with a large number of underlying assets (due to the curse of dimensionality) or with incomplete asset dynamics information. In this paper, we extend for the first time a famous off-policy deterministic actor-critic deep reinforcement learning (ACRL) algorithm to the problem of solving a risk averse Markov decision process that models risk using a time consistent recursive expectile risk measure. This new ACRL algorithm allows us to identify high quality time consistent hedging policies (and equal risk prices) for options, such as basket options, that cannot be handled using traditional methods, or in context where only historical trajectories of the underlying assets are available. Our numerical experiments, which involve both a simple vanilla option and a more exotic basket option, confirm that the new ACRL algorithm can produce 1) in simple environments, nearly optimal hedging policies, and highly accurate prices, simultaneously for a range of maturities 2) in complex environments, good quality policies and prices using reasonable amount of computing resources; and 3) overall, hedging strategies that actually outperform the strategies produced using static risk measures when the risk is evaluated at later points of time.

تسعير الأوراق المالية التعلم الآلي

Equal Risk Pricing and Hedging of Financial Derivatives with Convex Risk Measures

187 - Saeed Marzban , Erick Delage , Jonathan Yumeng Li 2020

In this paper, we consider the problem of equal risk pricing and hedging in which the fair price of an option is the price that exposes both sides of the contract to the same level of risk. Focusing for the first time on the context where risk is mea sured according to convex risk measures, we establish that the problem reduces to solving independently the writer and the buyers hedging problem with zero initial capital. By further imposing that the risk measures decompose in a way that satisfies a Markovian property, we provide dynamic programming equations that can be used to solve the hedging problems for both the case of European and American options. All of our results are general enough to accommodate situations where the risk is measured according to a worst-case risk measure as is typically done in robust optimization. Our numerical study illustrates the advantages of equal risk pricing over schemes that only account for a single party, pricing based on quadratic hedging (i.e. $epsilon$-arbitrage pricing), or pricing based on a fixed equivalent martingale measure (i.e. Black-Scholes pricing). In particular, the numerical results confirm that when employing an equal risk price both the writer and the buyer end up being exposed to risks that are more similar and on average smaller than what they would experience with the other approaches.

التحسين والتحكم تسعير الأوراق المالية

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد