Intelligent Inverse Treatment Planning via Deep Reinforcement Learning, a Proof-of-Principle Study in High Dose-rate Brachytherapy for Cervical Cancer


Added by Chenyang Shen

Publication date 2018

fields Physics

Research language English

Authors Chenyang Shen - Yesenia Gonzalez - Peter Klages

Medical Physics


Inverse treatment planning in radiation therapy is formulated as an optimization problem. The objective function and constraints consist of multiple terms designed for different clinical and practical considerations, and weighting factors for these terms are needed to define the problem. While a treatment planning system can solve the optimization problem with given weights, adjusting the weights to achieve high plan quality is performed by a human planner. This weight tuning task is labor intensive and time consuming, and it critically affects the final plan quality. An automatic weight-tuning approach is strongly desired. The weight tuning procedure is essentially a decision-making problem. Motivated by the tremendous success of deep learning in decision making with human-level intelligence, we propose a novel framework to tune the weights in a human-like manner. Using treatment planning in high-dose-rate brachytherapy as an example, we develop a weight tuning policy network (WTPN) that observes the dose volume histograms of a plan and outputs an action to adjust organ weights, similar to the behavior of a human planner. We train the WTPN via end-to-end deep reinforcement learning. Experience replay is performed with the epsilon-greedy algorithm. We then apply the trained WTPN to guide treatment planning of testing patient cases. The trained WTPN successfully learns the treatment planning goals to guide the weight tuning process. On average, the quality score of plans generated under the WTPN's guidance is improved by ~8.5% compared to the initial plan with arbitrary weights, and by 10.7% compared to the plans generated by human planners. To our knowledge, this is the first tool to adjust weights for treatment planning in a human-like fashion based on learnt intelligence. The study demonstrates the potential feasibility of developing an intelligent treatment planning system via deep reinforcement learning.
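As a rough illustration of two training ingredients named in the abstract, the Python sketch below shows epsilon-greedy action selection and a small experience replay buffer. All names (epsilon_greedy, ReplayBuffer, the transition tuple layout) are hypothetical and are not taken from the paper; the WTPN itself, its action set, and its reward are not reproduced here.

```python
import random
from collections import deque


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one.

    q_values is a list of estimated action values (hypothetical stand-in for
    the output of a policy network).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Sample without replacement; cap the batch at the current buffer size.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)
```

In a training loop of this kind, each weight-adjustment step would push one transition into the buffer, and network updates would draw random mini-batches from it so that consecutive, highly correlated steps are not learned from in order.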


Interactive Treatment Planning in High Dose-Rate Brachytherapy for Gynecological Cancer

132 - Huan Liu , Chang M Ma , Xun Jia 2021

High dose-rate brachytherapy (HDRBT) is widely used for gynecological cancer treatment. Although commercial treatment planning systems (TPSs) have inverse optimization modules, it takes several iterations of adjusting planning objectives to achieve a satisfactory plan. Interactive plan-modification modules enable modifying the plan and visualizing results in real time, but they update plans based on simple geometrical or heuristic algorithms, which cannot ensure the optimality of the resulting plan. This project develops an interactive plan optimization module for HDRBT of gynecological cancer. By efficiently solving an optimization problem in real time, it allows a user to visualize a plan and interactively modify it to improve quality. We formulated an optimization problem with an objective function containing a weighted sum of doses to normal organs, subject to user-specified target coverage. A user interface was developed that allows a user to adjust organ weights using scroll bars. With a simple mouse click, the optimization problem is solved in seconds with a highly efficient alternating-direction method of multipliers and a warm-start optimization strategy. The resulting clinically relevant D2cc values of organs are displayed immediately. This allows a user to intuitively adjust plans until quality is satisfactory. We tested the effectiveness of our development in cervical cancer cases treated with a tandem-and-ovoid applicator. It took a maximum of 3 seconds to solve the optimization problem in each instance. With the interactive optimization capability, a satisfactory plan can be obtained in <1 min. In our clinic, although the time for plan adjustment was typically <10 min with the simple interactive plan modification tools in the TPS, the resulting plans were not guaranteed to be optimal. Our plans achieved on average 5% lower D2cc than clinical plans, while maintaining target coverage.
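For readers unfamiliar with the quantities mentioned above, the following Python sketch shows one way D2cc and a weighted-sum organ objective could be computed. D2cc is commonly defined as the minimum dose received by the most irradiated 2 cm³ of an organ; the uniform-voxel approximation, the function names, and the choice of dose metric inside the weighted sum are assumptions made for illustration and do not describe the authors' implementation or their ADMM solver.

```python
import math


def d2cc(voxel_doses, voxel_volume_cc, volume_cc=2.0):
    """Minimum dose within the hottest `volume_cc` of an organ.

    Assumes every voxel has the same volume (voxel_volume_cc, in cm^3).
    """
    if voxel_volume_cc <= 0:
        raise ValueError("voxel volume must be positive")
    # Voxels needed to cover the requested volume; the small tolerance
    # guards against floating-point noise in the division.
    n = max(1, math.ceil(volume_cc / voxel_volume_cc - 1e-9))
    if n > len(voxel_doses):
        raise ValueError("organ is smaller than the requested volume")
    hottest = sorted(voxel_doses, reverse=True)[:n]
    return hottest[-1]


def weighted_organ_objective(weights, organ_doses):
    """Weighted sum of per-organ dose metrics, keyed by organ name."""
    return sum(weights[organ] * organ_doses[organ] for organ in weights)
```

Raising the weight of one organ in such an objective makes dose to that organ more costly relative to the others, which is the kind of trade-off the scroll bars in the described interface let a user steer.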

Medical Physics

Automatic Inverse Treatment Planning for Gamma Knife Radiosurgery via Deep Reinforcement Learning

127 - Yingzi Liu , Chenyang Shen , Tonghe Wang 2021

Purpose: Several inverse planning algorithms have been developed for Gamma Knife (GK) radiosurgery to determine a large number of plan parameters by solving an optimization problem, which typically consists of multiple objectives. The priorities among these objectives need to be adjusted repeatedly to achieve a clinically good plan for each patient. This study aimed to achieve automatic and intelligent priority tuning by developing a deep reinforcement learning (DRL) based method to model the tuning behaviors of human planners. Methods: We built a priority-tuning policy network using deep convolutional neural networks. Its input was a vector composed of the plan metrics that were used in our institution for GK plan evaluation. The network can determine which tuning action to take, based on the observed quality of the intermediate plan. We trained the network using an end-to-end DRL framework to approximate the optimal action-value function. A scoring function was designed to measure the plan quality. Results: Vestibular schwannoma was chosen as the test bed in this study. The numbers of training, validation and testing cases were 5, 5, and 16, respectively. For these three datasets, the average plan scores with initial priorities were 3.63 ± 1.34, 3.83 ± 0.86 and 4.20 ± 0.78, respectively, and could be improved to 5.28 ± 0.23, 4.97 ± 0.44 and 5.22 ± 0.26 through manual priority tuning by human expert planners. Our network achieved competitive results of 5.42 ± 0.11, 5.10 ± 0.42 and 5.28 ± 0.20, respectively. Conclusions: Our network can generate GK plans of comparable or slightly higher quality compared with the plans generated by human planners via manual priority tuning. The network can potentially be incorporated into the clinical workflow to improve GK planning efficiency.

Medical Physics

Improving Efficiency of Training a Virtual Treatment Planner Network via Knowledge-guided Deep Reinforcement Learning for Intelligent Automatic Treatment Planning of Radiotherapy

153 - Chenyang Shen , Liyuan Chen , Yesenia Gonzalez 2020

We previously proposed an intelligent automatic treatment planning framework for radiotherapy, in which a virtual treatment planner network (VTPN) was built using deep reinforcement learning (DRL) to operate a treatment planning system (TPS). Despite this success, training the VTPN via DRL was time consuming. The training time is also expected to grow with the complexity of the treatment planning problem, preventing the development of VTPNs for more complicated but clinically relevant scenarios. In this study we proposed a knowledge-guided DRL (KgDRL) that incorporated knowledge from human planners to guide the training process and improve training efficiency. Using prostate cancer intensity modulated radiation therapy as a testbed, we first summarized a number of rules for operating our in-house TPS. In training, in addition to randomly navigating the state-action space, as in DRL using the epsilon-greedy algorithm, we also sampled actions defined by the rules. The priority of sampling actions from the rules decreased over the training process to encourage the VTPN to explore new policies not covered by the rules. We trained a VTPN using KgDRL and compared its performance with another VTPN trained using DRL. The VTPNs trained via KgDRL and DRL both spontaneously learned to operate the TPS to generate high-quality plans, achieving plan quality scores of 8.82 and 8.43, respectively. Both VTPNs outperformed treatment planning based purely on the rules, which had a plan score of 7.81. A VTPN trained with 8 episodes using KgDRL was able to perform similarly to one trained using DRL with 100 episodes. The training time was reduced from more than a week to 13 hours. The proposed KgDRL framework accelerated the training process by incorporating human knowledge, which will facilitate the development of VTPNs for more complicated treatment planning scenarios.
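The action-sampling idea described above can be sketched in Python as follows: with some probability the agent follows a rule-suggested action, and otherwise it falls back to ordinary epsilon-greedy selection. The abstract only states that the priority of rule-based sampling decreased during training; the linear decay, the starting probability of 0.5, and all function names here are assumptions made for illustration.

```python
import random


def rule_probability(episode, total_episodes, p_start=0.5):
    """Probability of following a rule, decayed linearly to zero (assumed schedule)."""
    frac = min(max(episode / total_episodes, 0.0), 1.0)
    return p_start * (1.0 - frac)


def select_action(q_values, rule_action, epsilon, p_rule, rng=random):
    """Choose between a rule-defined action and epsilon-greedy exploration.

    q_values: estimated action values (hypothetical network output).
    rule_action: index of the action suggested by a human-derived rule.
    """
    if rng.random() < p_rule:
        return rule_action  # knowledge-guided step
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # random exploration
    return max(range(len(q_values)), key=lambda a: q_values[a])  # greedy step
```

Early in training, rule_probability is high and the agent mostly replays what a planner would do; as it decays, the agent relies increasingly on its own value estimates and on random exploration.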

Medical Physics

On EM Reconstruction of a Multi Channel Shielded Applicator for Cervical Cancer Brachytherapy: A Feasibility Study

366 - D. Tho , E. Racine 2017

Electromagnetic tracking (EMT) is a promising technology for automated catheter and applicator reconstructions in brachytherapy. In this work, a proof-of-concept is presented for reconstruction of the individual channels of a shielded tandem applicator dedicated to intensity modulated brachytherapy. All six channels of a straight prototype were reconstructed and the distance between two opposite channels was measured. A study was also conducted on the influence of the shield on the data fluctuation of the EMT system. The differences from the CAD-specified dimensions are under 2 mm. Pairs of channels in which one channel is more distant from the generator have a higher inter-channel distance with higher variability. In the first 110 cm reconstruction, all inter-channel distances are within the geometrical tolerances. According to a paired Student t-test, the data given by the EM system with and without the shield applicator tip are not significantly different. This study shows that reconstruction of the channel path within the mechanical accuracy of the applicator is possible.

Medical Physics

Optimization of a multipoint plastic scintillator dosimeter for high dose rate brachytherapy

69 - Haydee M. Linares Rosales , Patricia Duguay-Drouin , Louis Archambault 2018

Purpose: This study aims to optimize and characterize the response of a multipoint plastic scintillator dosimeter (mPSD) for in vivo dosimetry in HDR brachytherapy. Methods: An exhaustive analysis was carried out to obtain an optimized mPSD design that maximizes the collection of scintillation light produced by the interaction of ionizing photons. Several mPSD prototypes were built and tested to determine the appropriate order of scintillators relative to the photodetector, as well as their length as a function of the scintillation light emitted. Scintillators BCF-60, BCF-12 and BCF-10 constituted the mPSD sensitive volume. Each scintillator's contribution to the total spectrum was determined by irradiations in the low energy range. For the best mPSD design, a numerical optimization was done to select the optical components that best match the light emission profile. The optimized dosimetric system was used for HDR brachytherapy dose determination. The system performance was quantified in terms of signal-to-noise ratio and signal-to-background ratio. Results: It was determined that BCF-60 should be placed at the distal position, BCF-12 in the center and BCF-10 at the proximal position with respect to the photodetector. This configuration allowed for optimized light transmission through the collecting fiber, avoiding inter-scintillator excitation and self-absorption effects. The optimized luminescence system allowed for signal deconvolution using a multispectral approach, extracting the dose to each element while taking into account the Cerenkov stem effect. Differences between the mPSD measurements and TG-43 remain below 5%. In all measurement conditions, the system was able to properly differentiate the produced scintillation signal from the background signal. Conclusions: An mPSD was constructed and optimized for HDR brachytherapy dosimetry, enabling real-time dose determination up to 6.5 cm from the 192Ir source.

Medical Physics