Automatic Inverse Treatment Planning for Gamma Knife Radiosurgery via Deep Reinforcement Learning


Abstract in English

Purpose: Several inverse planning algorithms have been developed for Gamma Knife (GK) radiosurgery to determine a large number of plan parameters by solving an optimization problem, which typically consists of multiple objectives. The priorities among these objectives need to be repeatedly adjusted to achieve a clinically good plan for each patient. This study aimed to achieve automatic and intelligent priority tuning by developing a deep reinforcement learning (DRL) based method to model the tuning behaviors of human planners. Methods: We built a priority-tuning policy network using deep convolutional neural networks. Its input was a vector composed of the plan metrics used in our institution for GK plan evaluation. Based on the observed quality of the intermediate plan, the network determines which tuning action to take. We trained the network using an end-to-end DRL framework to approximate the optimal action-value function. A scoring function was designed to measure plan quality. Results: Vestibular schwannoma was chosen as the test bed in this study. The numbers of training, validation, and testing cases were 5, 5, and 16, respectively. For these three datasets, the average plan scores with initial priorities were 3.63 $\pm$ 1.34, 3.83 $\pm$ 0.86, and 4.20 $\pm$ 0.78, respectively, which improved to 5.28 $\pm$ 0.23, 4.97 $\pm$ 0.44, and 5.22 $\pm$ 0.26 through manual priority tuning by human expert planners. Our network achieved competitive results of 5.42 $\pm$ 0.11, 5.10 $\pm$ 0.42, and 5.28 $\pm$ 0.20, respectively. Conclusions: Our network can generate GK plans of comparable or slightly higher quality compared with plans generated by human planners via manual priority tuning. The network can potentially be incorporated into the clinical workflow to improve GK planning efficiency.
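
The tuning loop described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the action set, the scoring function, or the network architecture, so `build_actions`, `apply_action`, `select_action`, and `tune_priorities` are hypothetical names, the "scale one priority up or down" action set is an assumption, and `toy_evaluate` and `toy_q_function` are toy stand-ins for the inverse-planning optimizer and the trained action-value network.

```python
import random


def build_actions(n_priorities, factors=(1.25, 0.8)):
    """Hypothetical action set: scale one priority up or down (an assumption)."""
    return [(i, f) for i in range(n_priorities) for f in factors]


def apply_action(priorities, action):
    """Return a copy of the priorities with one entry rescaled."""
    idx, factor = action
    updated = list(priorities)
    updated[idx] *= factor
    return updated


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over estimated action values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def tune_priorities(priorities, q_function, evaluate_plan, actions,
                    max_steps=20, epsilon=0.0, seed=0):
    """Observe plan metrics, pick a tuning action, re-plan, keep the best plan."""
    rng = random.Random(seed)
    current = list(priorities)
    metrics, score = evaluate_plan(current)
    best, best_score = current, score
    for _ in range(max_steps):
        q_values = q_function(metrics)  # one estimated value per action
        action = actions[select_action(q_values, epsilon, rng)]
        current = apply_action(current, action)
        metrics, score = evaluate_plan(current)
        if score > best_score:
            best, best_score = current, score
    return best, best_score


# Toy stand-ins (assumptions): a real system would run the inverse-planning
# optimizer and compute clinical plan metrics and a plan score here.
TARGET = [1.0, 2.0, 0.5]


def toy_evaluate(priorities):
    """Toy 'plan': metrics are the priorities, score peaks at TARGET."""
    score = -sum((p - t) ** 2 for p, t in zip(priorities, TARGET))
    return list(priorities), score


def toy_q_function(metrics, actions):
    """Toy value estimate: one-step score gain, in place of a trained network."""
    _, base = toy_evaluate(metrics)
    return [toy_evaluate(apply_action(metrics, a))[1] - base for a in actions]
```

With these stand-ins, calling `tune_priorities([1.0, 1.0, 1.0], lambda m: toy_q_function(m, actions), toy_evaluate, actions)` with `actions = build_actions(3)` returns a priority vector whose toy score is higher than that of the starting vector. In a trained system, `q_function` would be the policy network's forward pass and `epsilon` would be nonzero only during training.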
