ترغب بنشر مسار تعليمي؟ اضغط هنا

Deep reinforcement learning for smart calibration of radio telescopes

306   0   0.0 ( 0 )
 نشر من قبل Sarod Yatawatta
 تاريخ النشر 2021
والبحث باللغة English




اسأل ChatGPT حول البحث

Modern radio telescopes produce unprecedented amounts of data, which are passed through many processing pipelines before the delivery of scientific results. Hyperparameters of these pipelines need to be tuned by hand to produce optimal results. Because many thousands of observations are taken during a lifetime of a telescope and because each observation will have its unique settings, the fine tuning of pipelines is a tedious task. In order to automate this process of hyperparameter selection in data calibration pipelines, we introduce the use of reinforcement learning. We test two reinforcement learning techniques, twin delayed deep deterministic policy gradient (TD3) and soft actor-critic (SAC), to train an autonomous agent to perform this fine tuning. For the sake of generalization, we consider the pipeline to be a black-box system where the summarized state of the performance of the pipeline is used by the autonomous agent. The autonomous agent trained in this manner is able to determine optimal settings for diverse observations and is therefore able to perform smart calibration, minimizing the need for human intervention.



قيم البحث

اقرأ أيضاً

As a part of NASAs Heliophysics System Observatory (HSO) fleet of satellites,the Solar Dynamics Observatory (SDO) has continuously monitored the Sun since2010. Ultraviolet (UV) and Extreme UV (EUV) instruments in orbit, such asSDOs Atmospheric Imagin g Assembly (AIA) instrument, suffer time-dependent degradation which reduces instrument sensitivity. Accurate calibration for (E)UV instruments currently depends on periodic sounding rockets, which are infrequent and not practical for heliophysics missions in deep space. In the present work, we develop a Convolutional Neural Network (CNN) that auto-calibrates SDO/AIA channels and corrects sensitivity degradation by exploiting spatial patterns in multi-wavelength observations to arrive at a self-calibration of (E)UV imaging instruments. Our results remove a major impediment to developing future HSOmissions of the same scientific caliber as SDO but in deep space, able to observe the Sun from more vantage points than just SDOs current geosynchronous orbit.This approach can be adopted to perform autocalibration of other imaging systems exhibiting similar forms of degradation
At high latitudes, many cities adopt a centralized heating system to improve the energy generation efficiency and to reduce pollution. In multi-tier systems, so-called district heating, there are a few efficient approaches for the flow rate control d uring the heating process. In this paper, we describe the theoretical methods to solve this problem by deep reinforcement learning and propose a cloud-based heating control system for implementation. A real-world case study shows the effectiveness and practicability of the proposed system controlled by humans, and the simulated experiments for deep reinforcement learning show about 1985.01 gigajoules of heat quantity and 42276.45 tons of water are saved per hour compared with manual control.
This paper is concerned with algorithms for calibration of direction dependent effects (DDE) in aperture synthesis radio telescopes (ASRT). After correction of Direction Independent Effects (DIE) using self-calibration, imaging performance can be lim ited by the imprecise knowledge of the forward gain of the elements in the array. In general, the forward gain pattern is directionally dependent and varies with time due to a number of reasons. Some factors, such as rotation of the primary beam with Parallactic Angle for Azimuth-Elevation mount antennas are known a priori. Some, such as antenna pointing errors and structural deformation/projection effects for aperture-array elements cannot be measured {em a priori}. Thus, in addition to algorithms to correct for DD effects known a priori, algorithms to solve for DD gains are required for high dynamic range imaging. Here, we discuss a mathematical framework for antenna-based DDE calibration algorithms and show that this framework leads to computationally efficient optimal algorithms which scale well in a parallel computing environment. As an example of an antenna-based DD calibration algorithm, we demonstrate the Pointing SelfCal algorithm to solve for the antenna pointing errors. Our analysis show that the sensitivity of modern ASRT is sufficient to solve for antenna pointing errors and other DD effects. We also discuss the use of the Pointing SelfCal algorithm in real-time calibration systems and extensions for antenna Shape SelfCal algorithm for real-time tracking and corrections for pointing offsets and changes in antenna shape.
Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components -- a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state and the reward predictor maps states to scalar rewards. The value function of a state can be computed as the inner product between the successor map and the reward weights. In this paper, we present DSR, which generalizes SR within an end-to-end deep reinforcement learning framework. DSR has several appealing properties including: increased sensitivity to distal reward changes due to factorization of reward and world dynamics, and the ability to extract bottleneck states (subgoals) given successor maps trained under a random policy. We show the efficacy of our approach on two diverse environments given raw pixel observations -- simple grid-world domains (MazeBase) and the Doom game engine.
Dealing with uncertainty is essential for efficient reinforcement learning. There is a growing literature on uncertainty estimation for deep learning from fixed datasets, but many of the most popular approaches are poorly-suited to sequential decisio n problems. Other methods, such as bootstrap sampling, have no mechanism for uncertainty that does not come from the observed data. We highlight why this can be a crucial shortcoming and propose a simple remedy through addition of a randomized untrainable `prior network to each ensemble member. We prove that this approach is efficient with linear representations, provide simple illustrations of its efficacy with nonlinear representations and show that this approach scales to large-scale problems far better than previous attempts.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا