With the arrival of a number of wide-field snapshot image-plane radio transient surveys, there will be a huge influx of images in the coming years, making it impossible to analyse the datasets manually. Automated pipelines to process the information stored in the images are being developed, such as the LOFAR Transients Pipeline, outputting light curves and various transient parameters. These pipelines have a number of tuneable parameters that require training to meet the survey requirements. This paper utilises both observed and simulated datasets to demonstrate different machine learning strategies that can be used to train these parameters. The datasets used are from LOFAR observations and we process the data using the LOFAR Transients Pipeline; however, the strategies developed are applicable to any light curve datasets at different frequencies and can be adapted to different automated pipelines. These machine learning strategies are publicly available as Python tools that can be downloaded and adapted to different datasets (https://github.com/AntoniaR/TraP_ML_tools).
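To make the training strategy above concrete, the following minimal sketch (not the TraP_ML_tools code itself) shows how light-curve variability features output by a pipeline such as the Transients Pipeline could be used to train a classifier whose decision boundary effectively sets the survey thresholds; the feature names (eta and V), the synthetic data and the random-forest choice are illustrative assumptions.

```python
# Minimal sketch (assumptions: synthetic eta/V features, random-forest classifier),
# not the TraP_ML_tools implementation itself.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(42)

# Synthetic light-curve features: log10(eta), the reduced chi-square of a
# constant-flux fit, and log10(V), the modulation index.
n = 2000
stable = np.column_stack([rng.normal(0.0, 0.3, n), rng.normal(-1.3, 0.2, n)])
variable = np.column_stack([rng.normal(1.5, 0.5, n), rng.normal(-0.5, 0.3, n)])
X = np.vstack([stable, variable])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = stable, 1 = variable

# Train on a subset and report completeness/precision on held-out data,
# mimicking how pipeline thresholds could be tuned against a labelled sample.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```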
PySE is a Python software package for finding and measuring sources in radio telescope images. The software was designed to detect sources in LOFAR telescope images, but can be used with images from other radio telescopes as well. We introduce the LOFAR telescope as the context within which PySE was developed, outline the design of PySE, and describe how it is used. Detailed experiments on the validation and testing of PySE are then presented, along with results of performance testing. We discuss some of the current issues with the algorithms implemented in PySE and their interaction with LOFAR images, concluding with the current status of PySE and its future development.
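As a rough illustration of the kind of thresholded island detection a source finder performs, the sketch below labels pixels above an n-sigma cut and reports the peak and centroid of each island; it is a generic example, not PySE's actual API, and the toy image, global noise estimate and threshold are assumptions.

```python
# Generic island detection above an n-sigma threshold (not PySE's API).
import numpy as np
from scipy import ndimage

def find_sources(image, n_sigma=5.0):
    """Return (peak value, y, x) for each island of pixels above n_sigma * rms."""
    rms = np.std(image)                       # crude global noise estimate
    labels, n_islands = ndimage.label(image > n_sigma * rms)
    index = np.arange(1, n_islands + 1)
    peaks = ndimage.maximum(image, labels, index)
    centroids = ndimage.center_of_mass(image, labels, index)
    return [(p, y, x) for p, (y, x) in zip(peaks, centroids)]

# Toy image: Gaussian noise plus one bright point source.
rng = np.random.default_rng(1)
img = rng.normal(0.0, 1.0, (256, 256))
img[100, 150] += 50.0
print(find_sources(img))
```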
The CLEAN algorithm, widely used in radio interferometry for the deconvolution of radio images, performs well only if the raw radio image (dirty image) is, to good approximation, a simple convolution between the instrumental point-spread function (dirty beam) and the true distribution of emission across the sky. An important case in which this approximation breaks down is during frequency synthesis if the observing bandwidth is wide enough for variations in the spectrum of the sky to become significant. The convolution assumption also breaks down, in any situation but snapshot observations, if sources in the field vary significantly in flux density over the duration of the observation. Such time-variation can even be instrumental in nature, for example due to jitter or rotation of the primary beam pattern on the sky during an observation. An algorithm already exists for dealing with the spectral variation encountered in wide-band frequency synthesis interferometry. This algorithm is an extension of CLEAN in which, at each iteration, a set of N 'dirty beams' is fitted and subtracted in parallel, instead of just a single dirty beam as in standard CLEAN. In the wide-band algorithm the beams are obtained by expanding a nominal source spectrum in a Taylor series, each term of the series generating one of the beams. In the present paper this algorithm is extended to images containing sources that vary in both frequency and time. Different expansion schemes (or bases) on the time and frequency axes are compared, and issues such as Gibbs ringing and non-orthogonality are discussed. It is shown that practical considerations often make it desirable to orthogonalize the set of beams before commencing the cleaning. This is easily accomplished via a Gram-Schmidt technique.
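The orthogonalization step mentioned at the end can be illustrated with a short numerical sketch: classical Gram-Schmidt applied to a small set of toy dirty beams (a Gaussian envelope modulated by Taylor-like polynomial terms). The beam shapes and image size are assumptions for illustration only, not the beams of any real instrument.

```python
# Classical Gram-Schmidt orthonormalization of a set of toy "dirty beams".
import numpy as np

def gram_schmidt(beams):
    """Orthonormalize a list of 2-D beams with classical Gram-Schmidt."""
    ortho = []
    for b in beams:
        v = b.astype(float).copy()
        for q in ortho:
            v -= np.sum(v * q) * q            # remove the component along q
        ortho.append(v / np.sqrt(np.sum(v * v)))
    return ortho

# Toy beams: a Gaussian envelope modulated by powers of a spectral-like
# coordinate, mimicking Taylor-term beams (illustrative only).
x = np.linspace(-1.0, 1.0, 129)
X, Y = np.meshgrid(x, x)
envelope = np.exp(-(X**2 + Y**2) / 0.02)
beams = [envelope * X**k for k in range(3)]

ortho_beams = gram_schmidt(beams)
# The Gram matrix of the orthonormalized beams should be close to the identity.
gram = np.array([[np.sum(a * b) for b in ortho_beams] for a in ortho_beams])
print(np.round(gram, 6))
```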
We describe a 22-year survey for variable and transient radio sources, performed with archival images taken with the Molonglo Observatory Synthesis Telescope (MOST). This survey covers $2775~\mathrm{deg^2}$ of the sky south of declination $-30^\circ$ at an observing frequency of 843 MHz, an angular resolution of $45 \times 45\csc|\delta|~\mathrm{arcsec^2}$ and a sensitivity of $5\sigma \geq 14~\mathrm{mJy\,beam^{-1}}$. We describe a technique to compensate for image gain error, along with statistical techniques to check and classify variability in a population of light curves, with applicability to any image-based radio variability survey. Among radio light curves for almost 30000 sources, we present 53 highly variable sources and 15 transient sources. Only 3 of the transient sources, and none of the variable sources, have been previously identified as transient or variable. Many of our variable sources are suspected scintillating active galactic nuclei. We have identified three variable sources and one transient source that are likely to be associated with star-forming galaxies at $z \simeq 0.05$, but whose implied luminosity is higher than that of the most luminous known radio supernova (SN 1979C) by an order of magnitude. We also find a class of variable and transient sources with no optical counterparts.
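Two statistics commonly used to check and classify variability in populations of light curves are the modulation index V and the weighted reduced chi-square eta of a constant-flux model; the sketch below computes both for a toy light curve. The definitions and example fluxes are assumptions chosen for illustration, not necessarily the exact statistics adopted by this survey.

```python
# Modulation index V and weighted reduced chi-square eta for a light curve.
import numpy as np

def variability_statistics(flux, flux_err):
    """Return (V, eta) for a light curve of flux measurements with errors."""
    flux = np.asarray(flux, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)
    V = np.std(flux, ddof=1) / np.mean(flux)           # modulation index
    w = 1.0 / flux_err**2                               # inverse-variance weights
    weighted_mean = np.sum(w * flux) / np.sum(w)
    eta = np.sum(w * (flux - weighted_mean) ** 2) / (flux.size - 1)
    return V, eta

# Toy light curve: a steady ~20 mJy source with 2 mJy errors and one flare.
flux = [20.1, 19.8, 20.3, 35.0, 20.0, 19.9]
flux_err = [2.0] * len(flux)
print(variability_statistics(flux, flux_err))
```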
The Zwicky Transient Facility is a new robotic observing program, in which a newly engineered 600-MP digital camera with a pioneeringly large field of view, 47~square degrees, will be installed into the 48-inch Samuel Oschin Telescope at the Palomar Observatory. The camera will generate $\sim 1$~petabyte of raw image data over three years of operations. In parallel related work, new hardware and software systems are being developed to process these data in real time and build a long-term archive for the processed products. The first public release of archived products is planned for early 2019, which will include processed images and astronomical-source catalogs of the northern sky in the $g$ and $r$ bands. Source catalogs based on two different methods will be generated for the archive: aperture photometry and point-spread-function fitting.
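As a minimal illustration of the first of the two catalogue methods, the sketch below performs simple circular-aperture photometry with a median-sky annulus; the toy image, source position, aperture radius and annulus are assumptions and this is not the ZTF pipeline code.

```python
# Simple circular-aperture photometry with a median sky annulus (illustrative).
import numpy as np

def aperture_flux(image, x0, y0, radius, sky_annulus=(8.0, 12.0)):
    """Sum background-subtracted counts inside a circular aperture."""
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0)
    sky = np.median(image[(r >= sky_annulus[0]) & (r < sky_annulus[1])])
    return np.sum(image[r <= radius] - sky)

# Toy image: a flat 100-count sky background plus a 500-count point source.
img = np.full((64, 64), 100.0)
img[32, 32] += 500.0
print(aperture_flux(img, x0=32, y0=32, radius=5.0))
```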
We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS) supernova survey into real objects and artefacts. This is a first step in any transient science pipeline and is currently still done by humans, but future surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (principal component analysis, PCA) of single-epoch g, r and i difference images, we can reach a completeness (recall) of 96 per cent, while only incorrectly classifying at most 18 per cent of artefacts as real objects, corresponding to a precision (purity) of 84 per cent. In general, random forests performed best, followed by the k-nearest neighbour and the SkyNet artificial neural net algorithms, compared to other methods such as naive Bayes and kernel support vector machines. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information, which should allow for significant further improvements, especially at low signal-to-noise.
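The PCA-plus-random-forest approach described above can be sketched as follows, using synthetic "difference-image" cutouts in place of the SDSS data; the cutout size, sample sizes, number of principal components and hyperparameters are all assumptions for illustration.

```python
# PCA (eigenimage) features followed by a random forest on synthetic cutouts.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)

def make_cutouts(n, real):
    """Synthetic 21x21 cutouts: 'real' objects get a central blob, artefacts are noise."""
    cutouts = rng.normal(0.0, 1.0, (n, 21, 21))
    if real:
        yy, xx = np.indices((21, 21))
        cutouts += 5.0 * np.exp(-((xx - 10) ** 2 + (yy - 10) ** 2) / 8.0)
    return cutouts.reshape(n, -1)

X = np.vstack([make_cutouts(1000, True), make_cutouts(1000, False)])
y = np.concatenate([np.ones(1000), np.zeros(1000)])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Project the cutouts onto their leading principal components (eigenimages),
# then classify with a random forest and report recall (completeness) and
# precision (purity) on the held-out set.
pca = PCA(n_components=20).fit(X_train)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(pca.transform(X_train), y_train)
pred = clf.predict(pca.transform(X_test))
print("recall:", recall_score(y_test, pred), "precision:", precision_score(y_test, pred))
```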