Large modern surveys require efficient review of data in order to find transient sources such as supernovae, and to distinguish such sources from artefacts and noise. Much effort has been put into the development of automatic algorithms, but surveys still rely on human review of targets. This paper presents an integrated system for the identification of supernovae in data from Pan-STARRS1, combining classifications from volunteers participating in a citizen science project with those from a convolutional neural network. The unique aspect of this work is the deployment, in combination, of both human and machine classifications for near real-time discovery in an astronomical project. We show that the combination of the two methods outperforms either one used individually. This result has important implications for the future development of transient searches, especially in the era of LSST and other large-throughput surveys.
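As a rough illustration of combining the two kinds of classification, the sketch below blends a human score and a machine score per candidate and ranks the results. The weighted average, the function names, the `human_weight` parameter, and the example values are all assumptions made for illustration; the abstract does not state how the two scores are actually combined.

```python
def combine_scores(human_score, machine_score, human_weight=0.5):
    """Blend two scores in [0, 1] with a weighted average (illustrative rule only)."""
    for name, value in (("human_score", human_score),
                        ("machine_score", machine_score),
                        ("human_weight", human_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return human_weight * human_score + (1.0 - human_weight) * machine_score


def rank_candidates(candidates, human_weight=0.5):
    """Sort (candidate_id, human_score, machine_score) tuples, best combined score first."""
    scored = [(cid, combine_scores(h, m, human_weight)) for cid, h, m in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up candidates: (id, fraction of volunteer votes for "real", CNN score).
candidates = [
    ("cand-a", 0.9, 0.2),
    ("cand-b", 0.6, 0.7),
    ("cand-c", 0.1, 0.3),
]
ranking = rank_candidates(candidates)
```

With equal weights, a candidate that both classifiers rate moderately well can outrank one that only a single classifier favours, which is one simple way a combined score can differ from either input.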
This paper presents our work on SNaCK, a low-dimensional concept embedding algorithm that combines human expertise with automatic machine similarity kernels. Both parts are complementary: human insight can capture relationships that are not apparent from the objects' visual similarity, and the machine can help relieve the human from having to exhaustively specify many constraints. We show that our SNaCK embeddings are useful in several tasks: distinguishing prime and nonprime numbers on MNIST, discovering labeling mistakes in the Caltech UCSD Birds (CUB) dataset with the help of deep-learned features, creating training datasets for bird classifiers, capturing subjective human taste on a new dataset of 10,000 foods, and qualitatively exploring an unstructured set of pictographic characters. Comparisons with the state-of-the-art in these tasks show that SNaCK produces better concept embeddings that require less human supervision than the leading methods.
We present VOLKS2, the second release of the VLBI Observation for transient Localization Keen Searcher. The pipeline is designed for transient searches in regular VLBI observations as well as for the detection of single pulses from known sources in dedicated VLBI observations. The underlying method borrows from geodetic VLBI data processing, including fringe fitting to maximize the signal power and geodetic VLBI solving for localization. By filtering the candidate signals with multiple windows within a baseline and by cross-matching across multiple baselines, RFI is eliminated effectively. Unlike the station auto-spectrum-based method, RFI flagging is not required in the VOLKS2 pipeline. An EVN observation (EL060) was carried out to verify the pipeline's detection efficiency and localization accuracy across the whole FoV. The pipeline is parallelized with MPI and further accelerated with GPUs to exploit the hardware resources of modern GPU clusters. We show that, with proper optimization, VOLKS2 can achieve performance comparable to that of auto-spectrum-based pipelines. All the code and documents are publicly available, in the hope that our pipeline is useful for radio transient studies.
We present a comprehensive study of the effectiveness of Convolutional Neural Networks (CNNs) to detect long-duration transient gravitational-wave signals lasting $O(\mathrm{hours{-}days})$ from isolated neutron stars. We determine that CNNs are robust towards signal morphologies that differ from the training set, and they do not require many training injections/data to guarantee good detection efficiency and low false alarm probability. In fact, we only need to train one CNN on signal/noise maps in a single 150 Hz band; afterwards, the CNN can distinguish signals/noise well in any band, though with different efficiencies and false alarm probabilities due to the non-stationary noise in LIGO/Virgo. We demonstrate that we can control the false alarm probability for the CNNs by selecting the optimal threshold on the outputs of the CNN, which appears to be frequency dependent. Finally, we compare the detection efficiencies of the networks to a well-established algorithm, the Generalized FrequencyHough (GFH), which maps curves in the time/frequency plane to lines in a plane that relates to the initial frequency/spindown of the source. The networks have similar sensitivities to the GFH but are orders of magnitude faster to run and can detect signals to which the GFH is blind. Using the results of our analysis, we propose strategies to apply CNNs to a real search using LIGO/Virgo data to overcome the obstacles that we would encounter, such as a finite amount of training data. We then use our networks and strategies to run a real search for a remnant of GW170817, making this the first time that a machine learning method has been applied to search for a gravitational-wave signal from an isolated neutron star.
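The idea of controlling the false alarm probability through a threshold on the network output can be sketched as follows. This is a minimal illustration, assuming that classifier scores on noise-only data are available for each frequency band; the function names, the band labels, and the toy scores are hypothetical and are not taken from the paper.

```python
import math


def threshold_for_fap(noise_scores, target_fap):
    """Return a threshold such that the fraction of noise-only scores
    strictly above it does not exceed target_fap."""
    if not 0.0 < target_fap < 1.0:
        raise ValueError("target_fap must lie strictly between 0 and 1")
    ordered = sorted(noise_scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one noise-only score")
    # Number of noise samples that are allowed to exceed the threshold.
    n_allowed = math.floor(target_fap * n)
    # Choosing this order statistic leaves at most n_allowed scores above it.
    return ordered[n - n_allowed - 1]


def empirical_fap(noise_scores, threshold):
    """Fraction of noise-only scores strictly above the threshold."""
    scores = list(noise_scores)
    return sum(1 for s in scores if s > threshold) / len(scores)


# Toy noise-only scores for two hypothetical frequency bands.
noise_by_band = {
    "band_low": [i / 100 for i in range(100)],
    "band_high": [(i / 100) ** 2 for i in range(100)],
}
# One threshold per band, since the suitable value may depend on frequency.
thresholds = {band: threshold_for_fap(scores, 0.1)
              for band, scores in noise_by_band.items()}
```

Computing one threshold per band reflects the statement above that the optimal threshold appears to be frequency dependent: the same target false alarm probability maps to different score cuts when the noise-only score distributions differ.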
We have investigated a number of factors that can have significant impacts on the classification performance of $\gamma$-ray sources detected by the Fermi Large Area Telescope (LAT) with machine learning techniques. First, we show that a framework of automatic feature selection can construct a simple model with a small set of features which yields better performance than previous results. Secondly, because of the small sample size of the training/test sets of certain classes of $\gamma$-ray sources, nested re-sampling and cross-validation are suggested for quantifying the statistical fluctuations of the quoted accuracy. We have also constructed a test set by cross-matching the identified active galactic nuclei (AGNs) and pulsars (PSRs) in the Fermi LAT eight-year point source catalog (4FGL) with the unidentified sources in the previous 3$^{\rm rd}$ Fermi LAT Source Catalog (3FGL). Using this cross-matched set, we show that some features used for building classification models with the identified sources can suffer from the problem of covariate shift, which can be a result of various observational effects. This can hamper the actual performance when one applies such a model to classify unidentified sources. Using our framework, both the AGN/PSR and the young pulsar (YNG)/millisecond pulsar (MSP) classifiers are automatically updated with the new features and the enlarged training samples in the 4FGL catalog incorporated. Using a two-layer model with these updated classifiers, we have selected 20 promising MSP candidates with confidence scores $>98\%$ from the unidentified sources in the 4FGL catalog, which can provide inputs for a multi-wavelength identification campaign.
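Nested cross-validation, as suggested above for small samples, can be sketched in plain Python: an inner loop selects a hyperparameter using only the outer training data, and an outer loop scores that choice on held-out data, so the spread of the outer scores gives a rough measure of the statistical fluctuation of the accuracy. The one-dimensional k-nearest-neighbour classifier, the fold counts, and all names below are stand-ins chosen for illustration and are not the classifiers or features used in the paper.

```python
import random
from statistics import mean, stdev


def knn_predict(train_x, train_y, x, k):
    """Majority vote (labels 0/1) among the k nearest training points in 1D."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def knn_accuracy(xs, ys, train_idx, test_idx, k):
    """Accuracy on test_idx of a k-NN model built from train_idx."""
    tx, ty = [xs[i] for i in train_idx], [ys[i] for i in train_idx]
    return mean(1.0 if knn_predict(tx, ty, xs[i], k) == ys[i] else 0.0
                for i in test_idx)


def split_folds(indices, n_folds, rng):
    """Shuffle the indices and deal them into n_folds disjoint folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def cv_accuracy(xs, ys, folds, k):
    """Mean accuracy over folds, each held out once."""
    scores = []
    for held_out in folds:
        train = [i for fold in folds if fold is not held_out for i in fold]
        scores.append(knn_accuracy(xs, ys, train, held_out, k))
    return mean(scores)


def nested_cv(xs, ys, k_grid=(1, 3, 5), outer_folds=5, inner_folds=3, seed=0):
    """Return (mean, standard deviation) of the outer-fold accuracies."""
    rng = random.Random(seed)
    outer = split_folds(range(len(xs)), outer_folds, rng)
    outer_scores = []
    for test_idx in outer:
        train_idx = [i for fold in outer if fold is not test_idx for i in fold]
        # Inner loop: choose k using the outer training data only.
        inner = split_folds(train_idx, inner_folds, rng)
        best_k = max(k_grid, key=lambda k: cv_accuracy(xs, ys, inner, k))
        # Outer loop: score the chosen k on data never seen during selection.
        outer_scores.append(knn_accuracy(xs, ys, train_idx, test_idx, best_k))
    return mean(outer_scores), stdev(outer_scores)


# Toy, well-separated two-class data (made up for the example).
xs = [i * 0.01 for i in range(20)] + [10 + i * 0.01 for i in range(20)]
ys = [0] * 20 + [1] * 20
acc_mean, acc_sd = nested_cv(xs, ys, k_grid=(1,))
```

Because the hyperparameter is never tuned on the outer test fold, the outer accuracies are not biased by the selection step, and their standard deviation can be quoted alongside the mean.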
As the sensitivity and observing time of gravitational-wave detectors increase, a more diverse range of signals is expected to be observed from a variety of sources. In particular, long-lived gravitational-wave transients have received interest in the last decade. Because most long-duration signals are poorly modeled, detection must rely on generic search algorithms, which make few or no assumptions about the nature of the signal. However, the computational cost of those searches remains a limiting factor, which leads to sub-optimal sensitivity. Several detection algorithms have been developed to cope with this issue. In this paper, we present a new data analysis pipeline to search for un-modeled long-lived transient gravitational-wave signals with durations between 10 and 1000 s, based on an excess cross-power statistic in a network of detectors. The pipeline implements several new features that are intended to reduce computational cost and increase detection sensitivity for a wide range of signal morphologies. The method is generalized to a network of an arbitrary number of detectors and aims to provide a stable interface for further improvements. Comparisons with a previous implementation of a similar method on simulated and real gravitational-wave data show an overall increase in detection efficiency, depending on the signal morphology, and a computing time reduced by at least a factor of 10.