We describe the application of supervised machine-learning algorithms to identify the likely multi-wavelength counterparts to submillimeter sources detected in panoramic, single-dish submillimeter surveys. As a training set, we employ a sample of 695 ($S_{\rm 870\mu m}$ >1 mJy) submillimeter galaxies (SMGs) with precise identifications from the ALMA follow-up of the SCUBA-2 Cosmology Legacy Survey's UKIDSS-UDS field (AS2UDS). We show that radio emission, near-/mid-infrared colors, photometric redshift, and absolute $H$-band magnitude are effective predictors that can distinguish SMGs from submillimeter-faint field galaxies. Our combined radio+machine-learning method successfully recovers $\sim$85 percent of ALMA-identified SMGs that are detected in at least three bands from the ultraviolet to radio. We confirm the robustness of our method by dividing our training set into independent subsets and using these for training and testing respectively, as well as by applying our method to an independent sample of $\sim$100 ALMA-identified SMGs from the ALMA/LABOCA ECDF-South Survey (ALESS). To further test our methodology, we stack the 870$\mu$m ALMA maps at the positions of those $K$-band galaxies that are classified as SMG counterparts by the machine-learning method but do not have a $>$4.3$\sigma$ ALMA detection. The median peak flux density of these galaxies is $S_{\rm 870\mu m}=(0.61\pm0.03)$ mJy, demonstrating that our method can recover faint and/or diffuse SMGs even when they are below the detection threshold of our ALMA observations. In future, we will apply this method to samples drawn from panoramic single-dish submillimeter surveys which currently lack interferometric follow-up observations, to address science questions which can only be tackled with large, statistical samples of SMGs.
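The stacking test described above can be illustrated with a minimal sketch. It assumes the map is a plain 2D grid of flux values and that source positions are already given as pixel coordinates. The function names (`median_stack`, `peak_flux`) and the toy data are hypothetical and are not taken from the AS2UDS analysis.

```python
from statistics import median


def median_stack(image, positions, half_size=1):
    """Median-stack square cutouts of `image` centred on (row, col) positions.

    `image` is a list of rows of flux values. Cutouts that would cross the
    map edge are skipped. Returns the stacked cutout and the number of
    cutouts that went into it.
    """
    n_rows, n_cols = len(image), len(image[0])
    size = 2 * half_size + 1
    cutouts = []
    for row, col in positions:
        if (row - half_size < 0 or row + half_size >= n_rows
                or col - half_size < 0 or col + half_size >= n_cols):
            continue  # source too close to the map edge
        cutouts.append([
            image[r][col - half_size: col + half_size + 1]
            for r in range(row - half_size, row + half_size + 1)
        ])
    if not cutouts:
        raise ValueError("no cutouts fit inside the map")
    # Pixel-by-pixel median over all cutouts.
    stacked = [
        [median(c[i][j] for c in cutouts) for j in range(size)]
        for i in range(size)
    ]
    return stacked, len(cutouts)


def peak_flux(stacked):
    """Flux at the central pixel of a stacked cutout."""
    centre = len(stacked) // 2
    return stacked[centre][centre]


# Toy 7x7 map (illustrative values in mJy) with three faint sources.
toy_map = [[0.0] * 7 for _ in range(7)]
toy_map[2][2], toy_map[2][4], toy_map[4][4] = 0.6, 0.5, 0.7
stacked, n_used = median_stack(toy_map, [(2, 2), (2, 4), (4, 4), (0, 0)])
print(n_used, peak_flux(stacked))
```

A median is used here because it is robust to a few bright outliers. A real analysis would also need to handle beam shape, noise weighting, and sub-pixel positions, none of which are modelled in this sketch.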
We identify multi-wavelength counterparts to 1,147 submillimeter sources from the S2COSMOS SCUBA-2 survey of the COSMOS field by employing a recently developed radio$+$machine-learning method trained on a large sample of ALMA-identified submillimeter galaxies (SMGs), including 260 SMGs identified in the AS2COSMOS pilot survey. In total, we identify 1,222 optical/near-infrared (NIR)/radio counterparts to the 897 S2COSMOS submillimeter sources with $S_{850}>1.6$ mJy, yielding an overall identification rate of ($78\pm9$)%. We find that ($22\pm5$)% of S2COSMOS sources have multiple identified counterparts. We estimate that roughly 27% of these multiple counterparts within the same SCUBA-2 error circles very likely arise from physically associated galaxies rather than chance line-of-sight projections. The photometric redshifts of our radio$+$machine-learning identified SMGs range from $z=0.2$ to 5.7, and the distribution peaks at $z=2.3\pm0.1$. The AGN fraction of our sample is ($19\pm4$)%, which is consistent with that of ALMA SMGs in the literature. Compared with the radio/NIR-detected field galaxy population in the COSMOS field, our radio$+$machine-learning identified counterparts of SMGs have the highest star-formation rates and stellar masses. These characteristics suggest that our identified counterparts of S2COSMOS sources are a representative sample of SMGs at $z<3$. We apply our machine-learning technique to the whole COSMOS field and identify 6,877 potential SMGs, most of which are expected to have submillimeter emission fainter than the confusion limit of our S2COSMOS survey ($S_{850}<1.5$ mJy). We study the clustering properties of SMGs based on this statistically large sample, finding that they reside in high-mass dark matter halos ($(1.2\pm0.3)\times10^{13}\,h^{-1}\,\rm M_{\odot}$), which suggests that SMGs may be the progenitors of the massive ellipticals we see in the local Universe.
We investigate the performance of machine learning techniques in classifying active galactic nuclei (AGNs), including X-ray selected AGNs (XAGNs), infrared selected AGNs (IRAGNs), and radio selected AGNs (RAGNs). Using known physical parameters in the Cosmic Evolution Survey (COSMOS) field, we are able to construct well-established training samples in the region of the Hyper Suprime-Cam (HSC) survey. We compare several Python packages (e.g., scikit-learn, Keras, and XGBoost), and use XGBoost to identify AGNs and report its performance (e.g., accuracy, precision, recall, F1 score, and AUROC). Our results indicate that the performance is high for bright XAGN and IRAGN host galaxies. The combination of the HSC (optical) information with the Wide-field Infrared Survey Explorer (WISE) band-1 and band-2 (near-infrared) information performs well in identifying AGN hosts. For both type-1 (broad-line) XAGNs and type-1 (unobscured) IRAGNs, the performance is very good when using optical to infrared information. These results can be applied to the five-band data from the wide regions of the HSC survey, and to future all-sky surveys.
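The performance measures listed above have standard definitions, which the following sketch computes directly from labels and classifier scores. It is written in plain Python rather than with the packages named in the abstract. The helper names and the toy labels and scores are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = AGN, 0 = non-AGN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auroc(y_true, scores):
    """Area under the ROC curve via its rank interpretation: the probability
    that a random positive scores higher than a random negative (ties count
    as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: threshold the scores at 0.5 to obtain hard predictions.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
predictions = [1 if s >= 0.5 else 0 for s in scores]
print(classification_metrics(labels, predictions))
print(auroc(labels, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen score threshold, whereas AUROC summarizes the ranking over all thresholds. Reporting both kinds of metric is therefore informative when the classes are imbalanced.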
We present a search for Herschel-PACS counterparts of dust-obscured, high-redshift objects previously selected at submillimeter and millimeter wavelengths in the Great Observatories Origins Deep Survey North field. We detect 22 of 56 submillimeter galaxies (SMGs; 39%) with an SNR of >=3 at 100 micron down to 3.0 mJy, and/or at 160 micron down to 5.7 mJy. The fraction of SMGs seen at 160 micron is higher than that at 100 micron. About 50% of radio-identified SMGs are associated with PACS sources. We find a trend between the SCUBA/PACS flux ratio and redshift, suggesting that these flux ratios could be used as a coarse redshift indicator. PACS-undetected submm/mm-selected sources tend to lie at higher redshifts than the PACS-detected ones. A total of 12 sources (21% of our SMG sample) remain unidentified, and the fact that they are blank fields at Herschel-PACS and VLA 20 cm wavelengths may imply that they lie at higher redshifts than the average SMG population (e.g., z>3-4). The Herschel-PACS imaging of these dust-obscured starbursts at high redshift suggests that the shapes of their far-infrared spectral energy distributions differ significantly from those in template libraries of local infrared galaxies.
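One way to see why a submillimeter-to-PACS flux ratio should trend with redshift is to redshift a simple dust spectrum through two fixed observed bands. The sketch below assumes an optically thin modified blackbody. The dust temperature (35 K), emissivity index (1.5), and band choice (850 and 160 micron) are illustrative assumptions rather than values from the paper, and the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck constant, J s
K_BOLTZ = 1.380649e-23     # Boltzmann constant, J / K
C_LIGHT = 2.99792458e8     # speed of light, m / s


def greybody_flux(obs_wavelength_um, z, t_dust=35.0, beta=1.5):
    """Relative observed flux density of an optically thin modified blackbody.

    Only the spectral shape matters for a flux ratio, so the normalisation
    (dust mass, distance) is dropped. t_dust and beta are assumed values.
    """
    nu_rest = C_LIGHT / (obs_wavelength_um * 1e-6) * (1.0 + z)
    x = H_PLANCK * nu_rest / (K_BOLTZ * t_dust)
    return nu_rest ** (3.0 + beta) / math.expm1(x)


def flux_ratio(z, long_um=850.0, short_um=160.0, t_dust=35.0, beta=1.5):
    """Ratio of long- to short-wavelength flux density at redshift z."""
    return (greybody_flux(long_um, z, t_dust, beta)
            / greybody_flux(short_um, z, t_dust, beta))


for z in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(z, flux_ratio(z))
```

In this toy model the ratio rises steadily with redshift, because the 160 micron band moves onto the steep short-wavelength side of the dust peak. The ratio depends on redshift only through $(1+z)/T_{\rm dust}$, so a hotter source at higher redshift mimics a cooler one at lower redshift. This is one reason such an indicator can only be coarse.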
Large photometric surveys provide a rich source of observations of quiescent galaxies, including a surprisingly large population at z>1. However, identifying large, but clean, samples of quiescent galaxies has proven difficult because of their near-degeneracy with interlopers such as dusty, star-forming galaxies. We describe a new technique for selecting quiescent galaxies based upon t-distributed stochastic neighbor embedding (t-SNE), an unsupervised machine learning algorithm for dimensionality reduction. This t-SNE selection improves both on UVJ selection, by removing interlopers that would otherwise pass the color cut, and on photometric template fitting, with the gain increasing towards high redshift. Due to the similarity between the colors of high- and low-redshift quiescent galaxies, under our assumptions t-SNE outperforms template fitting in 63% of trials at redshifts where a large training sample already exists. It may also be able to select quiescent galaxies more efficiently at redshifts higher than those of the training sample.
We present a multi-wavelength analysis of 48 submillimeter galaxies (SMGs) detected in the LABOCA/ACT Survey of Clusters at All Redshifts (LASCAR), which acquired new 870 $\mu$m and ATCA 2.1 GHz observations of ten galaxy clusters detected through their Sunyaev-Zeldovich effect (SZE) signal by the Atacama Cosmology Telescope. Far-infrared observations were also conducted with the PACS (100/160 $\mu$m) and SPIRE (250/350/500 $\mu$m) instruments on $Herschel$ for subsets of five and six clusters, respectively. LASCAR 870 $\mu$m maps were reduced using a multi-scale iterative pipeline that removes the SZE increment signal, yielding point-source sensitivities of $\sigma\sim2\rm{\,mJy\,beam}^{-1}$. We detect a total of 49 sources at the $4\sigma$ level, and conduct a detailed multi-wavelength analysis considering our new radio and far-IR observations plus existing near-IR and optical data. One source is identified as a foreground galaxy, 28 SMGs are matched to single radio sources, 4 have double radio counterparts, and 16 are undetected at 2.1 GHz but in some cases tentatively associated with near-IR/optical sources. We estimate photometric redshifts for 34 sources with secure (25) and tentative (9) matches at different wavelengths, obtaining a median $z=2.8^{+2.1}_{-1.7}$. Compared to previous results for single-dish surveys, our redshift distribution has a larger fraction of sources at $z>3$, and the high-redshift tail is more extended. This is consistent with millimeter spectroscopic confirmation of a growing number of high-$z$ SMGs and is relevant for testing cosmological models. Analytical lens modeling is applied to estimate magnification factors for 42 SMGs at cluster-centric radii $>1.2$; with the demagnified flux densities and source-plane areas, we obtain integral number counts that agree with previous submillimeter surveys.
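The final step above, going from magnification factors to integral number counts, can be sketched as follows. The sketch assumes that each source contributes the inverse of the source-plane area over which it could have been detected. That weighting scheme, the function name, and the toy numbers are assumptions for illustration and are not taken from the LASCAR analysis.

```python
def cumulative_counts(observed_fluxes, magnifications, source_plane_areas,
                      thresholds):
    """Integral number counts N(>S) from lensed sources.

    Each observed flux density is divided by its magnification to get the
    intrinsic (demagnified) value. Each source then adds 1 / area to every
    threshold it exceeds, where `area` is its effective source-plane area.
    Returns the intrinsic flux densities and one count per threshold.
    """
    if not (len(observed_fluxes) == len(magnifications)
            == len(source_plane_areas)):
        raise ValueError("per-source inputs must have the same length")
    if any(mu <= 0 for mu in magnifications):
        raise ValueError("magnifications must be positive")
    intrinsic = [s / mu for s, mu in zip(observed_fluxes, magnifications)]
    counts = [
        sum(1.0 / area
            for s, area in zip(intrinsic, source_plane_areas)
            if s > s_min)
        for s_min in thresholds
    ]
    return intrinsic, counts


# Toy inputs: flux densities in mJy, areas in square degrees.
intrinsic, counts = cumulative_counts(
    observed_fluxes=[10.0, 8.0, 6.0],
    magnifications=[2.0, 1.0, 1.5],
    source_plane_areas=[0.5, 0.5, 0.5],
    thresholds=[3.0, 4.5, 6.0],
)
print(intrinsic, counts)
```

Lensing both brightens sources and shrinks the area surveyed in the source plane, so the flux densities and the areas have to be corrected together. Completeness and flux-boosting corrections, which a real number-count analysis would need, are omitted here.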