Because exoplanets are extremely dim, an Electron Multiplying Charge-Coupled Device (EMCCD) operating in photon counting (PC) mode is necessary to reduce the detector noise level and enable their detection. Typically, PC images are summed into a co-added image before processing. We present here a signal detection and estimation technique that works directly with individual PC images. The method is based on the generalized likelihood ratio test (GLRT) and uses a Bernoulli distribution to model the individual PC images. The Bernoulli distribution is derived from a stochastic model for the detector, which accurately represents its noise characteristics. We show that our technique outperforms a previously used GLRT method that relies on co-added images under a Gaussian noise assumption, as well as two detection algorithms based on signal-to-noise ratio (SNR). Furthermore, our method provides the maximum likelihood estimates of exoplanet intensity and background intensity as part of detection. It can be applied online, so observations can be stopped once a specified threshold is reached, providing confidence in the existence (or absence) of planets. As a result, the observation time is used efficiently. Besides the observation time, the analysis of detection performance introduced in the paper also gives quantitative guidance on the choice of imaging parameters, such as the threshold. Lastly, though this work focuses on the example of detecting a point source, the framework is widely applicable.
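As a rough illustration of the idea, the following is a minimal single-pixel sketch of a Bernoulli GLRT, not the paper's detector model. It assumes a known background rate, ignores detector noise and the point spread function, and treats each PC frame as reporting a 1 with probability $1 - e^{-\lambda}$ for a Poisson rate $\lambda$; the function names `bernoulli_loglik` and `glrt_single_pixel` are hypothetical.

```python
import math


def bernoulli_loglik(k, n, p):
    """Log-likelihood of observing k ones in n independent Bernoulli(p) frames."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def glrt_single_pixel(k, n, background_rate):
    """Toy GLRT for one pixel (assumed model, not the paper's).

    H0: the photon rate equals background_rate.
    H1: the photon rate equals background_rate + signal, with signal >= 0.
    Returns the test statistic and the maximum likelihood signal estimate.
    """
    # Probability that a frame records a count under H0.
    p0 = 1.0 - math.exp(-background_rate)
    # Constrained MLE of the count probability: the signal cannot be negative.
    p_hat = max(k / n, p0)
    stat = 2.0 * (bernoulli_loglik(k, n, p_hat) - bernoulli_loglik(k, n, p0))
    # Invert p = 1 - exp(-rate) to recover the total rate, then subtract background.
    total_rate = -math.log(1.0 - min(p_hat, 1.0 - 1e-12))
    return stat, total_rate - background_rate
```

In an online setting, `k` and `n` could be updated after every new frame and the statistic compared against a chosen threshold to decide whether to keep observing.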
A starshade suppresses starlight by a factor of $10^{11}$ in the image plane of a telescope, which is crucial for directly imaging Earth-like exoplanets. State-of-the-art high-contrast post-processing and signal detection methods were developed specifically for images taken with an internal coronagraph system and focus on the removal of quasi-static speckles. These methods are less useful for starshade images, where such speckles are not present. This paper investigates signal processing methods tailored to work efficiently on starshade images. We describe a signal detection method, the generalized likelihood ratio test (GLRT), for starshade missions and look into three important problems. First, even with the light suppression provided by the starshade, rocky exoplanets are still difficult to detect in reflected light due to their absolute faintness. GLRT can successfully flag these dim planets. Moreover, GLRT provides estimates of the planets' positions and intensities and the theoretical false alarm rate of the detection. Second, small starshade shape errors, such as a truncated petal tip, can cause artifacts that are hard to distinguish from real planet signals; the detection method can help distinguish planet signals from such artifacts. Third, exozodiacal dust degrades detection performance. We develop an iterative generalized likelihood ratio test to mitigate the effect of dust on the image. In addition, we provide guidance on how to choose the number of photon counting images to combine into one co-added image before doing detection, which will help utilize the observation time efficiently. All the methods are demonstrated on realistic simulated images.
This paper presents results on the detection and identification of mango fruits from colour images of trees. We evaluate the behaviour and performance of the Faster R-CNN network to determine whether it is robust enough to detect and classify fruits under particularly heterogeneous conditions in terms of plant cultivars, plantation scheme, and visual information acquisition contexts. The network is trained to distinguish the Kent, Keitt, and Boucodiekhal mango cultivars from 3,000 representative labelled fruit annotations. The validation set, composed of about 7,000 annotations, was then tested with a confidence threshold of 0.7 and a non-maximum suppression threshold of 0.25. With an F1-score of 0.90, the Faster R-CNN is well suited to simple fruit detection in tiles of 500x500 pixels. We then combine a multi-tiling approach with a Jaccard matrix to merge the different parts of objects detected several times, and thus report the detections made at the tile scale to the native 6,000x4,000 pixel size images. Nonetheless, with an F1-score of 0.56, the cultivar identification Faster R-CNN network presents some limitations for simultaneously detecting the mango fruits and identifying their respective cultivars. Despite the proven errors in fruit detection, the cultivar identification rates of the detected mango fruits are on the order of 80%. The ideal solution could combine a Mask R-CNN for the image pre-segmentation of trees and a double-stream Faster R-CNN for detecting the mango fruits and identifying their respective cultivars, to provide predictions more relevant to users' expectations.
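The tile-merging step can be pictured with a small sketch. The code below is one plausible reading, not the authors' implementation: it assumes detections are axis-aligned boxes `(x1, y1, x2, y2)` already expressed in full-image coordinates, computes a pairwise Jaccard (intersection-over-union) matrix, and greedily fuses boxes whose Jaccard index exceeds a hypothetical `threshold`. All function names are illustrative.

```python
def jaccard(a, b):
    """Jaccard index (intersection over union) of two boxes (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def jaccard_matrix(boxes):
    """Pairwise Jaccard indices between all detections."""
    return [[jaccard(a, b) for b in boxes] for a in boxes]


def merge_overlapping(boxes, threshold=0.5):
    """Greedily fuse boxes whose Jaccard index exceeds threshold (assumed value)."""
    merged = []
    for box in boxes:
        box = tuple(box)
        changed = True
        while changed:
            changed = False
            for i, other in enumerate(merged):
                if jaccard(box, other) > threshold:
                    # Replace the pair with their bounding union and re-check.
                    box = (min(box[0], other[0]), min(box[1], other[1]),
                           max(box[2], other[2]), max(box[3], other[3]))
                    merged.pop(i)
                    changed = True
                    break
        merged.append(box)
    return merged
```

For example, two detections of the same fruit from adjacent tiles, `(0, 0, 10, 10)` and `(2, 0, 12, 10)`, have a Jaccard index of 80/120 and would be fused into a single box, while a distant detection is left untouched.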
This study evaluates the power of the likelihood ratio test for changepoint detection using bootstrap sampling, and proposes a hypothesis test based on bootstrapped confidence interval lengths. Assuming i.i.d. normally distributed errors, we use the bootstrap method to estimate the sampling distribution of the changepoint. Furthermore, this study describes a method to estimate a data set with no changepoint, which is used to form the null sampling distribution. With the null sampling distribution and the distribution of the estimated changepoint, critical values and power can be calculated from the lengths of the confidence intervals.
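A minimal sketch of the bootstrap step follows, under assumptions not stated in the abstract: a single shift in the mean, a common variance (so that minimising the residual sum of squares coincides with maximising the normal likelihood), and a residual bootstrap with a percentile interval. The function names and defaults are hypothetical.

```python
import random


def estimate_changepoint(y):
    """Split index that minimises the two-segment residual sum of squares."""
    best_tau, best_rss = 1, float("inf")
    for tau in range(1, len(y)):
        left, right = y[:tau], y[tau:]
        m1 = sum(left) / len(left)
        m2 = sum(right) / len(right)
        rss = sum((v - m1) ** 2 for v in left) + sum((v - m2) ** 2 for v in right)
        if rss < best_rss:
            best_tau, best_rss = tau, rss
    return best_tau


def bootstrap_changepoints(y, n_boot=200, seed=0):
    """Residual bootstrap of the changepoint estimate (assumed scheme)."""
    rng = random.Random(seed)
    tau = estimate_changepoint(y)
    m1 = sum(y[:tau]) / tau
    m2 = sum(y[tau:]) / (len(y) - tau)
    fitted = [m1] * tau + [m2] * (len(y) - tau)
    resid = [v - f for v, f in zip(y, fitted)]
    taus = []
    for _ in range(n_boot):
        # Resample residuals with replacement around the fitted step function.
        y_star = [f + rng.choice(resid) for f in fitted]
        taus.append(estimate_changepoint(y_star))
    return sorted(taus)


def percentile_interval_length(sorted_taus, level=0.95):
    """Length of the percentile confidence interval for the changepoint."""
    last = len(sorted_taus) - 1
    lo = sorted_taus[int((1.0 - level) / 2.0 * last)]
    hi = sorted_taus[int((1.0 + level) / 2.0 * last)]
    return hi - lo
```

Repeating the same procedure on data generated without a changepoint would give a reference distribution of interval lengths against which an observed length could be compared.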
The complexity underlying real-world systems implies that standard statistical hypothesis testing methods may not be adequate for these peculiar applications. Specifically, we show that the likelihood-ratio test's null distribution needs to be modified to accommodate the complexity found in multi-edge network data. When working with independent observations, the p-values of likelihood-ratio tests are approximated using a $\chi^2$ distribution. However, such an approximation should not be used when dealing with multi-edge network data. This type of data is characterized by multiple correlations and competitions that make the standard approximation unsuitable. We address the problem by providing a better approximation of the likelihood-ratio test's null distribution through a Beta distribution. Finally, we empirically show that even for a small multi-edge network, the standard $\chi^2$ approximation provides erroneous results, while the proposed Beta approximation yields the correct p-value estimation.
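For reference, the standard approximation for independent observations can be sketched as follows. This shows only the classical $\chi^2$ p-value with one degree of freedom, not the proposed Beta approximation, whose parameters are not given here. The helper names are hypothetical, and the survival function uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math


def lrt_statistic(loglik_null, loglik_alt):
    """Likelihood-ratio statistic 2 * (log L_alt - log L_null), floored at zero."""
    return max(0.0, 2.0 * (loglik_alt - loglik_null))


def chi2_sf_df1(x):
    """P(X > x) for a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def coin_lrt_pvalue(heads, flips, p_null=0.5):
    """Toy example: test a coin's bias with independent flips (0 < heads < flips)."""
    p_hat = heads / flips
    tails = flips - heads
    loglik_alt = heads * math.log(p_hat) + tails * math.log(1.0 - p_hat)
    loglik_null = heads * math.log(p_null) + tails * math.log(1.0 - p_null)
    return chi2_sf_df1(lrt_statistic(loglik_null, loglik_alt))
```

The approximation relies on the observations being independent, which is the assumption the abstract argues fails for multi-edge network data.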
Multivariate linear regressions are widely used statistical tools in many applications to model the associations between multiple related responses and a set of predictors. To infer such associations, it is often of interest to test the structure of the regression coefficient matrix, and the likelihood ratio test (LRT) is one of the most popular approaches in practice. Despite its popularity, it is known that the classical $\chi^2$ approximations for LRTs often fail in high-dimensional settings, where the dimensions of responses and predictors $(m,p)$ are allowed to grow with the sample size $n$. Though various corrected LRTs and other test statistics have been proposed in the literature, the fundamental question of when the classical LRT starts to fail is less studied; an answer would provide insights for practitioners, especially when analyzing data with $m/n$ and $p/n$ small but not negligible. Moreover, the power performance of the LRT in high-dimensional data analysis remains underexplored. To address these issues, the first part of this work gives the asymptotic boundary where the classical LRT fails and develops the corrected limiting distribution of the LRT for a general asymptotic regime. The second part of this work further studies the test power of the LRT in the high-dimensional setting. The result not only advances the current understanding of the asymptotic behavior of the LRT under the alternative hypothesis, but also motivates the development of a power-enhanced LRT. The third part of this work considers the setting with $p>n$, where the LRT is not well-defined. We propose a two-step testing procedure that first performs dimension reduction and then applies the proposed LRT. Theoretical properties are developed to ensure the validity of the proposed method. Numerical studies are also presented to demonstrate its good performance.