It is usually hard for a learning system to predict correctly on rare events that never occur in the training data, and segmentation algorithms are no exception. Meanwhile, manually inspecting each case to locate failures becomes infeasible given growing data scale and limited human resources. Therefore, we build an alarm system that sets off alerts when a segmentation result is possibly unsatisfactory, assuming no corresponding ground truth mask is provided. One plausible solution is to project the segmentation results into a low-dimensional feature space and then learn classifiers or regressors to predict their quality. Motivated by this, in this paper we learn a feature space using shape information, which is a strong prior shared among different datasets and robust to appearance variation in the input data. The shape feature is captured using a Variational Auto-Encoder (VAE) network that is trained with only the ground truth masks. During testing, segmentation results with bad shapes will not fit the shape prior well, resulting in large loss values. Thus, the VAE is able to evaluate the quality of a segmentation result on unseen data without using ground truth. Finally, we learn a regressor in the one-dimensional feature space to predict the quality of segmentation results. Our alarm system is evaluated on several recent state-of-the-art segmentation algorithms for 3D medical segmentation tasks. Compared with other standard quality assessment methods, our system consistently provides more reliable predictions of the quality of segmentation results.
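The final step described above, mapping a scalar shape loss to a predicted quality score and raising an alert, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-case VAE loss is taken as already computed, the Dice coefficient stands in for the quality measure (the abstract does not name one), and an ordinary least-squares line stands in for the regressor. The function names and the alert threshold of 0.5 are hypothetical.

```python
import numpy as np

def dice(pred, truth):
    """Dice overlap between two binary masks (1.0 means a perfect match)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def fit_quality_regressor(losses, qualities):
    """Least-squares line from a scalar shape loss to a quality score.

    Assumption: a straight line is used purely for illustration.
    """
    slope, intercept = np.polyfit(losses, qualities, deg=1)
    return slope, intercept

def predict_quality(loss, slope, intercept):
    """Predicted quality, clipped to the valid [0, 1] range."""
    return float(np.clip(slope * loss + intercept, 0.0, 1.0))

def should_alarm(loss, slope, intercept, threshold=0.5):
    """Raise an alert when predicted quality falls below the threshold.

    The threshold value is hypothetical, not taken from the paper.
    """
    return predict_quality(loss, slope, intercept) < threshold
```

In this sketch the regressor would be fitted on held-out cases where ground truth is available (so that `dice` can be computed), and at test time only the loss is needed to decide whether to alert.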
This paper proposes a Genetic Algorithm based segmentation method that can automatically segment gray-scale images. The proposed method mainly consists of spatial unsupervised gray-scale image segmentation that divides an image into regions. The aim of this algorithm is to produce precise segmentation of images using intensity information along with neighborhood relationships. In this paper, Fuzzy Hopfield Neural Network (FHNN) clustering helps to generate the population of the Genetic Algorithm, which thereby automatically segments the image. This technique is a powerful method for image segmentation and works for both single-feature and multiple-feature data with spatial information. A validity index is used to provide a robust technique for finding the optimum number of components in an image. Experimental results show that the algorithm generates good-quality segmented images.
Despite the constant advances in computer vision, integrating modern single-image detectors into real-time handgun alarm systems for video surveillance is still debatable. Using such detectors still implies a high number of false alarms and false negatives. In this context, most existing studies select one of the latest single-image detectors and train it on a better dataset, or use some pre-processing, post-processing or data-fusion approach to further reduce false alarms. However, none of these works tried to exploit the temporal information present in the videos to mitigate false detections. This paper presents a new system, called MULTI Confirmation-level Alarm SysTem based on Convolutional Neural Networks (CNN) and Long Short Term Memory networks (LSTM) (MULTICAST), that leverages not only the spatial information but also the temporal information present in the videos for more reliable handgun detection. MULTICAST consists of three stages: i) a handgun detection stage, ii) a CNN-based spatial confirmation stage and iii) an LSTM-based temporal confirmation stage. The temporal confirmation stage uses the positions of the detected handgun in previous instants to predict its trajectory in the next frame. Our experiments show that MULTICAST reduces the number of false alarms by 80% with respect to a Faster R-CNN-based single-image detector, which makes it more useful in providing more effective and rapid security responses.
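The idea behind the temporal confirmation stage, checking a new detection against a position predicted from earlier frames, can be illustrated with a small sketch. It is not the MULTICAST implementation: a constant-velocity extrapolation is swapped in for the LSTM predictor, and the acceptance rule (a distance threshold in pixels) is an assumption, since the abstract does not state how confirmation is decided. The names `predict_next`, `temporally_confirmed` and `max_dist` are hypothetical.

```python
import math

def predict_next(positions):
    """Extrapolate the next (x, y) center from the last two positions.

    Constant-velocity stand-in for the LSTM predictor (assumption).
    """
    if len(positions) < 2:
        raise ValueError("need at least two previous positions")
    (x0, y0), (x1, y1) = positions[-2], positions[-1]
    return (2 * x1 - x0, 2 * y1 - y0)

def temporally_confirmed(positions, detection, max_dist=20.0):
    """True if the new detection lies within max_dist of the prediction.

    max_dist is a hypothetical pixel threshold, not a value from the paper.
    """
    px, py = predict_next(positions)
    return math.hypot(detection[0] - px, detection[1] - py) <= max_dist

# Usage: a track moving right and slightly down, then a candidate detection.
track = [(100, 100), (110, 105)]
print(temporally_confirmed(track, (123, 112)))
```

A detection that appears far from where the track is heading would be rejected as a likely false alarm in this simplified scheme.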
Convex Shapes (CS) are common priors for optic disc and cup segmentation in eye fundus images. It is important to design proper techniques to represent convex shapes. So far, it remains an open problem to guarantee that the output objects from Deep Convolutional Neural Networks (DCNNs) are convex shapes. In this work, we propose a technique which can be easily integrated into commonly used DCNNs for image segmentation and guarantees that outputs are convex shapes. This method is flexible: it can handle multiple objects and allow some of the objects to be convex. Our method is based on the dual representation of the sigmoid activation function in DCNNs. In the dual space, the convex shape prior can be guaranteed by a simple quadratic constraint on a binary representation of the shapes. Moreover, our method can also integrate spatial regularization and some other shape priors using a soft thresholding dynamics (STD) method. The regularization can make the boundary curves of the segmented objects simultaneously smooth and convex. We design a very stable active set projection algorithm to numerically solve our model. This algorithm can form a new plug-and-play DCNN layer called CS-STD whose outputs must be a nearly binary segmentation of convex objects. In the CS-STD block, the convexity information can be propagated to guide the DCNN in both forward and backward propagation during the training and prediction processes. As an application example, we apply the convexity prior layer to retinal fundus image segmentation, taking the popular DeepLabV3+ as the backbone network. Experimental results on several public datasets show that our method is efficient and outperforms the classical DCNN segmentation methods.
We design a novel fully convolutional network architecture for shapes, denoted Shape Fully Convolutional Networks (SFCN). 3D shapes are represented as graph structures in the SFCN architecture, based on novel graph convolution and pooling operations, which are similar to the convolution and pooling operations used on images. Meanwhile, to build our SFCN architecture on the original image segmentation fully convolutional network (FCN) architecture, we also design and implement a generating operation with a bridging function. This ensures that the convolution and pooling operations we have designed can be successfully applied in the original FCN architecture. In this paper, we also present a new shape segmentation approach based on SFCN. Furthermore, we allow more general and challenging input, such as mixed datasets of different categories of shapes, which demonstrates the generalisation ability of our approach. In our approach, SFCNs are trained triangles-to-triangles using three low-level geometric features as input. Finally, feature voting-based multi-label graph cuts are adopted to optimise the segmentation results obtained by SFCN prediction. The experimental results show that our method can effectively learn and predict mixed shape datasets of either similar or different characteristics, and achieve excellent segmentation results.
In this paper, we propose a novel top-down instance segmentation framework based on explicit shape encoding, named ESE-Seg. It largely reduces the computational cost of instance segmentation by explicitly decoding multiple object shapes with tensor operations, and thus performs instance segmentation at almost the same speed as object detection. ESE-Seg is based on a novel shape signature, Inner-center Radius (IR), Chebyshev polynomial fitting, and strong modern object detectors. ESE-Seg with YOLOv3 outperforms Mask R-CNN on Pascal VOC 2012 at mAP$^r$@0.5 while being 7 times faster.
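To make the combination of a radius-based shape signature and Chebyshev polynomial fitting concrete, the sketch below encodes a contour as a short coefficient vector and decodes it back to points. It is a simplified illustration, not the ESE-Seg implementation: it assumes the center is already given, that the contour is described by radii sampled at evenly spaced angles spanning 0 to 2π, and it uses NumPy's `chebfit` and `chebval`. The function names, the polynomial degree and the sampling scheme are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def encode_shape(radii, degree=8):
    """Fit Chebyshev coefficients to radii sampled over angles in [0, 2*pi].

    Assumption: radii[i] is the center-to-contour distance at the i-th
    evenly spaced angle; the degree is an illustrative choice.
    """
    radii = np.asarray(radii, dtype=float)
    # Map the angle samples onto [-1, 1], the natural Chebyshev domain.
    t = np.linspace(-1.0, 1.0, len(radii))
    return C.chebfit(t, radii, degree)

def decode_shape(coeffs, center, n_points=360):
    """Rebuild contour points (x, y) from coefficients and a center."""
    t = np.linspace(-1.0, 1.0, n_points)
    theta = (t + 1.0) * np.pi  # angles in [0, 2*pi]
    r = C.chebval(t, coeffs)   # radius predicted at each angle
    cx, cy = center
    return np.stack([cx + r * np.cos(theta), cy + r * np.sin(theta)], axis=1)
```

Because decoding is only a polynomial evaluation followed by a polar-to-Cartesian conversion, it can be expressed as batched tensor operations, which is consistent with the speed argument made in the abstract.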