An Accurate and Real-time Self-blast Glass Insulator Location Method Based On Faster R-CNN and U-net with Aerial Images

109 0 0.0 ( 0 )

Download Cite

Added by Lei Chu

Publication date 2018

fields Informatics Engineering

and research's language is English

Authors Zenan Ling - Robert C. Qiu - Zhijian Jin

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The location of broken insulators in aerial images is a challenging task. This paper, focusing on the self-blast glass insulator, proposes a deep learning solution. We address the broken insulators location problem as a low signal-noise-ratio image location framework with two modules: 1) object detection based on Fast R-CNN, and 2) classification of pixels based on U-net. A diverse aerial image set of some grid in China is tested to validated the proposed approach. Furthermore, a comparison is made among different methods and the result shows that our approach is accurate and real-time.

rate research

CFUN: Combining Faster R-CNN and U-net Network for Efficient Whole Heart Segmentation

229 - Zhanwei Xu , Ziyi Wu , Jianjiang Feng 2018

In this paper, we propose a novel heart segmentation pipeline Combining Faster R-CNN and U-net Network (CFUN). Due to Faster R-CNNs precise localization ability and U-nets powerful segmentation ability, CFUN needs only one-step detection and segmentation inference to get the whole heart segmentation result, obtaining good results with significantly reduced computational cost. Besides, CFUN adopts a new loss function based on edge information named 3D Edge-loss as an auxiliary loss to accelerate the convergence of training and improve the segmentation results. Extensive experiments on the public dataset show that CFUN exhibits competitive segmentation performance in a sharply reduced inference time. Our source code and the model are publicly available at https://github.com/Wuziyi616/CFUN.

Computer Vision and Pattern Recognition

Reprojection R-CNN: A Fast and Accurate Object Detector for 360{deg} Images

303 - Pengyu Zhao , Ansheng You , Yuanxing Zhang 2019

360{deg} images are usually represented in either equirectangular projection (ERP) or multiple perspective projections. Different from the flat 2D images, the detection task is challenging for 360{deg} images due to the distortion of ERP and the inefficiency of perspective projections. However, existing methods mostly focus on one of the above representations instead of both, leading to limited detection performance. Moreover, the lack of appropriate bounding-box annotations as well as the annotated datasets further increases the difficulties of the detection task. In this paper, we present a standard object detection framework for 360{deg} images. Specifically, we adapt the terminologies of the traditional object detection task to the omnidirectional scenarios, and propose a novel two-stage object detector, i.e., Reprojection R-CNN by combining both ERP and perspective projection. Owing to the omnidirectional field-of-view of ERP, Reprojection R-CNN first generates coarse region proposals efficiently by a distortion-aware spherical region proposal network. Then, it leverages the distortion-free perspective projection and refines the proposed regions by a novel reprojection network. We construct two novel synthetic datasets for training and evaluation. Experiments reveal that Reprojection R-CNN outperforms the previous state-of-the-art methods on the mAP metric. In addition, the proposed detector could run at 178ms per image in the panoramic datasets, which implies its practicability in real-world applications.

Computer Vision and Pattern Recognition

CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images

100 - Yudong Guo , Juyong Zhang , Jianfei Cai 2017

With the powerfulness of convolution neural networks (CNN), CNN based face reconstruction has recently shown promising performance in reconstructing detailed face shape from 2D face images. The success of CNN-based methods relies on a large number of labeled data. The state-of-the-art synthesizes such data using a coarse morphable face model, which however has difficulty to generate detailed photo-realistic images of faces (with wrinkles). This paper presents a novel face data generation method. Specifically, we render a large number of photo-realistic face images with different attributes based on inverse rendering. Furthermore, we construct a fine-detailed face image dataset by transferring different scales of details from one image to another. We also construct a large number of video-type adjacent frame pairs by simulating the distribution of real video data. With these nicely constructed datasets, we propose a coarse-to-fine learning framework consisting of three convolutional networks. The networks are trained for real-time detailed 3D face reconstruction from monocular video as well as from a single image. Extensive experimental results demonstrate that our framework can produce high-quality reconstruction but with much less computation time compared to the state-of-the-art. Moreover, our method is robust to pose, expression and lighting due to the diversity of data.

Computer Vision and Pattern Recognition

Motorcycle detection and classification in urban Scenarios using a model based on Faster R-CNN

91 - Jorge E. Espinosa , Sergio A. Velastin (2 , 3 2018

This paper introduces a Deep Learning Convolutional Neural Network model based on Faster-RCNN for motorcycle detection and classification on urban environments. The model is evaluated in occluded scenarios where more than 60% of the vehicles present a degree of occlusion. For training and evaluation, we introduce a new dataset of 7500 annotated images, captured under real traffic scenes, using a drone mounted camera. Several tests were carried out to design the network, achieving promising results of 75% in average precision (AP), even with the high number of occluded motorbikes, the low angle of capture and the moving camera. The model is also evaluated on low occlusions datasets, reaching results of up to 92% in AP.

Computer Vision and Pattern Recognition

Real-time Denoising and Dereverberation with Tiny Recurrent U-Net

114 - Hyeong-Seok Choi , Sungjin Park , Jie Hwan Lee 2021

Modern deep learning-based models have seen outstanding performance improvement with speech enhancement tasks. The number of parameters of state-of-the-art models, however, is often too large to be deployed on devices for real-world applications. To this end, we propose Tiny Recurrent U-Net (TRU-Net), a lightweight online inference model that matches the performance of current state-of-the-art models. The size of the quantized version of TRU-Net is 362 kilobytes, which is small enough to be deployed on edge devices. In addition, we combine the small-sized model with a new masking method called phase-aware $beta$-sigmoid mask, which enables simultaneous denoising and dereverberation. Results of both objective and subjective evaluations have shown that our model can achieve competitive performance with the current state-of-the-art models on benchmark datasets using fewer parameters by orders of magnitude.

Sound Artificial Intelligence Audio and Speech Processing