Do you want to publish a course? Click here

Vision-based Automated Bridge Component Recognition Integrated With High-level Scene Understanding

47   0   0.0 ( 0 )
 Added by Yasutaka Narazaki
 Publication date 2018
and research's language is English




Ask ChatGPT about the research

Image data has a great potential of helping conventional visual inspections of civil engineering structures due to the ease of data acquisition and the advantages in capturing visual information. A variety of techniques have been proposed to detect damages, such as cracks and spalling on a close-up image of a single component (columns and road surfaces etc.). However, these techniques commonly suffer from severe false-positives especially when the image includes multiple components of different structures. To reduce the false-positives and extract reliable information about the structures conditions, detection and localization of critical structural components are important first steps preceding the damage assessment. This study aims at recognizing bridge structural and non-structural components from images of urban scenes. During the bridge component recognition, every image pixel is classified into one of the five classes (non-bridge, columns, beams and slabs, other structural, other nonstructural) by multi-scale convolutional neural networks (multi-scale CNNs). To reduce false-positives and get consistent labels, the component classifications are integrated with scene understanding by an additional classifier with 10 higher-level scene classes (building, greenery, person, pavement, signs and poles, vehicles, bridges, water, sky, and others). The bridge component recognition integrated with the scene understanding is compared with the naive approach without scene classification in terms of accuracy, false-positives and consistencies to demonstrate the effectiveness of the integrated approach.



rate research

Read More

Image data has a great potential of helping post-earthquake visual inspections of civil engineering structures due to the ease of data acquisition and the advantages in capturing visual information. A variety of techniques have been applied to detect damages automatically from a close-up image of a structural component. However, the application of the automatic damage detection methods become increasingly difficult when the image includes multiple components from different structures. To reduce the inaccurate false positive alarms, critical structural components need to be recognized first, and the damage alarms need to be cleaned using the component recognition results. To achieve the goal, this study aims at recognizing and extracting bridge components from images of urban scenes. The bridge component recognition begins with pixel-wise classifications of an image into 10 scene classes. Then, the original image and the scene classification results are combined to classify the image pixels into five component classes. The multi-scale convolutional neural networks (multi-scale CNNs) are used to perform pixel-wise classification, and the classification results are post-processed by averaging within superpixels and smoothing by conditional random fields (CRFs). The performance of the bridge component extraction is tested in terms of accuracy and consistency.
This paper investigates the automated recognition of structural bridge components using video data. Although understanding video data for structural inspections is straightforward for human inspectors, the implementation of the same task using machine learning methods has not been fully realized. In particular, single-frame image processing techniques, such as convolutional neural networks (CNNs), are not expected to identify structural components accurately when the image is a close-up view, lacking contextual information regarding where on the structure the image originates. Inspired by the significant progress in video processing techniques, this study investigates automated bridge component recognition using video data, where the information from the past frames is used to augment the understanding of the current frame. A new simulated video dataset is created to train the machine learning algorithms. Then, convolutional Neural Networks (CNNs) with recurrent architectures are designed and applied to implement the automated bridge component recognition task. Results are presented for simulated video data, as well as video collected in the field.
Abstract--- Exploiting the spatial structure in scene images is a key research direction for scene recognition. Due to the large intra-class structural diversity, building and modeling flexible structural layout to adapt various image characteristics is a challenge. Existing structural modeling methods in scene recognition either focus on predefined grids or rely on learned prototypes, which all have limited representative ability. In this paper, we propose Prototype-agnostic Scene Layout (PaSL) construction method to build the spatial structure for each image without conforming to any prototype. Our PaSL can flexibly capture the diverse spatial characteristic of scene images and have considerable generalization capability. Given a PaSL, we build Layout Graph Network (LGN) where regions in PaSL are defined as nodes and two kinds of independent relations between regions are encoded as edges. The LGN aims to incorporate two topological structures (formed in spatial and semantic similarity dimensions) into image representations through graph convolution. Extensive experiments show that our approach achieves state-of-the-art results on widely recognized MIT67 and SUN397 datasets without multi-model or multi-scale fusion. Moreover, we also conduct the experiments on one of the largest scale datasets, Places365. The results demonstrate the proposed method can be well generalized and obtains competitive performance.
77 - Nicole Han 2021
Retinal degenerative diseases cause profound visual impairment in more than 10 million people worldwide, and retinal prostheses are being developed to restore vision to these individuals. Analogous to cochlear implants, these devices electrically stimulate surviving retinal cells to evoke visual percepts (phosphenes). However, the quality of current prosthetic vision is still rudimentary. Rather than aiming to restore natural vision, there is potential merit in borrowing state-of-the-art computer vision algorithms as image processing techniques to maximize the usefulness of prosthetic vision. Here we combine deep learning--based scene simplification strategies with a psychophysically validated computational model of the retina to generate realistic predictions of simulated prosthetic vision, and measure their ability to support scene understanding of sighted subjects (virtual patients) in a variety of outdoor scenarios. We show that object segmentation may better support scene understanding than models based on visual saliency and monocular depth estimation. In addition, we highlight the importance of basing theoretical predictions on biologically realistic models of phosphene shape. Overall, this work has the potential to drastically improve the utility of prosthetic vision for people blinded from retinal degenerative diseases.
Panorama images have a much larger field-of-view thus naturally encode enriched scene context information compared to standard perspective images, which however is not well exploited in the previous scene understanding methods. In this paper, we propose a novel method for panoramic 3D scene understanding which recovers the 3D room layout and the shape, pose, position, and semantic category for each object from a single full-view panorama image. In order to fully utilize the rich context information, we design a novel graph neural network based context model to predict the relationship among objects and room layout, and a differentiable relationship-based optimization module to optimize object arrangement with well-designed objective functions on-the-fly. Realizing the existing data are either with incomplete ground truth or overly-simplified scene, we present a new synthetic dataset with good diversity in room layout and furniture placement, and realistic image quality for total panoramic 3D scene understanding. Experiments demonstrate that our method outperforms existing methods on panoramic scene understanding in terms of both geometry accuracy and object arrangement. Code is available at https://chengzhag.github.io/publication/dpc.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا