Computer vision-based methods have been explored in the past for detecting railway track defects, but full automation has always been a challenge: given a limited amount of labeled data, both traditional image processing methods and deep learning classifiers trained from scratch fail to generalize well to the endless variety of novel scenarios seen in the real world. Recent advances allow machine learning models to utilize knowledge from a different but related domain. In this paper, we show that even when data from a similar domain is not available, transfer learning gives the model an understanding of other real-world objects and enables training of production-scale deep learning classifiers for uncontrolled real-world data. Our models efficiently detect both track defects, such as sun kinks and loose ballast, and railway assets, such as switches and signals. The models were validated on hours of track video recorded on different continents, covering different weather conditions, ambient conditions, and surroundings. A track health index concept is also proposed for monitoring a complete rail network.
With the railway transportation industry moving actively towards automation, an accurate location and inventory of wayside track assets such as traffic signals, crossings, switches, and mileposts is of extreme importance. With the new Positive Train Control (PTC) regulation coming into effect, many railway safety rules will be tied directly to the location of assets like mileposts and signals. Newer speed regulations will be enforced based on the location of the train with respect to a wayside asset. Hence it is essential for railroads to have an accurate database of the types and locations of these assets. This paper describes a real-world use case of detecting railway signals from a camera mounted on a moving locomotive and tracking their locations. The camera is engineered to withstand the environmental factors on a moving train and provide a consistent, steady image at around 30 frames per second. Using advanced image analysis and deep learning techniques, signals are detected in these camera images and a database of their locations is created. Railway signals differ considerably from road signals in their shapes and in the rules for placement with respect to the track. Due to space constraints and traffic densities in urban areas, signals are not always placed on the same side of the track, and multiple lines can run in parallel. Hence there is a need to associate each detected signal with the track on which the train runs. We present a method to associate signals with the specific track they belong to using a video feed from the front-facing camera mounted on the lead locomotive. A pipeline of track detection, region of interest selection, and signal detection has been implemented, which gives an overall accuracy of 94.7% on a route covering 150 km with 247 signals.
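The association step in the abstract above (track detection, then region of interest selection, then signal detection) can be illustrated with a minimal sketch. This is not the authors' method. It assumes a hypothetical track detector that returns each rail as a straight line in image coordinates, and it assumes each detection box extends down to the base of the signal mast, so that its bottom-centre point can be compared against a band around the train's own track. The names `Box`, `rail_x`, `in_own_track_roi`, and the `margin` parameter are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned detection in pixels: top-left corner plus size."""
    x: float
    y: float
    w: float
    h: float


def rail_x(row, x_bottom, x_top, row_bottom, row_top):
    """Linearly interpolate a rail's x position at a given image row."""
    t = (row_bottom - row) / (row_bottom - row_top)
    return x_bottom + t * (x_top - x_bottom)


def in_own_track_roi(box, left_rail, right_rail, row_bottom, row_top, margin=1.5):
    """Return True if the box's bottom-centre lies inside the widened track band.

    left_rail / right_rail are (x_bottom, x_top) pairs from a hypothetical
    track detector. margin extends the band on each side, in units of the
    local track width in pixels (an assumed tuning parameter).
    """
    row = box.y + box.h  # image row of the box's lower edge
    if not (row_top <= row <= row_bottom):
        return False  # outside the rows where the rails were fitted
    xl = rail_x(row, *left_rail, row_bottom, row_top)
    xr = rail_x(row, *right_rail, row_bottom, row_top)
    width = xr - xl
    cx = box.x + box.w / 2
    return xl - margin * width <= cx <= xr + margin * width


# Usage: rails fitted between rows 360 (far) and 720 (near) of a 720-row frame.
left, right = (500.0, 630.0), (780.0, 650.0)
near_signal = Box(x=800, y=440, w=40, h=100)   # just right of our track
far_signal = Box(x=1100, y=440, w=40, h=100)   # likely a parallel line
print(in_own_track_roi(near_signal, left, right, 720, 360))
print(in_own_track_roi(far_signal, left, right, 720, 360))
```

Scaling the band by the local track width keeps the region of interest proportionally narrower for distant rows, which is one simple way to account for perspective. A real system would likely need curved track models and camera calibration.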
Retinal degenerative diseases cause profound visual impairment in more than 10 million people worldwide, and retinal prostheses are being developed to restore vision to these individuals. Analogous to cochlear implants, these devices electrically stimulate surviving retinal cells to evoke visual percepts (phosphenes). However, the quality of current prosthetic vision is still rudimentary. Rather than aiming to restore natural vision, there is potential merit in borrowing state-of-the-art computer vision algorithms as image processing techniques to maximize the usefulness of prosthetic vision. Here we combine deep learning-based scene simplification strategies with a psychophysically validated computational model of the retina to generate realistic predictions of simulated prosthetic vision, and measure their ability to support scene understanding of sighted subjects (virtual patients) in a variety of outdoor scenarios. We show that object segmentation may better support scene understanding than models based on visual saliency and monocular depth estimation. In addition, we highlight the importance of basing theoretical predictions on biologically realistic models of phosphene shape. Overall, this work has the potential to drastically improve the utility of prosthetic vision for people blinded from retinal degenerative diseases.
Vision-based prediction algorithms have a wide range of applications, including autonomous driving, surveillance, human-robot interaction, and weather prediction. The objective of this paper is to provide an overview of the field in the past five years, with a particular focus on deep learning approaches. For this purpose, we categorize these algorithms into video prediction, action prediction, trajectory prediction, body motion prediction, and other prediction applications. For each category, we highlight the common architectures, training methods, and types of data used. In addition, we discuss the common evaluation metrics and datasets used for vision-based prediction tasks. A database of all the information presented in this survey, cross-referenced according to papers, datasets, and metrics, can be found online at https://github.com/aras62/vision-based-prediction.
This paper presents a deep reinforcement learning method, Double Deep Q-Networks, for designing an end-to-end vision-based adaptive cruise control (ACC) system. A simulation environment of a highway scene was set up in Unity, a game engine that provides both physical models of vehicles and feature data for training and testing. Well-designed reward functions associated with the following distance and throttle/brake force were implemented in the reinforcement learning model for both internal combustion engine (ICE) vehicles and electric vehicles (EVs) to perform adaptive cruise control. The gap statistics and total energy consumption are evaluated for different vehicle types to explore the relationship between reward functions and powertrain characteristics. Compared with traditional radar-based ACC systems or human-in-the-loop simulation, the proposed vision-based ACC system can generate either a better gap-regulated trajectory or a smoother speed trajectory, depending on the preset reward function. The proposed system adapts well to different speed trajectories of the preceding vehicle and operates in real time.
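For readers unfamiliar with Double Deep Q-Networks, the following minimal sketch shows the standard Double DQN bootstrap target: the online network selects the next action, and the target network evaluates it. The `acc_reward` function is purely hypothetical. The abstract says only that the rewards depend on following distance and throttle/brake force, so the functional form and the weights `w_gap` and `w_pedal` here are assumptions, not the paper's reward.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target for one transition.

    next_q_online / next_q_target are per-action Q-value lists for the next
    state from the online and target networks respectively.
    """
    if done:
        return reward  # no bootstrapping from a terminal state
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


def acc_reward(gap, desired_gap, pedal, w_gap=1.0, w_pedal=0.1):
    """Hypothetical ACC reward: penalise relative gap error and pedal effort.

    pedal is an assumed signed command in [-1, 1] (brake negative, throttle
    positive); the weights are illustrative only.
    """
    return -(w_gap * abs(gap - desired_gap) / desired_gap + w_pedal * abs(pedal))


# Usage: the online net prefers action 1, so the target net's value for
# action 1 is used, not the target net's own maximum.
y = double_dqn_target(1.0, [0.2, 0.9, 0.5], [1.0, 2.0, 3.0], gamma=0.5)
print(y)
```

Decoupling selection from evaluation is what distinguishes this target from plain DQN, which would bootstrap from the target network's own maximum. Raising `w_pedal` relative to `w_gap` in a reward of this kind would favour smoother pedal use over tight gap regulation, which mirrors the trade-off the abstract describes.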
Regular maintenance of all assets is pivotal for the proper functioning of a railway. Manual maintenance can be very cumbersome and leaves room for error. Track anomalies such as vegetation overgrowth and sun kinks affect the track structure and result in unequal load transfer and imbalanced lateral forces on the tracks, which cause further deterioration of the tracks and can ultimately result in derailment of a locomotive. Hence there is a need to continuously monitor rail track health. Track anomalies are rare, with a skew as high as one anomaly in millions of good images. We propose a method to build training data that makes our algorithms more robust and helps us detect real-world track issues. The data augmentation directly improves anomaly detection and hence reduces the time railroads spend on manual inspection. This paper describes a real-world use case of detecting railway track defects from a camera mounted on a moving locomotive and tracking their locations. The camera is engineered to withstand the environmental factors on a moving train and provide a consistent, steady image at around 30 frames per second. An image simulation pipeline of track detection, region of interest selection, and image augmentation for anomalies is implemented. Training images are simulated for sun kinks and vegetation overgrowth. An Inception V3 model pretrained on the ImageNet dataset is fine-tuned for two-class classification. For vegetation overgrowth, the model generalizes well to actual vegetation images, even though it was trained and validated solely on simulated images, which might have a different distribution from actual vegetation. The sun kink classifier can classify professionally simulated sun kink videos with a precision of 97.5%.
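The idea of simulating anomalies inside a track region of interest can be sketched as follows. This is a deliberately crude illustration, not the authors' simulation pipeline. It assumes the region of interest is an axis-aligned rectangle and stands in for vegetation by alpha-blending a flat green colour into a random subset of its pixels. The function names, the `coverage` and `alpha` parameters, and the colour value are all hypothetical.

```python
import random


def overlay_vegetation(image, roi, coverage=0.4, alpha=0.7,
                       green=(40, 120, 30), seed=0):
    """Return a copy of image with green blended into random ROI pixels.

    image: list of rows, each a list of (r, g, b) tuples with 0-255 ints.
    roi: (top, left, bottom, right) with bottom/right exclusive.
    coverage: probability that any given ROI pixel is overgrown.
    """
    rng = random.Random(seed)  # seeded so simulated samples are reproducible
    top, left, bottom, right = roi
    out = [list(row) for row in image]  # copy rows; the input is not modified
    for r in range(top, bottom):
        for c in range(left, right):
            if rng.random() < coverage:
                out[r][c] = tuple(
                    round((1 - alpha) * p + alpha * g)
                    for p, g in zip(image[r][c], green)
                )
    return out


def make_pair(image, roi, **kwargs):
    """Build two labelled samples: 0 = normal track, 1 = simulated anomaly."""
    return [(image, 0), (overlay_vegetation(image, roi, **kwargs), 1)]


# Usage: a 4x4 grey frame with a 2x2 ROI that is fully overgrown.
frame = [[(100, 100, 100)] * 4 for _ in range(4)]
samples = make_pair(frame, roi=(1, 1, 3, 3), coverage=1.0, alpha=0.5)
print(samples[1][0][1][1])
```

Generating one positive sample from every normal frame is one simple way to counter the extreme class skew mentioned above. Labelled pairs of this kind could then feed the fine-tuning of a pretrained two-class classifier.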