Rail-5k: a Real-World Dataset for Rail Surface Defects Detection

Added by Bingchen Zhao

Publication date 2021

fields Informatics Engineering

Research language: English

Authors Zihao Zhang - Shaozuo Yu - Siwei Yang

Computer Vision and Pattern Recognition

This paper presents the Rail-5k dataset for benchmarking the performance of visual algorithms in a real-world application scenario, namely the rail surface defect detection task. We collected over 5k high-quality images from railways across China and, with the help of railway experts, annotated 1100 images to identify the 13 most common types of rail defects. The dataset can be used in two settings, each with its own challenges. The first is the fully supervised setting, which uses the 1k+ labeled images for training; the fine-grained nature and long-tailed distribution of the defect classes make it hard for visual algorithms to tackle. The second is the semi-supervised learning setting, facilitated by the 4k unlabeled images; these images are uncurated and may contain image corruptions and domain shift relative to the labeled images, which previous semi-supervised learning methods cannot easily handle. We believe our dataset could be a valuable benchmark for evaluating the robustness and reliability of visual algorithms.

Automatic Detection of Rail Components via A Deep Convolutional Transformer Network

Tiange Wang, Zijun Zhang, Fangfang Yang 2021

Automatic detection of rail track and its fasteners using continuously collected railway images is important to maintenance, as it can significantly improve maintenance efficiency and better ensure system safety. Dominant computer vision-based detection models typically rely on convolutional neural networks that utilize local image features and cumbersome prior settings to generate candidate boxes. In this paper, we propose a deep convolutional transformer network based method to detect multi-class rail components including the rail, clip, and bolt. We combine the advantages of the convolutional structure in extracting latent features from raw images with the advantages of transformers in selectively determining valuable latent features, to achieve efficient and accurate rail component detection. Our proposed method simplifies the detection pipeline by eliminating the need for prior settings, such as anchor box, aspect ratio, and default coordinates, and for post-processing, such as the threshold for non-maximum suppression; it also allows users to trade off the quality and complexity of the detector with limited training data. Results of a comprehensive computational study show that our proposed method outperforms a set of existing state-of-the-art approaches by large margins.

Computer Vision and Pattern Recognition

DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection

Liming Jiang, Ren Li, Wayne Wu 2020

We present our ongoing effort of constructing a large-scale benchmark for face forgery detection. The first version of this benchmark, DeeperForensics-1.0, represents the largest face forgery detection dataset by far, with 60,000 videos constituted by a total of 17.6 million frames, 10 times larger than existing datasets of the same kind. Extensive real-world perturbations are applied to obtain a more challenging benchmark of larger scale and higher diversity. All source videos in DeeperForensics-1.0 are carefully collected, and fake videos are generated by a newly proposed end-to-end face swapping framework. The quality of generated videos outperforms those in existing datasets, validated by user studies. The benchmark features a hidden test set, which contains manipulated videos achieving high deceptive scores in human evaluations. We further contribute a comprehensive study that evaluates five representative detection baselines and makes a thorough analysis of different settings.

Computer Vision and Pattern Recognition Machine Learning

RAIL: Risk-Averse Imitation Learning

Anirban Santara, Abhishek Naik, Balaraman Ravindran 2017

Imitation learning algorithms learn viable policies by imitating an expert's behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's behavior is available as a fixed set of trajectories. We evaluate in terms of the expert's cost function and observe that the distribution of trajectory costs is often more heavy-tailed for GAIL agents than for the expert on a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by the GAIL agents than by the expert. This makes the reliability of GAIL agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that the policies learned with RAIL show lower tail-end risk than those of vanilla GAIL. Thus the proposed RAIL algorithm appears as a potent alternative to GAIL for improved reliability in risk-sensitive applications.
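
To make the tail-risk measure concrete, the following is a minimal sketch of an empirical CVaR estimate over sampled trajectory costs. It is not the RAIL authors' implementation: the function names, the sample costs, and the choice of estimator (the mean of all costs at or above the empirical Value-at-Risk) are assumptions made for illustration.

```python
import math


def value_at_risk(costs, alpha):
    """Empirical VaR: the smallest sampled cost c such that at least a
    fraction alpha of the sampled costs are <= c (illustrative estimator)."""
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(costs)
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[index]


def conditional_value_at_risk(costs, alpha):
    """Empirical CVaR: the mean of the sampled costs at or above the VaR."""
    var = value_at_risk(costs, alpha)
    tail = [c for c in costs if c >= var]
    return sum(tail) / len(tail)


# Hypothetical trajectory costs: the agent matches the expert except for
# one catastrophic trajectory in the tail.
expert_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
agent_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

expert_cvar = conditional_value_at_risk(expert_costs, alpha=0.5)
agent_cvar = conditional_value_at_risk(agent_costs, alpha=0.5)
print(expert_cvar, agent_cvar)
```

In this toy example both cost samples have the same VaR at alpha = 0.5, yet the CVaR values differ (7.5 for the expert, 22.5 for the agent), which is why a tail-sensitive measure can separate two policies that look alike at the quantile itself.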

Machine Learning Artificial Intelligence

ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition

Daniela Massiceti, Luisa Zintgraf, John Bronskill 2021

Object recognition has made great advances in the last decade, but predominantly still relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applications from robotics to user personalization. Most few-shot learning research, however, has been driven by benchmark datasets that lack the high variation that these applications will face when deployed in the real world. To close this gap, we present the ORBIT dataset and benchmark, grounded in the real-world application of teachable object recognizers for people who are blind/low-vision. The dataset contains 3,822 videos of 486 objects recorded by people who are blind/low-vision on their mobile phones. The benchmark reflects a realistic, highly challenging recognition problem, providing a rich playground to drive research in robustness to few-shot, high-variation conditions. We set the benchmark's first state-of-the-art and show there is massive scope for further innovation, holding the potential to impact a broad range of real-world vision applications including tools for the blind/low-vision community. We release the dataset at https://doi.org/10.25383/city.14294597 and benchmark code at https://github.com/microsoft/ORBIT-Dataset.

Computer Vision and Pattern Recognition

A Categorized Reflection Removal Dataset with Diverse Real-world Scenes

Chenyang Lei, Xuhua Huang, Chenyang Qi 2021

Due to the lack of a large-scale reflection removal dataset with diverse real-world scenes, many existing reflection removal methods are trained on synthetic data plus a small amount of real-world data, which makes it difficult to evaluate the strengths or weaknesses of different reflection removal methods thoroughly. Furthermore, existing real-world benchmarks and datasets do not categorize image data based on the types and appearances of reflection (e.g., smoothness, intensity), making it hard to analyze reflection removal methods. Hence, we construct a new reflection removal dataset that is categorized, diverse, and real-world (CDR). A pipeline based on RAW data is used to capture perfectly aligned input images and transmission images. The dataset is constructed using diverse glass types under various environments to ensure diversity. By analyzing several reflection removal methods and conducting extensive experiments on our dataset, we show that state-of-the-art reflection removal methods generally perform well on blurry reflection but fail to achieve satisfactory performance on other types of real-world reflection. We believe our dataset can help develop novel methods to remove real-world reflection better. Our dataset is available at https://alexzhao-hugga.github.io/Real-World-Reflection-Removal/.

Computer Vision and Pattern Recognition