Do you want to publish a course? Click here

Task-Driven Super Resolution: Object Detection in Low-resolution Images

185   0   0.0 ( 0 )
 Added by Muhammad Haris
 Publication date 2018
and research's language is English




Ask ChatGPT about the research

We consider how image super resolution (SR) can contribute to an object detection task in low-resolution images. Intuitively, SR gives a positive impact on the object detection task. While several previous works demonstrated that this intuition is correct, SR and detector are optimized independently in these works. This paper proposes a novel framework to train a deep neural network where the SR sub-network explicitly incorporates a detection loss in its training objective, via a tradeoff with a traditional detection loss. This end-to-end training procedure allows us to train SR preprocessing for any differentiable detector. We demonstrate that our task-driven SR consistently and significantly improves accuracy of an object detector on low-resolution images for a variety of conditions and scaling factors.



rate research

Read More

Object classification is a significant task in computer vision. It has become an effective research area as an important aspect of image processing and the building block of image localization, detection, and scene parsing. Object classification from low-quality images is difficult for the variance of object colors, aspect ratios, and cluttered backgrounds. The field of object classification has seen remarkable advancements, with the development of deep convolutional neural networks (DCNNs). Deep neural networks have been demonstrated as very powerful systems for facing the challenge of object classification from high-resolution images, but deploying such object classification networks on the embedded device remains challenging due to the high computational and memory requirements. Using high-quality images often causes high computational and memory complexity, whereas low-quality images can solve this issue. Hence, in this paper, we investigate an optimal architecture that accurately classifies low-quality images using DCNNs architectures. To validate different baselines on lowquality images, we perform experiments using webcam captured image datasets of 10 different objects. In this research work, we evaluate the proposed architecture by implementing popular CNN architectures. The experimental results validate that the MobileNet architecture delivers better than most of the available CNN architectures for low-resolution webcam image datasets.
95 - Kai Sun 2021
Background. Digital pathology has aroused widespread interest in modern pathology. The key of digitalization is to scan the whole slide image (WSI) at high magnification. The lager the magnification is, the richer details WSI will provide, but the scanning time is longer and the file size of obtained is larger. Methods. We design a strategy to scan slides with low resolution (5X) and a super-resolution method is proposed to restore the image details when in diagnosis. The method is based on a multi-scale generative adversarial network, which sequentially generates three high-resolution images such as 10X, 20X and 40X. Results. The peak-signal-to-noise-ratio of 10X to 40X generated images are 24.16, 22.27 and 20.44, and the structural-similarity-index are 0.845, 0.680 and 0.512, which are better than other super-resolution networks. Visual scoring average and standard deviation from three pathologists is 3.63 plus-minus 0.52, 3.70 plus-minus 0.57 and 3.74 plus-minus 0.56 and the p value of analysis of variance is 0.37, indicating that generated images include sufficient information for diagnosis. The average value of Kappa test is 0.99, meaning the diagnosis of generated images is highly consistent with that of the real images. Conclusion. This proposed method can generate high-quality 10X, 20X, 40X images from 5X images at the same time, in which the time and storage costs of digitalization can be effectively reduced up to 1/64 of the previous costs. The proposed method provides a better alternative for low-cost storage, faster image share of digital pathology. Keywords. Digital pathology; Super-resolution; Low resolution scanning; Low cost
Deep neural network based methods have made a significant breakthrough in salient object detection. However, they are typically limited to input images with low resolutions ($400times400$ pixels or less). Little effort has been made to train deep neural networks to directly handle salient object detection in very high-resolution images. This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD). To our best knowledge, HRSOD is the first high-resolution saliency detection dataset to date. As another contribution, we also propose a novel approach, which incorporates both global semantic information and local high-resolution details, to address this challenging task. More specifically, our approach consists of a Global Semantic Network (GSN), a Local Refinement Network (LRN) and a Global-Local Fusion Network (GLFN). GSN extracts the global semantic information based on down-sampled entire image. Guided by the results of GSN, LRN focuses on some local regions and progressively produces high-resolution predictions. GLFN is further proposed to enforce spatial consistency and boost performance. Experiments illustrate that our method outperforms existing state-of-the-art methods on high-resolution saliency datasets by a large margin, and achieves comparable or even better performance than them on widely-used saliency benchmarks. The HRSOD dataset is available at https://github.com/yi94code/HRSOD.
Object detection in Ultra High-Resolution (UHR) images has long been a challenging problem in computer vision due to the varying scales of the targeted objects. When it comes to barcode detection, resizing UHR input images to smaller sizes often leads to the loss of pertinent information, while processing them directly is highly inefficient and computationally expensive. In this paper, we propose using semantic segmentation to achieve a fast and accurate detection of barcodes of various scales in UHR images. Our pipeline involves a modified Region Proposal Network (RPN) on images of size greater than 10k$times$10k and a newly proposed Y-Net segmentation network, followed by a post-processing workflow for fitting a bounding box around each segmented barcode mask. The end-to-end system has a latency of 16 milliseconds, which is $2.5times$ faster than YOLOv4 and $5.9times$ faster than Mask R-CNN. In terms of accuracy, our method outperforms YOLOv4 and Mask R-CNN by a $mAP$ of 5.5% and 47.1% respectively, on a synthetic dataset. We have made available the generated synthetic barcode dataset and its code at http://www.github.com/viplabB/SBD/.
The accuracy of OCR is usually affected by the quality of the input document image and different kinds of marred document images hamper the OCR results. Among these scenarios, the low-resolution image is a common and challenging case. In this paper, we propose the cascaded networks for document image super-resolution. Our model is composed by the Detail-Preserving Networks with small magnification. The loss function with perceptual terms is designed to simultaneously preserve the original patterns and enhance the edge of the characters. These networks are trained with the same architecture and different parameters and then assembled into a pipeline model with a larger magnification. The low-resolution images can upscale gradually by passing through each Detail-Preserving Network until the final high-resolution images. Through extensive experiments on two scanning document image datasets, we demonstrate that the proposed approach outperforms recent state-of-the-art image super-resolution methods, and combining it with standard OCR system lead to signification improvements on the recognition results.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا