ﻻ يوجد ملخص باللغة العربية
In this paper, we develop a binary convolutional encoder-decoder network (B-CEDNet) for natural scene text processing (NSTP). It converts a text image to a class-distinguished salience map that reveals the categorical, spatial and morphological information of characters. The existing solutions are either memory consuming or run-time consuming that cannot be applied to real-time applications on resource-constrained devices such as advanced driver assistance systems. The developed network can process multiple regions containing characters by one-off forward operation, and is trained to have binary weights and binary feature maps, which lead to both remarkable inference run-time speedup and memory usage reduction. By training with over 200, 000 synthesis scene text images (size of $32times128$), it can achieve $90%$ and $91%$ pixel-wise accuracy on ICDAR-03 and ICDAR-13 datasets. It only consumes $4.59 ms$ inference run-time realized on GPU with a small network size of 2.14 MB, which is up to $8times$ faster and $96%$ smaller than it full-precision version.
Modern mobile neural networks with a reduced number of weights and parameters do a good job with image classification tasks, but even they may be too complex to be implemented in an FPGA for video processing tasks. The article proposes neural network
High angular resolution diffusion imaging (HARDI) demands a lager amount of data measurements compared to diffusion tensor imaging, restricting its use in practice. In this work, we explore a learning-based approach to reconstruct HARDI from a smalle
Recently, segmentation-based methods are quite popular in scene text detection, as the segmentation results can more accurately describe scene text of various shapes such as curve text. However, the post-processing of binarization is essential for se
We introduce a new top-down pipeline for scene text detection. We propose a novel Cascaded Convolutional Text Network (CCTN) that joints two customized convolutional networks for coarse-to-fine text localization. The CCTN fast detects text regions ro
In this paper, we address the problem of having characters with different scales in scene text recognition. We propose a novel scale aware feature encoder (SAFE) that is designed specifically for encoding characters with different scales. SAFE is com