ﻻ يوجد ملخص باللغة العربية
In this work, we propose a novel hybrid method for scene text detection namely Correlation Propagation Network (CPN). It is an end-to-end trainable framework engined by advanced Convolutional Neural Networks. Our CPN predicts text objects according to both top-down observations and the bottom-up cues. Multiple candidate boxes are assembled by a spatial communication mechanism call Correlation Propagation (CP). The extracted spatial features by CNN are regarded as node features in a latticed graph and Correlation Propagation algorithm runs distributively on each node to update the hypothesis of corresponding object centers. The CP process can flexibly handle scale-varying and rotated text objects without using predefined bounding box templates. Benefit from its distributive nature, CPN is computationally efficient and enjoys a high level of parallelism. Moreover, we introduce deformable convolution to the backbone network to enhance the adaptability to long texts. The evaluation on public benchmarks shows that the proposed method achieves state-of-art performance, and it significantly outperforms the existing methods for handling multi-scale and multi-oriented text objects with much lower computation cost.
Large geometry (e.g., orientation) variances are the key challenges in the scene text detection. In this work, we first conduct experiments to investigate the capacity of networks for learning geometry variances on detecting scene texts, and find tha
A novel framework named Markov Clustering Network (MCN) is proposed for fast and robust scene text detection. MCN predicts instance-level bounding boxes by firstly converting an image into a Stochastic Flow Graph (SFG) and then performing Markov Clus
Numerous scene text detection methods have been proposed in recent years. Most of them declare they have achieved state-of-the-art performances. However, the performance comparison is unfair, due to lots of inconsistent settings (e.g., training data,
Scene text detection, which is one of the most popular topics in both academia and industry, can achieve remarkable performance with sufficient training data. However, the annotation costs of scene text detection are huge with traditional labeling me
Recent learning-based approaches show promising performance improvement for scene text removal task. However, these methods usually leave some remnants of text and obtain visually unpleasant results. In this work, we propose a novel end-to-end framew