ﻻ يوجد ملخص باللغة العربية
Deep learning-based scene text detection methods have progressed substantially over the past years. However, there remain several problems to be solved. Generally, long curve text instances tend to be fragmented because of the limited receptive field size of CNN. Besides, simple representations using rectangle or quadrangle bounding boxes fall short when dealing with more challenging arbitrary-shaped texts. In addition, the scale of text instances varies greatly which leads to the difficulty of accurate prediction through a single segmentation network. To address these problems, we innovatively propose a two-stage segmentation based arbitrary text detector named textit{NASK} (textbf{N}eed textbf{A} textbf{S}econd lootextbf{K}). Specifically, textit{NASK} consists of a Text Instance Segmentation network namely textit{TIS} ((1^{st}) stage), a Text RoI Pooling module and a Fiducial pOint eXpression module termed as textit{FOX} ((2^{nd}) stage). Firstly, textit{TIS} conducts instance segmentation to obtain rectangle text proposals with a proposed Group Spatial and Channel Attention module (textit{GSCA}) to augment the feature expression. Then, Text RoI Pooling transforms these rectangles to the fixed size. Finally, textit{FOX} is introduced to reconstruct text instances with a more tighter representation using the predicted geometrical attributes including text center line, text line orientation, character scale and character orientation. Experimental results on two public benchmarks including textit{Total-Text} and textit{SCUT-CTW1500} have demonstrated that the proposed textit{NASK} achieves state-of-the-art results.
Arbitrary-shaped text detection is a challenging task since curved texts in the wild are of the complex geometric layouts. Existing mainstream methods follow the instance segmentation pipeline to obtain the text regions. However, arbitraryshaped text
Recently, end-to-end text spotting that aims to detect and recognize text from cluttered images simultaneously has received particularly growing interest in computer vision. Different from the existing approaches that formulate text detection as boun
Due to the large success in object detection and instance segmentation, Mask R-CNN attracts great attention and is widely adopted as a strong baseline for arbitrary-shaped scene text detection and spotting. However, two issues remain to be settled. T
Region proposal mechanisms are essential for existing deep learning approaches to object detection in images. Although they can generally achieve a good detection performance under normal circumstances, their recall in a scene with extreme cases is u
With the spread of social networks and their unfortunate use for hate speech, automatic detection of the latter has become a pressing problem. In this paper, we reproduce seven state-of-the-art hate speech detection models from prior work, and show t