Text recognition is a popular research topic because of its broad applications. In this work, we uncover an implicit task within traditional text recognition, character counting, which requires no additional annotation cost. The implicit task serves as an auxiliary branch that complements sequential recognition. We design a two-branch reciprocal feature learning framework to make full use of the features from both tasks. By exploiting the complementary effect between the explicit and implicit tasks, the features are reliably enhanced. Extensive experiments on 7 benchmarks show the advantages of the proposed method in both text recognition and the newly built character counting task. In addition, the framework is convenient yet effective to combine with various networks and tasks. We provide extensive ablation studies and generalization experiments that offer a deeper understanding of the tasks. Code is available.
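To illustrate why character counting needs no extra annotation, the following minimal sketch derives count targets directly from existing transcriptions. It is not the authors' implementation: the helper names `count_targets` and `joint_loss`, the clipping value `max_count`, and the weighted-sum combination of the two losses are all assumptions made for illustration.

```python
from typing import List


def count_targets(transcriptions: List[str], max_count: int = 25) -> List[int]:
    """Derive character-count labels from existing text labels.

    Hypothetical helper: the count target is simply the length of each
    transcription, clipped to an assumed maximum (max_count is our choice).
    """
    return [min(len(text), max_count) for text in transcriptions]


def joint_loss(recognition_loss: float, counting_loss: float,
               weight: float = 1.0) -> float:
    """Combine the two branch losses.

    Assumed weighted sum; the abstract does not state how the recognition
    and counting objectives are combined.
    """
    return recognition_loss + weight * counting_loss


if __name__ == "__main__":
    labels = ["hello", "OCR", ""]
    print(count_targets(labels))  # [5, 3, 0]
```

Because the count is computed from the transcription itself, any dataset that already carries recognition labels can supply counting labels for free, which is the sense in which the task is implicit.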
In this paper, we address the problem of having characters with different scales in scene text recognition. We propose a novel scale aware feature encoder (SAFE) that is designed specifically for encoding characters with different scales. SAFE is com
Text detection, the key technology for understanding scene text, has become an attractive research topic. For detecting various scene texts, researchers propose plenty of detectors with different advantages: detection-based models enjoy fast detectio
The pursuit of high performance on public benchmarks has been the driving force for research in scene text recognition, and notable progress has been achieved. However, a close investigation reveals a startling fact that the state-of-the-art methods
Scene text recognition has been an important, active research topic in computer vision for years. Previous approaches mainly consider text as 1D signals and cast scene text recognition as a sequence prediction problem, by means of CTC or attention bas
In this work, we aim to address the 3D scene stylization problem - generating stylized images of the scene at arbitrary novel view angles. A straightforward solution is to combine existing novel view synthesis and image/video style transfer approache