ﻻ يوجد ملخص باللغة العربية
Many studies on (Offline) Handwritten Text Recognition (HTR) systems have focused on building state-of-the-art models for line recognition on small corpora. However, adding HTR capability to a large scale multilingual OCR system poses new challenges. This paper addresses three problems in building such systems: data, efficiency, and integration. Firstly, one of the biggest challenges is obtaining sufficient amounts of high quality training data. We address the problem by using online handwriting data collected for a large scale production online handwriting recognition system. We describe our image data generation pipeline and study how online data can be used to build HTR models. We show that the data improve the models significantly under the condition where only a small number of real images is available, which is usually the case for HTR models. It enables us to support a new script at substantially lower cost. Secondly, we propose a line recognition model based on neural networks without recurrent connections. The model achieves a comparable accuracy with LSTM-based models while allowing for better parallelism in training and inference. Finally, we present a simple way to integrate HTR models into an OCR system. These constitute a solution to bring HTR capability into a large scale OCR system.
Offline handwriting recognition with deep neural networks is usually limited to words or lines due to large computational costs. In this paper, a less computationally expensive full page offline handwritten text recognition framework is introduced. T
Convolutional Recurrent Neural Networks (CRNNs) excel at scene text recognition. Unfortunately, they are likely to suffer from vanishing/exploding gradient problems when processing long text images, which are commonly found in scanned documents. This
Recently, great success has been achieved in offline handwritten Chinese character recognition by using deep learning methods. Chinese characters are mainly logographic and consist of basic radicals, however, previous research mostly treated each Chi
Single online handwritten Chinese character recognition~(single OLHCCR) has achieved prominent performance. However, in real application scenarios, users always write multiple Chinese characters to form one complete sentence and the contextual inform
Optical character recognition (OCR) systems performance have improved significantly in the deep learning era. This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variati