
Large Vocabulary Arabic Online Handwriting Recognition System

Published by Ibrahim Abdelaziz
Publication date: 2014
Research field: Informatics Engineering
Language: English





Arabic script is consonantal and cursive. Its analysis is further complicated by the obligatory dots and strokes that are placed above or below most letters and are usually written later, out of sequence (delayed strokes). Because of ambiguities and the diversity of writing styles, recognition systems are generally based on a set of possible words called a lexicon. When the lexicon is small, recognition accuracy is the main concern because recognition time is minimal. With a large lexicon, on the other hand, both recognition speed and accuracy are critical. Arabic is rich in morphology and syntax, which makes its lexicon large. Therefore, a practical online handwriting recognition system should be able to handle a large lexicon with reasonable performance in terms of both accuracy and time. In this paper, we introduce a fully-fledged Hidden Markov Model (HMM) based system for Arabic online handwriting recognition that provides solutions for most of the difficulties inherent in recognizing the Arabic script. A new preprocessing technique for handling delayed strokes is introduced. We use advanced modeling techniques to build our recognition system from the training data in order to represent the differences between the writing units in more detail, minimize the variance between writers in the training data, and obtain a better representation of the feature space. System results are enhanced by an additional post-processing step with a higher-order language model and cross-word HMM models. The system performance is evaluated using two different databases covering small and large lexicons. Our system outperforms the state-of-the-art systems on the small-lexicon database. Furthermore, it shows promising results (accuracy and time) when supporting a large lexicon, with the possibility of adapting the models to specific writers for even better results.




Read also

Arabic is a Semitic language characterized by a complex and rich morphology. The exceptional degree of ambiguity in the writing system, the rich morphology, and the highly complex word formation process of roots and patterns all contribute to making computational approaches to Arabic very challenging. As a result, a practical handwriting recognition system should support a large vocabulary to provide high coverage and should use context information for disambiguation. Several research efforts have been devoted to building online Arabic handwriting recognition systems. Most of these methods use either small private test data sets or a standard database with limited lexicon and coverage. A large-scale handwriting database is an essential resource that can advance research on online handwriting recognition. Currently, there is no online Arabic handwriting database with a large lexicon, high coverage, a large number of writers, and training/testing data. In this paper, we introduce AltecOnDB, a large-scale online Arabic handwriting database. AltecOnDB has 98% coverage of all the possible PAWs (pieces of Arabic words) of the Arabic language. The collected samples are complete sentences that include digits and punctuation marks. The collected data is available at the sentence, word, and character levels; hence, high-level linguistic models can be used for performance improvements. Data is collected from more than 1000 writers with different backgrounds, genders, and ages. Annotation and verification tools were developed to facilitate the annotation and verification phases. We built an elementary recognition system to test our database and to show the difficulties that arise when handling a large vocabulary and large amounts of style variation in the collected data.
In recent years, multidimensional recurrent neural networks (MDRNN) have proven to perform very well for offline handwriting recognition tasks such as the OpenHaRT 2013 evaluation DIR. With suitable writing preprocessing and dictionary lookup, our ARGUS software completed this task with an error rate of 26.27% in its primary setup.
This paper introduces an agent-centric approach to handle novelty in the visual recognition domain of handwriting recognition (HWR). An ideal transcription agent would rival or surpass human perception, being able to recognize known and new characters in an image, and detect any stylistic changes that may occur within or across documents. A key confound is the presence of novelty, which has continued to stymie even the best machine learning-based algorithms for these tasks. In handwritten documents, novelty can be a change in writer, character attributes, writing attributes, or overall document appearance, among other things. Instead of looking at each aspect independently, we suggest that an integrated agent that can process known characters and novelties simultaneously is a better strategy. This paper formalizes the domain of handwriting recognition with novelty, describes a baseline agent, introduces an evaluation protocol with benchmark data, and provides experimentation to set the state-of-the-art. Results show feasibility for the agent-centric approach, but more work is needed to approach human levels of reading ability, giving the HWR community a formal basis to build upon as they solve this challenging problem.
We attempt to overcome the restriction of requiring a writing surface for handwriting recognition. In this study, we design a prototype of a stylus equipped with a motion sensor and utilize gyroscopic and acceleration sensor readings to perform written-letter classification using various deep learning techniques such as CNNs and RNNs. We also explore various data augmentation techniques and their effects, reaching up to 86% accuracy.
Several approaches have been proposed in recent literature to alleviate the long-tail problem, mainly in object classification tasks. In this paper, we make the first large-scale study concerning the task of Long-Tail Visual Relationship Recognition (LTVRR). LTVRR aims at improving the learning of structured visual relationships that come from the long tail (e.g., rabbit grazing on grass). In this setup, the subject, relation, and object classes each follow a long-tail distribution. To begin our study and make a future benchmark for the community, we introduce two LTVRR-related benchmarks, dubbed VG8K-LT and GQA-LT, built upon the widely used Visual Genome and GQA datasets. We use these benchmarks to study the performance of several state-of-the-art long-tail models on the LTVRR setup. Lastly, we propose a visiolinguistic hubless (VilHub) loss and a Mixup augmentation technique adapted to the LTVRR setup, dubbed RelMix. Both VilHub and RelMix can be easily integrated on top of existing models and, despite being simple, our results show that they can remarkably improve performance, especially on tail classes. Benchmarks, code, and models have been made available at: https://github.com/Vision-CAIR/LTVRR.