Efficient Urdu Caption Generation using Attention based LSTM

Added by Inaam Ilahi

Publication date 2020

Field Informatics Engineering

Research language English

Authors Inaam Ilahi - Hafiz Muhammad Abdullah Zia - Muhammad Ahtazaz Ahsan

Computation and Language Machine Learning

Recent advances in deep learning have created many opportunities to solve real-world problems that remained unsolved for more than a decade. Automatic caption generation is a major research field, and the research community has done a lot of work on it in the most common languages, such as English. Urdu is the national language of Pakistan and is also widely spoken and understood across the Pakistan-India subcontinent, yet no work has been done on caption generation for Urdu. Our research aims to fill this gap by developing an attention-based deep learning model that uses sequence modeling techniques specialized for the Urdu language. We have prepared a dataset in Urdu by translating a subset of the Flickr8k dataset containing 700 'man' images. We evaluate our proposed technique on this dataset and show that it can achieve a BLEU score of 0.83 in Urdu. We improve on the previous state-of-the-art by using better CNN architectures and optimization techniques. Furthermore, we discuss how the generated captions can be made grammatically correct.
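For readers unfamiliar with the BLEU metric quoted in the abstract, the following is a minimal sketch of a simplified, single-reference, sentence-level BLEU (clipped n-gram precision with a brevity penalty). The abstract does not say which BLEU variant, n-gram order, or smoothing the authors used, so the function names, the `max_n` parameter, and the English placeholder tokens are illustrative assumptions only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified BLEU: one reference, uniform weights, no smoothing.

    Assumption: this is a teaching sketch, not the scoring script
    used in the paper.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    c, r = len(candidate), len(reference)
    # Brevity penalty: penalize candidates shorter than the reference.
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Placeholder English tokens; real use would pass tokenized Urdu captions.
reference = "a man rides a brown horse".split()
candidate = "a man rides a horse".split()
score = bleu(candidate, reference, max_n=2)
```

A score of 1.0 means the candidate matches the reference exactly at every n-gram order considered; shorter or partially matching captions score lower.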

EL-Attention: Memory Efficient Lossless Attention for Generation

Yu Yan, Jiusheng Chen, Weizhen Qi, 2021

The Transformer model with multi-head attention requires caching intermediate results for efficient inference in generation tasks. However, the cache brings new memory-related costs and prevents leveraging larger batch sizes for faster speed. We propose memory-efficient lossless attention (called EL-attention) to address this issue. It avoids heavy operations for building multi-head keys and values, so no cache is needed for them. EL-attention constructs an ensemble of attention results by expanding the query while keeping the key and value shared. It produces the same result as multi-head attention with less GPU memory and faster inference speed. We conduct extensive experiments on Transformer, BART, and GPT-2 for summarization and question generation tasks. The results show EL-attention speeds up existing models by 1.6x to 5.3x without accuracy loss.
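The claim that expanding the query gives the same result as multi-head attention can be illustrated, for the attention scores of a single head, by the identity q·(Wx) = (Wᵀq)·x. The sketch below is a minimal pure-Python check of that identity and is not the authors' implementation: the function names, dimensions, and random values are illustrative assumptions, and bias terms, scaling, softmax, and the value path are omitted.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def scores_projected_keys(query, w_key, hidden_states):
    """Conventional route: build a per-head key for every position."""
    return [dot(query, matvec(w_key, x)) for x in hidden_states]


def scores_expanded_query(query, w_key, hidden_states):
    """Alternative route: fold the key projection into the query once,
    then score against the shared, unprojected hidden states."""
    expanded = matvec(transpose(w_key), query)
    return [dot(expanded, x) for x in hidden_states]


random.seed(0)
d_model, d_head, seq_len = 8, 4, 5  # hypothetical sizes
w_key = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_head)]
query = [random.uniform(-1, 1) for _ in range(d_head)]
hidden = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(seq_len)]

baseline = scores_projected_keys(query, w_key, hidden)
expanded = scores_expanded_query(query, w_key, hidden)
```

In the second route nothing per-head has to be stored for each position, which is consistent with the memory saving the abstract describes; the two score lists agree up to floating-point rounding.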

Computation and Language Machine Learning

Sea Ice Forecasting using Attention-based Ensemble LSTM

Sahara Ali, Yiyi Huang, Xin Huang, 2021

Accurately forecasting Arctic sea ice from subseasonal to seasonal scales has been a major scientific effort with fundamental challenges at play. In addition to physics-based earth system models, researchers have been applying multiple statistical and machine learning models for sea ice forecasting. Looking at the potential of data-driven sea ice forecasting, we propose an attention-based Long Short-Term Memory (LSTM) ensemble method to predict monthly sea ice extent up to 1 month ahead. Using daily and monthly satellite-retrieved sea ice data from NSIDC and atmospheric and oceanic variables from the ERA5 reanalysis product for 39 years, we show that our multi-temporal ensemble method outperforms several baseline and recently proposed deep learning models. This will substantially improve our ability to predict future Arctic sea ice changes, which is fundamental for forecasting transportation routes, resource development, coastal erosion, and threats to Arctic coastal communities and wildlife.

Atmospheric and Oceanic Physics Artificial Intelligence Machine Learning

Earlier Attention? Aspect-Aware LSTM for Aspect-Based Sentiment Analysis

Bowen Xing, Lejian Liao, Dandan Song, 2019

Aspect-based sentiment analysis (ABSA) aims to predict fine-grained sentiments of comments with respect to given aspect terms or categories. In previous ABSA methods, the importance of aspect has been realized and verified. Most existing LSTM-based models take aspect into account via the attention mechanism, where the attention weights are calculated after the context is modeled in the form of contextual vectors. However, aspect-related information may already be discarded and aspect-irrelevant information may be retained in classic LSTM cells during context modeling, a process that can be improved to generate more effective context representations. This paper proposes a novel variant of LSTM, termed aspect-aware LSTM (AA-LSTM), which incorporates aspect information into LSTM cells in the context modeling stage before the attention mechanism. Therefore, our AA-LSTM can dynamically produce aspect-aware contextual representations. We experiment with several representative LSTM-based models by replacing the classic LSTM cells with the AA-LSTM cells. Experimental results on SemEval-2014 datasets demonstrate the effectiveness of AA-LSTM.

Computation and Language

Fast Multi-language LSTM-based Online Handwriting Recognition

Victor Carbune, Pedro Gonnet, Thomas Deselaers, 2019

We describe an online handwriting system that is able to support 102 languages using a deep neural network architecture. This new system has completely replaced our previous Segment-and-Decode-based system and reduced the error rate by 20%-40% relative for most languages. Further, we report new state-of-the-art results on IAM-OnDB for both the open and closed dataset setting. The system combines methods from sequence recognition with a new input encoding using Bezier curves. This leads to up to 10x faster recognition times compared to our previous system. Through a series of experiments we determine the optimal configuration of our models and report the results of our setup on a number of additional public datasets.
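As background for the Bezier-curve input encoding mentioned above, here is a minimal sketch of evaluating a cubic Bezier curve from four control points. The abstract does not specify the curve degree or how strokes are fitted, so the cubic form, the function names, and the sample control points are assumptions for illustration only.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    Each control point is an (x, y) tuple; the result is the point
    on the curve given by the Bernstein polynomial weights.
    """
    u = 1 - t
    return tuple(
        u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_curve(p0, p1, p2, p3, steps=10):
    """Return steps + 1 evenly spaced (in t) points along the curve."""
    return [cubic_bezier(p0, p1, p2, p3, i / steps) for i in range(steps + 1)]


# Hypothetical control points describing one smooth pen-stroke segment.
points = sample_curve((0, 0), (1, 2), (3, 2), (4, 0), steps=4)
```

Four control points summarize a whole stroke segment, so a sequence of such segments is much shorter than the raw sequence of pen samples, which fits the faster recognition times the abstract reports.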

Computation and Language Machine Learning Machine Learning

Clue: Cross-modal Coherence Modeling for Caption Generation

Malihe Alikhani, Piyush Sharma, Shengjie Li, 2020

We use coherence relations inspired by computational models of discourse to study the information needs and goals of image captioning. Using an annotation protocol specifically devised for capturing image-caption coherence relations, we annotate 10,000 instances from publicly available image-caption pairs. We introduce a new task for learning inferences in imagery and text, coherence relation prediction, and show that these coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also train coherence-aware, controllable image captioning models. The results show a dramatic improvement in the consistency and quality of the generated captions with respect to information needs specified via coherence relations.

Computation and Language Computer Vision and Pattern Recognition
