Image Description using Attention Mechanism
Abhilash Pandurangan¹, Vignesh Prabhakar², Poovammal E³
¹Abhilash Pandurangan, Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai (Tamil Nadu), India.
²Vignesh Prabhakar, Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai (Tamil Nadu), India.
³Poovammal E, Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai (Tamil Nadu), India.
Manuscript received on 18 October 2019 | Revised Manuscript received on 25 October 2019 | Manuscript Published on 02 November 2019 | PP: 3290-3293 | Volume-8 Issue-2S11 September 2019 | Retrieval Number: B15550982S1119/2019©BEIESP | DOI: 10.35940/ijrte.B1555.0982S1119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Image description is the task of generating a textual description of an image, and it is essential to the problem of image understanding. The variable and ambiguous nature of possible image descriptions makes this task challenging. Several approaches to automated image captioning aim to explain the image contents with a complete understanding of the image, rather than simply classifying it into a particular object type. However, learning image context from text and generating descriptions similar to those written by humans requires focusing on the important features of the image using an attention mechanism. We provide an outline of recent work on image description models that employ various attention mechanisms. We present an analysis of the approaches, datasets, and evaluation metrics used for image description. We showcase a model using the encoder-decoder attention mechanism based on the Flickr dataset and evaluate its performance using BLEU metrics.
Keywords: Image Caption Generation; Attention Mechanism; Encoder and Decoder; Deep Neural Networks.
Scope of the Article: Image analysis and Processing
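
Illustrative sketch: the abstract refers to an encoder-decoder attention mechanism that focuses on the important features of the image. The snippet below is a minimal, hypothetical illustration of one soft-attention step, not the authors' model. It assumes dot-product scoring between encoder region features and the decoder hidden state (the paper may use a different scoring function), and the names `soft_attention`, `features`, and `hidden` are invented for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, hidden):
    """One soft-attention step (assumed dot-product scoring).

    features: list of encoder vectors, one per image region
    hidden:   decoder hidden state, same dimension as each feature vector
    Returns the attention weights and the weighted-sum context vector.
    """
    # Score each image region against the current decoder state.
    scores = [sum(f_k * h_k for f_k, h_k in zip(f, hidden)) for f in features]
    weights = softmax(scores)
    # Context vector: attention-weighted average of the region features.
    dim = len(features[0])
    context = [sum(w * f[k] for w, f in zip(weights, features)) for k in range(dim)]
    return weights, context


# Toy input: three image regions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = soft_attention(features, hidden)
```

In a captioning decoder, the context vector would typically be combined with the previous word embedding to predict the next word, so regions that score higher against the decoder state contribute more to each generated word.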
