Image Description using Attention Mechanism
Abhilash Pandurangan1, Vignesh Prabhakar2, Poovammal E3

1Abhilash Pandurangan, Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai (Tamil Nadu), India.
2Vignesh Prabhakar, Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai (Tamil Nadu), India.
3Poovammal E, Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai (Tamil Nadu), India.
Manuscript received on 18 October 2019 | Revised Manuscript received on 25 October 2019 | Manuscript Published on 02 November 2019 | PP: 3290-3293 | Volume-8 Issue-2S11 September 2019 | Retrieval Number: B15550982S1119/2019©BEIESP | DOI: 10.35940/ijrte.B1555.0982S1119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Image description involves generating a textual description of an image and is essential to the problem of image understanding. The variable and ambiguous nature of possible image descriptions makes this task challenging. Different approaches to automated image captioning aim to explain the image contents with a complete understanding of the image, rather than simply classifying it into a particular object type. However, learning image context from text and generating descriptions similar to those written by humans requires focusing on the important features of the image using an attention mechanism. We provide an outline of recent work on image description models that employ various attention mechanisms. We present an analysis of the approaches, datasets and evaluation metrics utilized for image description. We showcase a model based on the encoder-decoder attention mechanism, trained on the Flickr dataset, and evaluate its performance using BLEU metrics.
Keywords: Image Caption Generation; Attention Mechanism; Encoder and Decoder; Deep Neural Networks.
Scope of the Article: Image analysis and Processing
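As an illustration of the soft attention step that encoder-decoder captioning models of the kind described in the abstract rely on, the sketch below computes Bahdanau-style attention weights over CNN image features in plain NumPy. The feature-map size, hidden dimension, and randomly initialised weights are illustrative assumptions, not the configuration used in this paper.

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Encoder output: L = 196 spatial locations (14x14 grid), each a D = 512-dim feature
# vector; H is the decoder hidden size. All sizes are assumed for illustration.
L, D, H = 196, 512, 256
features = np.random.randn(L, D)        # a_i, i = 1..L (CNN feature vectors)
h_prev   = np.random.randn(H)           # previous decoder hidden state h_{t-1}

# Learned projections (randomly initialised here, trained in a real model).
W_a = np.random.randn(D, H) * 0.01      # projects image features
W_h = np.random.randn(H, H) * 0.01      # projects decoder state
v   = np.random.randn(H) * 0.01         # scoring vector

# Alignment scores e_i = v^T tanh(W_a a_i + W_h h_{t-1})
scores  = np.tanh(features @ W_a + h_prev @ W_h) @ v   # shape (L,)
alpha   = softmax(scores)                               # attention weights, sum to 1
context = alpha @ features                              # weighted sum, shape (D,)

# The context vector is fed to the decoder (e.g., an LSTM) to predict the next
# word, so each generated word attends to the most relevant image regions.
print(context.shape, alpha.sum())

At decoding time this computation is repeated at every word step, so the attention weights alpha shift across image regions as the caption is generated.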