Text and Non Text Scene Image Classification for Visually Impaired Through Alexnet Transfer Learning Model
Anilkumar B1, Sreerama Murthy Velaga2, A Aswani Devi3
1B. Anilkumar, Department of Engineering and Communication Engineering, GMRIT, Rajam, India.
2Sreerama Murthy Velaga, Department of Computer Science Engineering, GMRIT, Rajam, India.
3A. Aswani Devi, Department of Computer Science Engineering, GMRIT, Rajam, India.
Manuscript received on 08 April 2019 | Revised Manuscript received on 16 May 2019 | Manuscript published on 30 May 2019 | PP: 1125-1129 | Volume-8 Issue-1, May 2019 | Retrieval Number: A2152058119/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Text recognition in natural scene images has been an active research field in computer vision in recent years. Some assistive devices help visually impaired people perceive scene images through text extraction. The first and most crucial task of such a device is to detect text in scene images. This paper proposes a transfer-learning approach that uses a pre-trained CNN model to classify images as text or non-text. The pre-trained architecture, AlexNet, is used as a binary classifier. Its first 5 convolutional layers are frozen. The last 3 layers are fully connected, and the final output layer is resized to 2 because the classifier is binary. Images in the dataset must be preprocessed before either training or testing. Preprocessing consists of denoising and augmentation. Denoising removes noise from the input image using a Denoising Convolutional Neural Network (DnCNN). Data augmentation includes image resizing, because AlexNet accepts only RGB images of size 256×256. The proposed model achieves an accuracy of 99% on the test dataset.
Index Terms: Scene Text Detection, Transfer Learning, AlexNet, Classification, DnCNN.
Scope of the Article: Classification