Wavenet Based Speech Recognition System on An Embedded Board
Aniruddha Sharma [1], John Sahaya Rani Alex [2]

[1] Aniruddha Sharma, School of Electronics Engineering, Vellore Institute of Technology, Chennai (Tamil Nadu), India.
[2] John Sahaya Rani Alex, School of Electronics Engineering, Vellore Institute of Technology, Chennai (Tamil Nadu), India.
Manuscript received on 27 March 2019 | Revised Manuscript received on 08 April 2019 | Manuscript Published on 18 April 2019 | PP: 923-927 | Volume-7 Issue-6S March 2019 | Retrieval Number: F03880376S19/2019©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Speech recognition is an integral part of many modern systems and has become ubiquitous, with applications in automation, voice control, security, and robotics. This paper demonstrates an implementation of isolated spoken-word recognition based on WaveNets on an embedded board, using the open-source numerical computing software GNU Octave. Here, a WaveNet is an Artificial Neural Network (ANN) that uses a wavelet function as its activation function; in this work, the Gaussian wavelet is chosen. Mel-Frequency Cepstral Coefficient (MFCC) features are calculated from the speech signal and fed as input to the neural network. A Multi-Layer Perceptron (MLP) feed-forward neural network is configured to process these features and is trained with the back-propagation algorithm in MATLAB. The trained weights are then loaded into GNU Octave running on a Raspberry Pi, where recognition is performed. Texas Instruments’ TIDIGITS corpus is used for training and testing the neural network.
Keywords: Mel Frequency, Neural Network, Raspberry Pi, Wavenet.
Scope of the Article: Pattern Recognition
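
The inference step described in the abstract (MFCC features passed through an MLP whose hidden units use a Gaussian wavelet activation) can be illustrated with a minimal sketch. This is not the authors' MATLAB or GNU Octave code. The wavelet form (first derivative of a Gaussian), the softmax output layer, the layer sizes, and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random


def gaussian_wavelet(x):
    """Assumed wavelet form: psi(x) = -x * exp(-x^2 / 2) (first derivative of a Gaussian)."""
    return -x * math.exp(-0.5 * x * x)


def softmax(scores):
    """Numerically stable softmax (an assumed output layer, not taken from the paper)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: features -> wavelet hidden layer -> class probabilities."""
    hidden = [
        gaussian_wavelet(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    scores = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    return softmax(scores)


def predict(features, w_hidden, b_hidden, w_out, b_out):
    """Return the index of the most probable word class."""
    probs = forward(features, w_hidden, b_hidden, w_out, b_out)
    return max(range(len(probs)), key=probs.__getitem__)


def random_layer(n_out, n_in, rng):
    """Hypothetical stand-in for trained weights: small random values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical sizes: 13 MFCC inputs, 20 hidden units, 11 word classes.
    w_h, b_h = random_layer(20, 13, rng)
    w_o, b_o = random_layer(11, 20, rng)
    mfcc = [rng.uniform(-1.0, 1.0) for _ in range(13)]  # placeholder feature vector
    print("predicted class index:", predict(mfcc, w_h, b_h, w_o, b_o))
```

In the workflow the abstract describes, the random weights above would be replaced by the weights obtained from back-propagation training, and the placeholder vector by MFCC features extracted from a recorded utterance.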