Performance of Isolated and Continuous Digit Recognition System using Kaldi Toolkit
Mahadevaswamy¹, D J Ravi²

¹Mahadevaswamy, Research Scholar, Department of Electronics and Communication, Vidyavardhaka College of Engineering, Mysuru, Visvesvaraya Technological University, Belagavi (Karnataka), India.
²D J Ravi, Dean Academics & Professor, Department of Electronics & Communication, Vidyavardhaka College of Engineering, Mysuru, Visvesvaraya Technological University, Belagavi (Karnataka), India.
Manuscript received on 27 June 2019 | Revised Manuscript received on 15 July 2019 | Manuscript Published on 26 July 2019 | PP: 264-271 | Volume-8 Issue-2S2 July 2019 | Retrieval Number: B10470782S219/2019©BEIESP | DOI: 10.35940/ijrte.B1047.0782S219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: A digit recognition system is built to recognize sequences of the digits 0 through 9. The system is evaluated on a speech corpus recorded in a room environment. Acoustic information is converted to a feature representation using PLP and MFCC features. The system initially uses the conventional GMM-HMM framework, a state-of-the-art hybrid classifier with a varied number of states, to perform the speech recognition task: it is first trained and tested using monophone models, and its recognition accuracy is then evaluated using triphone models, namely Triphone1 models followed by Triphone2 and Triphone3 models. An N-gram language model is used for both monophone and triphone training. System performance with the MFCC and PLP parameterisation techniques is evaluated on the Kaldi toolkit using two metrics, word error rate (WER) and word recognition accuracy (WRA). The proposed system can be used to build speech applications.
Keywords: PLP, MFCC, GMM-HMM, Kaldi Toolkit.
Scope of the Article: Pattern Recognition
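
The two evaluation metrics named in the abstract, WER and WRA, can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulas, so this assumes the common definitions: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and WRA is 1 - WER. The function names `edit_distance`, `wer`, and `wra` are hypothetical and are not part of Kaldi.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of word tokens."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a fraction (assumed definition: edits / reference length)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def wra(reference, hypothesis):
    """Word recognition accuracy, assumed here to be 1 - WER."""
    return 1.0 - wer(reference, hypothesis)


if __name__ == "__main__":
    # Illustrative digit strings only; these are not data from the paper.
    reference = "one two three four"
    hypothesis = "one two tree four"
    print(f"WER = {wer(reference, hypothesis):.2%}")
    print(f"WRA = {wra(reference, hypothesis):.2%}")
```

Under this definition, WER can exceed 1 when the hypothesis contains many insertions, in which case WRA becomes negative.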