Development of End-to-End Encoder-Decoder Model Applying Voice Recognition System in Different Channels
Syed Jamalullah R¹, L. Mary Gladence², V. Rajalakshmi³
¹Syed Jamalullah R, Research Scholar, Department of Computing, Sathyabama Institute of Science and Technology, Chennai (Tamil Nadu), India.
²L. Mary Gladence, Department of Computing, Sathyabama Institute of Science and Technology, Chennai (Tamil Nadu), India.
³V. Rajalakshmi, Department of Computing, Sathyabama Institute of Science and Technology, Chennai (Tamil Nadu), India.
Manuscript received on 15 October 2019 | Revised Manuscript received on 24 October 2019 | Manuscript Published on 02 November 2019 | PP: 2350-2352 | Volume-8 Issue-2S11 September 2019 | Retrieval Number: B12670982S1119/2019©BEIESP | DOI: 10.35940/ijrte.B1267.0982S1119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: In Automatic Speech Recognition (ASR), the dissimilarity between a recognized word sequence and its ground truth across different channels is measured by the Word Error Rate (WER), the standard evaluation metric. In the single-channel (1ch) track, the model is trained without any preprocessing, and studies on multichannel end-to-end ASR indicate that this function can be integrated into a Deep Neural Network (DNN)-based system, leading to multiple experimental results. Moreover, because WER is not directly differentiable, it is pertinent to adopt a gradient-based objective function for the encoder-decoder model, as has been made clear on the CHiME-4 system. In this study, we examine whether the sequence-level evaluation metric is a fair choice for optimizing the encoder-decoder model, for which many training algorithms are designed to reduce sequence-level error. The study incorporates the scoring of multiple hypotheses in the decoding stage to improve the decoding result. In this way, the mismatch between the objectives is reduced as far as is feasible. Hence, the study identifies the voice recognition result that is most effective for adaptation.
Keywords: Multichannel system, Word Error Rate (WER), Automatic Speech Recognition (ASR).
Scope of the Article: Pattern Recognition
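
For readers unfamiliar with the metric discussed in the abstract, the following is a minimal Python sketch of how WER is commonly computed (word-level edit distance divided by the reference length), together with an expected WER over a list of scored hypotheses. It is an illustration under stated assumptions, not the authors' implementation: the function names `edit_distance`, `wer`, and `expected_wer`, the N-best input format, and the renormalization of hypothesis scores over the list are all hypothetical choices made for this example.

```python
import math


def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two word lists."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def expected_wer(reference, nbest):
    """Expected WER over an N-best list of (hypothesis, log_prob) pairs.

    Assumption for this sketch: hypothesis probabilities are renormalized
    over the list, so the weights sum to one.
    """
    max_lp = max(lp for _, lp in nbest)
    weights = [math.exp(lp - max_lp) for _, lp in nbest]  # stable softmax
    total = sum(weights)
    return sum(
        (w / total) * wer(reference, hyp)
        for (hyp, _), w in zip(nbest, weights)
    )
```

Because `wer` itself is a discrete count, it offers no gradient with respect to model parameters; a probability-weighted quantity such as `expected_wer` depends smoothly on the hypothesis scores, which is one common way a sequence-level error can be turned into a trainable objective.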
