Classification of Text Documents using Adaptive Robust Classifier
E Chandra Blessie1, Deepa A2
1E Chandra Blessie, Department of Computer Applications, Nehru College Of Management, Coimbatore, (Tamil Nadu), India.
2Deepa A, Department of Computer Applications, Nehru College Of Management, Coimbatore, (Tamil Nadu), India.
Manuscript received on 23 March 2019 | Revised Manuscript received on 30 March 2019 | Manuscript published on 30 March 2019 | PP: 1482-1489 | Volume-7 Issue-6, March 2019 | Retrieval Number: F2662037619/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Classifying documents by extracting keywords has become an important direction of research in text mining. The main purpose of keyword extraction is to represent documents concisely, and such compact representations serve many applications. Classifying documents according to their keywords has therefore become a major task. Most classifiers are suitable only for datasets that hold a small number of documents. In this paper, an adaptive robust classifier (ARC) is proposed to classify the documents in a dataset of any size with better accuracy. ARC is designed to segregate the document dataset into multiple parts and perform classification in a random manner, whereas existing classifiers perform classification in a sequential manner, which leads to poor classification of documents. Existing classifiers were designed to fit only a specific type of dataset or one of a specific size, whereas ARC is designed to fit document datasets of any size. To evaluate the performance of the classifiers, this research work uses three datasets: the ACM Document collection dataset, Reuters-21578, and the NBA Input document collection dataset of a B-School, which hold 3506, 21578, and 1256 documents respectively. The results show that ARC performs better than the baseline classifiers in terms of Classification Accuracy and F-Measure.
Keywords: Classification, Mining, Text, NBA, ACM, Reuters
Scope of the Article: Classification.