Text Document Categorization using Modified K-Means Clustering Algorithm
Sheelesh Kumar Sharma1, Navel Kishor Sharma2
1Dr. Sheelesh Kumar Sharma, Professor, Department of IT, Comp. Sc., IMS Ghaziabad (U.P), India.
2Mr. Navel Kishor Sharma, Associate Dean, Academic City College Ghana.
Manuscript received on 05 August 2019 | Revised Manuscript received on 28 August 2019 | Manuscript Published on 05 September 2019 | PP: 508-511 | Volume-8 Issue-2S7 July 2019 | Retrieval Number: B10950782S719/2019©BEIESP | DOI: 10.35940/ijrte.B1095.0782S719
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The volume of information to be managed is growing at an exponential pace, which raises the challenge of managing such large data collections effectively. The performance of a document management system can be measured on several parameters, such as the time taken to retrieve data and the similarity of documents placed in the same cluster. This paper presents an approach to automatic document categorization using a modified k-means algorithm. The proposed methodology has been tested on three different data sets. Experimental findings suggest that it is robust and produces accurate document clusters. The methodology combines the cosine similarity measure with a fuzzy k-means clustering approach to produce results quickly and accurately.
Keywords: K-Means, Text Mining, Web Mining.
Scope of the Article: Web and Text Mining
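Illustrative sketch: The abstract names two ingredients, cosine similarity and fuzzy k-means clustering, without giving details. The following Python sketch shows one generic way these can be combined; it is not the authors' modified algorithm. All function names (tf_vector, cosine_similarity, fuzzy_memberships, update_centroid, fuzzy_kmeans), the raw term-frequency representation, the fuzzifier m = 2, and the naive choice of the first k documents as initial centroids are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuzzy_memberships(doc, centroids, m=2.0, eps=1e-9):
    """Standard fuzzy c-means membership of one document in each cluster.

    Assumption: distance is taken as 1 - cosine similarity; eps avoids
    division by zero when a document coincides with a centroid.
    """
    dists = [max(1.0 - cosine_similarity(doc, c), eps) for c in centroids]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def update_centroid(docs, weights, m=2.0):
    """Membership-weighted sum of document vectors.

    The sum is left unnormalized because cosine similarity ignores
    vector length.
    """
    centroid = Counter()
    for doc, w in zip(docs, weights):
        for term, count in doc.items():
            centroid[term] += (w ** m) * count
    return centroid


def fuzzy_kmeans(docs, k, iterations=10, m=2.0):
    """Alternate membership and centroid updates for a fixed number of rounds.

    Assumption: the first k documents serve as initial centroids, which is
    a naive initialization chosen only to keep the sketch deterministic.
    """
    centroids = [Counter(d) for d in docs[:k]]
    for _ in range(iterations):
        memberships = [fuzzy_memberships(d, centroids, m) for d in docs]
        centroids = [
            update_centroid(docs, [u[j] for u in memberships], m)
            for j in range(k)
        ]
    return [fuzzy_memberships(d, centroids, m) for d in docs]
```

Calling fuzzy_kmeans on a list of tf_vector outputs returns, for each document, a list of k membership degrees that sum to one. A hard category can be read off by taking the cluster with the largest membership. Because the initialization is naive, the result depends on document order.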