Information Extraction using Tokenization and Clustering Methods
Jincymol Joseph1, J R Jeba2
1Jincymol Joseph, Assistant Professor, Department of Computer Science, St. Pius X College, Rajapuram, Kasargod, Kerala.
2Dr. J R Jeba, Associate Professor & HOD, Department of Computer Applications, Noorul Islam Centre of Higher Education, Kumaracoil, Tamil Nadu, India.
Manuscript received on November 20, 2019. | Revised Manuscript received on November 26, 2019. | Manuscript published on 30 November, 2019. | PP: 3690-3692 | Volume-8 Issue-4, November 2019. | Retrieval Number: D7943118419/2019©BEIESP | DOI: 10.35940/ijrte.D7943.118419
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The World Wide Web has become a powerful platform for storing and retrieving information. It is a collection of text, image, and multimedia files in structured, semi-structured, and unstructured form. Such large volumes of information cannot be processed by simple means, so efficient algorithms are required to discover useful information in the data. Text mining is a method for extracting meaningful information from large volumes of data. Unstructured text is easy for humans to process but much harder for machines. Text mining tasks involve methods such as tokenization, feature extraction, and clustering.
Keywords: Tokenization, Feature Extraction, Cluster, Stemming.
Scope of the Article: Cloud, Cluster, Grid and P2P Computing.
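
The abstract names tokenization, feature extraction, and clustering as the steps of a text mining task. The following is a minimal Python sketch of how such a pipeline could fit together, using only the standard library. It is an illustration under stated assumptions, not the authors' method: the function names (`tokenize`, `stem`, `extract_features`, `cosine`, `cluster`), the stop-word list, the crude suffix-stripping stemmer, the term-frequency features, and the greedy threshold-based clustering are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list (illustrative only).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer, not Porter's algorithm."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text):
    """Term-frequency vector over stemmed tokens that are not stop words."""
    return Counter(stem(t) for t in tokenize(text) if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering.

    Each document joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [extract_features(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    sample_docs = [
        "Clustering groups similar documents",
        "Document clustering finds groups of texts",
        "Stock prices fell sharply today",
        "Today the stock market prices dropped",
    ]
    print(cluster(sample_docs))  # prints [[0, 1], [2, 3]]
```

The single-pass clustering keeps the sketch short and deterministic. A practical system would more likely use an established stemmer, weighted features such as TF-IDF, and an algorithm such as k-means or hierarchical clustering.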