A Different Text Mining Process for Classifying Journal Databases Using Machine Learning Algorithms
G. Thailambal1, Ananthi Sheshasaayee2
1Dr. G. Thailambal, Associate Professor, Department of Computer Science, Vels Institute of Science, Technology and Advanced Studies, Chennai (Tamil Nadu), India.
2Dr. Ananthi Sheshasaayee, Associate Professor & Head, PG & Research, Department of Computer Science, Quaid-E-Millath Government College for Women Autonomous, Chennai (Tamil Nadu), India.
Manuscript received on 10 October 2019 | Revised Manuscript received on 19 October 2019 | Manuscript Published on 02 November 2019 | PP: 239-243 | Volume-8 Issue-2S11 September 2019 | Retrieval Number: B10390982S1119/2019©BEIESP | DOI: 10.35940/ijrte.B1039.0982S1119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Google serves as an information repository for the entire world and is an important search engine for information retrieval. Access to web pages is increasing rapidly every day. The biggest challenge is identifying the user's interest and providing the most relevant information. Researchers search journal documents for their work every day. Classifying content as papers, slides, or theses is very difficult because the words used in these documents are not semantically checked. Most researchers use data mining to mine the correct content from web pages, and text mining is one of its applications. Text mining, in a nutshell, is the extraction of useful information from unstructured data. The proposed model, Author Keyword Weightage in Journal Ranking (AKWJR), is developed to retrieve relevant journals and help researchers identify the relevant documents within a pool of irrelevant ones. In many keyword-ranking applications, such as RAKE and TextRank, author-annotated keywords are compared and used for ranking. Authors assign keywords to their articles in differing forms and from differing perspectives. Although these keywords are not chosen from a controlled vocabulary, they describe the authors' own content in the article. Two algorithms are used to arrange the keywords according to topics, and the keywords inside each journal are scored depending on their presence in the various fields of the article. Depending on the score, the journals are ranked so that the researcher can decide whether to open an article for their requirement. This is achieved through the Latent Dirichlet Allocation, RankSVM, and TF-IDF algorithms.
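The field-based keyword scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' AKWJR implementation: the field names, the field weights, and the smoothed IDF formula are assumptions chosen for illustration, and the LDA topic grouping and RankSVM ranking steps are omitted entirely. The sketch only shows how a keyword's TF-IDF value might be weighted by the field in which it appears, and how articles could then be sorted by the resulting score.

```python
import math
from collections import Counter

# Hypothetical field weights; the abstract does not state the actual values.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]


def idf(term, articles):
    """Smoothed inverse document frequency over the scored fields."""
    df = sum(
        1
        for a in articles
        if any(term in tokenize(a.get(f, "")) for f in FIELD_WEIGHTS)
    )
    return math.log((1 + len(articles)) / (1 + df)) + 1.0


def score(article, query_terms, articles):
    """Sum of field-weighted TF-IDF values for the query terms."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(article.get(field, ""))
        if not tokens:
            continue
        counts = Counter(tokens)
        for term in query_terms:
            tf = counts[term] / len(tokens)
            total += weight * tf * idf(term, articles)
    return total


def rank(articles, query):
    """Return the articles sorted from highest to lowest score."""
    terms = tokenize(query)
    return sorted(articles, key=lambda a: score(a, terms, articles), reverse=True)


# Made-up sample records, for illustration only.
ARTICLES = [
    {"id": "A", "title": "Text mining for journal ranking",
     "keywords": "text mining, ranking",
     "abstract": "We rank journals using keywords."},
    {"id": "B", "title": "Image segmentation",
     "keywords": "vision",
     "abstract": "A short note on text in images."},
    {"id": "C", "title": "Graph theory",
     "keywords": "graphs",
     "abstract": "Nothing related."},
]

ranked_ids = [a["id"] for a in rank(ARTICLES, "text mining")]
print(ranked_ids)
```

Weighting the title and author keywords more heavily than the abstract reflects the idea that a keyword's position in the article signals its importance; in practice such weights would need to be tuned or learned rather than fixed by hand.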
Keywords: Machine Learning Algorithms, Text Mining Process, Data Classifying.
Scope of the Article: Machine Learning