Predicting Outliers and Ranking Web Documents Using Correlation Analysis
Raheemaa Khan1, Mohammed Saleem Irfan Ahmed2, Ahmad Alenezi3
1Raheemaa Khan, Research Scholar, Department of Computer Science, Bharathiar University, Coimbatore (Tamil Nadu), India.
2Mohammed Saleem Irfan Ahmed, Associate Professor, Department of Computer & Information Sciences, College of Science and Arts, Al Ula Branch, Taibah University, Al Ula, Madhina.
3Ahmad Alenezi, Assistant Professor, Department of Computer & Information Sciences, College of Science and Arts, Al Ula Branch, Taibah University, Al Ula, Madhina.
Manuscript received on 05 May 2019 | Revised Manuscript received on 17 May 2019 | Manuscript Published on 23 May 2019 | PP: 550-555 | Volume-7 Issue-6S5 April 2019 | Retrieval Number: F10970476S519/2019©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Web content mining has become increasingly important as people rely on the internet for nearly all of their information. Because users download and upload documents so frequently, the volume of data on the web grows rapidly. This growth loads the servers and produces large numbers of duplicate documents, which makes retrieving the relevant documents tedious. This work proposes the Proportional Correlation Coefficient algorithm, which is applied to eliminate outliers and rank web documents. In the algorithm, the documents are preprocessed, term frequencies are calculated, correlation coefficients are computed, outliers are identified, duplicate documents are eliminated, and finally the remaining documents are ranked and returned to the user. The process is intended to enhance the effectiveness of search engines in web content mining so that the user obtains results efficiently. In the experimental analysis, which compares the proposed algorithm with existing methods, the proposed algorithm detects outliers with better accuracy.
Keywords: Web Content Mining, Correlation Coefficient, Ranking, Outliers, Correlation.
Scope of the Article: Web Mining
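
The abstract lists the steps of the pipeline (preprocessing, term frequency, correlation coefficient, outlier identification, duplicate elimination, ranking) but does not give the formula of the Proportional Correlation Coefficient. The following is therefore only a minimal sketch that substitutes the ordinary Pearson correlation coefficient for the proposed measure. The function names, the small stop-word list, and both thresholds are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is", "to", "for", "on"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(tokens, vocabulary):
    """Return the relative frequency of each vocabulary term in the tokens."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[term] / total for term in vocabulary]


def pearson(x, y):
    """Plain Pearson correlation (a stand-in for the paper's measure)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


def rank_documents(query, documents,
                   outlier_threshold=0.0, duplicate_threshold=0.95):
    """Return (ranked, outliers, duplicates) as lists of document indices.

    Both thresholds are hypothetical values chosen for this sketch.
    """
    doc_tokens = [preprocess(d) for d in documents]
    query_tokens = preprocess(query)
    vocabulary = sorted(set(query_tokens).union(*doc_tokens))
    query_vec = term_frequencies(query_tokens, vocabulary)
    doc_vecs = [term_frequencies(t, vocabulary) for t in doc_tokens]

    # Documents that do not correlate positively with the query are outliers.
    scored, outliers = [], []
    for i, vec in enumerate(doc_vecs):
        score = pearson(query_vec, vec)
        if score <= outlier_threshold:
            outliers.append(i)
        else:
            scored.append((score, i))

    # Rank by descending score; drop near-copies of documents already kept.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    ranked, duplicates = [], []
    for _, i in scored:
        if any(pearson(doc_vecs[i], doc_vecs[j]) >= duplicate_threshold
               for j in ranked):
            duplicates.append(i)
        else:
            ranked.append(i)
    return ranked, outliers, duplicates
```

In this sketch, a document counts as an outlier when its term-frequency vector does not correlate positively with the query, and as a duplicate when it correlates almost perfectly with a document that is already ranked. Both rules are simplifications; the criteria used by the proposed algorithm may differ.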