Effective Search Engine Spam Classification
D. Saraswathi1, A. Krishna Kumar2
1Dr. D. Saraswathi, Department of Computer Science, PSG College of Arts & Science, Coimbatore (Tamil Nadu), India.
2Dr. A. Krishna Kumar, Department of Computer Science, PSG College of Arts & Science, Coimbatore (Tamil Nadu), India.
Manuscript received on 22 August 2019 | Revised Manuscript received on 11 September 2019 | Manuscript Published on 17 September 2019 | PP: 1541-1545 | Volume-8 Issue-2S8 August 2019 | Retrieval Number: B11000882S819/2019©BEIESP | DOI: 10.35940/ijrte.B1100.0882S819
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Search engine spam is created by spammers for commercial gain. Spammers apply various strategies to web pages so that the pages appear on the first page of web search results. These strategies can keep good-quality web pages out of the top of the search engine results page. Numerous algorithms have been devised to identify search engine spam, yet search engines are still affected by it. The search engine industry therefore needs to filter search engine spam as effectively as possible. The proposed study identifies spam in web search engines. Spammers try to use the most popular search keywords, popular links, and advertising keywords in web pages. This strategy helps raise a page's ranking so that it appears at the top of the search results. The proposed method uses important features to detect spam pages, which are classified using the C4.5 decision tree classifier. The method performs better than existing classification methods.
Keywords: Search Engine Spam, Classification, Spamdexing, Decision Tree, Popular Search Keywords, Popular Links, Advertising Words.
Scope of the Article: Classification
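Illustrative sketch: The abstract names three feature families (popular search keywords, popular links, advertising words) and a C4.5 decision tree, but gives no implementation details. The Python sketch below shows only the gain-ratio criterion that C4.5 uses to choose a split attribute. The feature names, the "high"/"low" discretization, and the six toy pages are hypothetical and are not taken from the paper. A full C4.5 implementation would also handle continuous attributes, missing values, and pruning.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, labels, feature):
    """C4.5 gain ratio obtained by splitting the data on one discrete feature."""
    total = len(rows)
    # Group the class labels by the value the feature takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: each page is described by three discretized features.
rows = [
    {"popular_keywords": "high", "popular_links": "high", "ad_words": "high"},
    {"popular_keywords": "high", "popular_links": "low", "ad_words": "high"},
    {"popular_keywords": "high", "popular_links": "high", "ad_words": "low"},
    {"popular_keywords": "low", "popular_links": "low", "ad_words": "high"},
    {"popular_keywords": "low", "popular_links": "high", "ad_words": "low"},
    {"popular_keywords": "low", "popular_links": "low", "ad_words": "low"},
]
labels = ["spam", "spam", "spam", "non-spam", "non-spam", "non-spam"]

# Score every feature and pick the one C4.5 would place at the root.
ratios = {feature: gain_ratio(rows, labels, feature) for feature in rows[0]}
best_feature = max(ratios, key=ratios.get)
print(best_feature, ratios)
```

On this toy data the keyword feature separates the two classes perfectly, so it would be chosen as the root split. With real pages, the ranking of features depends entirely on how they are extracted and discretized.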