Parallel Computation Performing Kernel-Based Clustering Algorithm Using Particle Swarm Optimization for the Big Data Analytics
E. Laxmi Lydia1, B. PRASAD2, Gogineni HimaBindu3, K. Shankar4, K. Vijaya Kumar5
1E. Laxmi Lydia, Associate Professor, Department of Computer Science Engineering, Vignan’s Institute of Information Technology, Andhra Pradesh, India.
2Dr. B. PRASAD, Professor, Department of Computer Science Engineering, Vignan’s Institute of Information Technology, Andhra Pradesh, India.
3Gogineni HimaBindu, Asst. Prof., MCA, Vignan’s Institute of Information Technology (Autonomous), Visakhapatnam, India.
4K. Shankar, Assistant Professor, School of Computing, Kalasalingam University, Krishnankoil – 626126 (Tamil Nadu), India.
5K. Vijaya Kumar, Associate Professor, Department of Computer Science Engineering, Vignan’s Institute of Information Technology for Women, Andhra Pradesh, India.
Manuscript received on 03 March 2019 | Revised Manuscript received on 08 March 2019 | Manuscript published on 30 July 2019 | PP: 4753-4756 | Volume-8 Issue-2, July 2019 | Retrieval Number: B1740078219/19©BEIESP | DOI: 10.35940/ijrte.B1740.078219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Digital data is growing day by day, in volume and in number of dimensions. Analyzing such an immense quantity of data, popularly termed big data, requires highly scalable data analysis techniques. Clustering is an appropriate data analysis tool for uncovering hidden groups of similar points within the data. Clustering distinct datasets involves both linearly separable and non-linearly separable clustering algorithms, which define and measure inter-point similarities, including non-linear similarity measures. Problem Statement: Although there are many efficient algorithms for linear clustering, they do not maintain cluster quality. Kernel-based algorithms use non-linear similarity measures to define similarity when forming clusters, specifically clusters with arbitrary shapes and frequencies.
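To illustrate what a non-linear similarity measure looks like in practice, the following is a minimal sketch using the Gaussian (RBF) kernel. The choice of kernel, the `gamma` value, and the function names are assumptions made for illustration; the abstract does not state which kernel the authors use.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(points, gamma=1.0):
    """Pairwise similarities. The result is n x n, so memory grows quadratically with n."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

def kernel_distance_sq(K, i, j):
    """Squared distance between points i and j in the kernel-induced feature space."""
    return K[i][i] - 2.0 * K[i][j] + K[j][j]

# Hypothetical toy data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (3.0, 4.0)]
K = kernel_matrix(points, gamma=0.5)
```

The quadratic size of the kernel matrix is one concrete reason the time and memory costs of kernel-based methods rise with dataset size.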
Existing System: Current kernel-based clustering algorithms have a few limitations concerning complexity, memory, and performance. Time and memory requirements both increase as the size of the dataset increases. It is also challenging to select a kernel similarity function for different datasets. Classical linear clustering algorithms based on random sampling and low-rank matrix approximation offer high cluster quality with low memory requirements. Proposed Work: In our research, we introduce a parallel kernel-based clustering algorithm that uses the Particle Swarm Optimization approach. This methodology can accurately cluster large datasets with very high dimensionality and overcomes the issues of high-dimensional datasets.
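The general idea of combining Particle Swarm Optimization with a kernel-based clustering objective can be sketched as follows. This is not the authors' algorithm, which the abstract does not specify. It is a simplified illustration under stated assumptions: an RBF kernel, cluster centers kept in the input space (a simplification of kernel k-means), standard PSO velocity updates, and a thread pool standing in for the parallel fitness evaluation. All names and parameter values are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def rbf(x, c, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, c)))

def fitness(position, data, k, gamma):
    """Sum over points of the kernel-induced squared distance to the nearest center."""
    d = len(data[0])
    centers = [position[i * d:(i + 1) * d] for i in range(k)]
    # For an RBF kernel, ||phi(x) - phi(c)||^2 = 2 - 2 * K(x, c).
    return sum(min(2.0 - 2.0 * rbf(x, c, gamma) for c in centers) for x in data)

def pso_kernel_clustering(data, k, gamma=0.5, n_particles=20, n_iter=60,
                          w=0.7, c1=1.5, c2=1.5, seed=0, workers=4):
    rng = random.Random(seed)
    # Each particle is a flat list of k candidate centers, initialised from data points.
    swarm = [[v for p in rng.sample(data, k) for v in p] for _ in range(n_particles)]
    dim = len(swarm[0])
    velocity = [[0.0] * dim for _ in range(n_particles)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        def evaluate(positions):
            # Particles are independent, so their fitness can be computed concurrently.
            # Threads only show the structure; CPython would need processes for real speedup.
            return list(pool.map(lambda pos: fitness(pos, data, k, gamma), positions))

        pbest = [p[:] for p in swarm]
        pbest_fit = evaluate(swarm)
        g = min(range(n_particles), key=pbest_fit.__getitem__)
        gbest, gbest_fit = pbest[g][:], pbest_fit[g]
        for _ in range(n_iter):
            for i in range(n_particles):
                for j in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    velocity[i][j] = (w * velocity[i][j]
                                      + c1 * r1 * (pbest[i][j] - swarm[i][j])
                                      + c2 * r2 * (gbest[j] - swarm[i][j]))
                    swarm[i][j] += velocity[i][j]
            for i, f in enumerate(evaluate(swarm)):
                if f < pbest_fit[i]:
                    pbest[i], pbest_fit[i] = swarm[i][:], f
                    if f < gbest_fit:
                        gbest, gbest_fit = swarm[i][:], f
    d = len(data[0])
    return [gbest[i * d:(i + 1) * d] for i in range(k)], gbest_fit

def assign(data, centers, gamma=0.5):
    """Label each point with the index of its most similar center."""
    return [max(range(len(centers)), key=lambda c: rbf(x, centers[c], gamma)) for x in data]

# Hypothetical toy data: two well-separated Gaussian blobs in 2-D.
rng = random.Random(1)
data = ([(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(30)]
        + [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(30)])
centers, score = pso_kernel_clustering(data, k=2)
labels = assign(data, centers)
```

Because each particle's fitness depends only on its own position and the shared data, the evaluation step is the natural place to parallelise. At big-data scale the thread pool would likely be replaced by a process pool or a distributed framework.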
Keywords: Big Data Analytics, Separable Clustering Algorithms
Scope of the Article: Big Data Analytics