Missing Value Aware Optimal Feature Selection Method for Efficient Big Data Mining Process
S. Meera1, B. Rosiline Jeetha2
1S. Meera, Assistant Professor PG, Department of Computer Science, PSGR Krishnammal College for Women, Coimbatore (Tamil Nadu), India.
2Dr. B. Rosiline Jeetha, HOD Professor, Department of Computer Science, Dr. N.G.P College of Arts and Science, Coimbatore (Tamil Nadu), India.
Manuscript received on 10 October 2019 | Revised Manuscript received on 19 October 2019 | Manuscript Published on 02 November 2019 | PP: 354-360 | Volume-8 Issue-2S11 September 2019 | Retrieval Number: B10550982S1119/2019©BEIESP | DOI: 10.35940/ijrte.B1055.0982S1119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Big data mining plays a critical role in real-world environments because of the large volume of data of differing variety and type. Handling these data and extracting useful knowledge from them is a difficult task that requires careful attention. Our previous research work addressed this by introducing Enhanced Particle Swarm Optimization with Genetic Algorithm – Modified Artificial Neural Network (EPSOGA-MANN), which selects the optimal features from a large volume of data. However, the performance of that method may degrade when the dataset contains missing values, and the method is complex to run because of the computational overhead of the ANN algorithm. The proposed research method resolves these issues by introducing the Missing Value concerned Optimal Feature Selection Method (MV-OFSM). In this method, an improved KNN imputation algorithm is introduced to handle the missing values. A dynamic clustering method is then used to cluster the dataset based on a closeness measure. Next, an Anarchic Society Optimization (ASO) based approach is applied to perform feature selection on the given dataset. Finally, a hybrid ANN-GA technique is applied to perform the classification. The proposed method is evaluated in the MATLAB simulation environment, and the results show that it provides better performance than the existing research technique.
Keywords: Feature Selection, Missing Value Handling, Preprocessing, Dynamic Clustering, Closeness Measure.
Scope of the Article: Big Data Networking
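Illustrative note: the abstract's first stage, KNN-based missing value imputation, can be sketched as follows. This is a minimal sketch of plain KNN imputation, not the improved variant proposed in the paper, whose details the abstract does not give. The function name knn_impute, the parameter k, the distance rescaling, and the sample data are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Plain KNN imputation (hypothetical helper, not the paper's improved
    algorithm). Distance is Euclidean over the features both rows have
    observed, rescaled by the fraction of features actually compared.
    """
    n_features = len(rows[0])

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common, cannot compare
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * n_features / len(shared))

    imputed = [list(row) for row in rows]  # leave the input untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that observed feature j.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Made-up sample data: the second row is missing its middle feature.
data = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.2],
    [0.9, 2.2, 2.9],
    [8.0, 9.0, 10.0],
]
filled = knn_impute(data, k=2)
```

In this sample the missing entry is filled from the two closest rows (the first and third), so the distant fourth row does not influence the estimate. Later stages of the pipeline (clustering, feature selection, classification) would then operate on the completed rows.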