Data Mining: Random Swapping based Data Perturbation Technique for Privacy Preserving in Data Mining
Ajmeera Kiran1, D. Vasumathi2
1Ajmeera Kiran, Department of Computer Science and Engineering, JNTUH College of Engineering, Hyderabad (Telangana), India.
2Dr. D Vasumathi, Department of CSE, JNTUH College of Engineering, Hyderabad (Telangana), India.
Manuscript received on 06 June 2019 | Revised Manuscript received on 30 June 2019 | Manuscript Published on 04 July 2019 | PP: 764-777 | Volume-8 Issue-1S4 June 2019 | Retrieval Number: A11420681S419/2019©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Data mining is the process of extracting previously unknown information from different data sources, and such information is very useful for various decision-making processes. The data mining process uses sensitive information for analysis, so preserving the privacy of that sensitive data is important in every data mining application. For example, in patient health records, sensitive attributes such as PID, Age, and Disease Name should not be disclosed to a third party, because disclosure would violate the privacy of the individuals. Hence, a new model should be designed to preserve the privacy of such sensitive data before it is made publicly available. In this paper, an accurate and efficient PPDM (Privacy Preserving Data Mining) technique is implemented to preserve the private information of individuals. In the existing system, the traditional geometric data perturbation technique (Gaussian noise based) preserved individual privacy with some information loss. In this paper, an efficient and effective random swapping based data perturbation technique is proposed, which mainly focuses on preserving the sensitive attributes while attaining accurate classification results with minimum information loss. In the proposed framework, accuracy and error rates are compared using the Naïve Bayes classification algorithm and the J48 decision tree algorithm, and the results are analysed using the Weka 3.8 tool. The proposed random swapping based perturbation technique improved the accuracy and reduced the error rates with minimum information loss compared to the existing system.
Keywords: Data Mining, Privacy Preserving, Data Perturbation, Random Swapping, Naive Bayes Classification, J48 Decision Tree, Geometric Data Perturbation.
Scope of the Article: Data Mining
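Illustrative sketch: the abstract does not spell out the swapping procedure, so the following Python fragment shows only the general idea of random swapping, under the assumption that the values of each sensitive attribute are randomly permuted across records. The function name `random_swap`, the `seed` parameter, and the toy patient records are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_swap(records, sensitive_attrs, seed=None):
    """Return a perturbed copy of `records` in which the values of each
    sensitive attribute are randomly permuted across the rows.

    Assumption: this is a generic random-swapping sketch, not the exact
    procedure used by the authors.
    """
    rng = random.Random(seed)
    # Copy every row so the original data set is left untouched.
    perturbed = [dict(row) for row in records]
    for attr in sensitive_attrs:
        # Collect the column, shuffle it, and write it back row by row.
        values = [row[attr] for row in records]
        rng.shuffle(values)
        for row, value in zip(perturbed, values):
            row[attr] = value
    return perturbed


# Hypothetical toy records (made up for illustration only).
records = [
    {"PID": 101, "Age": 34, "Disease": "Flu", "Ward": "A"},
    {"PID": 102, "Age": 58, "Disease": "Diabetes", "Ward": "B"},
    {"PID": 103, "Age": 41, "Disease": "Asthma", "Ward": "A"},
    {"PID": 104, "Age": 67, "Disease": "Cancer", "Ward": "C"},
]
sensitive = ["PID", "Age", "Disease"]
perturbed = random_swap(records, sensitive, seed=7)
```

Because each sensitive column is only permuted, its value distribution is unchanged, which is why statistics computed per attribute survive the perturbation; the link between a sensitive value and a particular record is what gets disturbed. Each column is shuffled independently here, so correlations between the swapped attributes are not preserved in this sketch.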