Identifying the Underlying Relationship Between Water Quality Parameters of the Groundwater Samples using Association and Clustering Algorithms in Coimbatore District
J. Jansi1, P. Jegathambal2, S. Devaraj Arumainayagam3 

1Mrs J. Jansi, Scholar, Department of Mathematics, Kottayam Institute of Technology and Science, Coimbatore (Tamil Nadu), India.
2Dr. P. Jegathambal, Professor, Karunya University.
3Dr. S. Devaraj Arumainayagam, Associate Professor, Department of Statistics, Government Arts College, Coimbatore.

Manuscript received on 06 March 2019 | Revised Manuscript received on 13 March 2019 | Manuscript published on 30 July 2019 | PP: 177-185 | Volume-8 Issue-2, July 2019 | Retrieval Number: A1377058119/19©BEIESP | DOI: 10.35940/ijrte.A1377.078219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Water is a highly complex environmental system, and its protection cannot be achieved by traditional methods. As part of this process, groundwater parameters must be evaluated so that suitable treatment can be pursued. Data mining algorithms have recently been developed to handle various data-rich environmental problems. Several data mining techniques, such as complex non-linear science, soft computing, clustering and association, have been applied to groundwater quality assessment and evaluation in and around Coimbatore District. In this work, statistical cluster analysis methods and association rule mining techniques were used to identify the spatial distribution of clusters of wells with similar characteristics and to determine the relationships between water quality variables. The water quality assessment in Coimbatore was carried out using 13 parameters, namely NO₃⁻, TDS, Mg²⁺, Ca²⁺, Na⁺, Cl⁻, F⁻, SO₄²⁻, EC, pH and hardness, including location, at different sites. The main objective of the present study is to assess the performance of the various clustering algorithms in WEKA and to identify the most suitable algorithm for clustering water quality samples. The K-means algorithm and the centroid method of hierarchical clustering produced similar clusterings. In addition, the study used association algorithms to identify the water quality parameters that exceed permissible limits together in the given samples (TDS, Mg²⁺, SO₄²⁻, EC, hardness). The performance and efficiency of association algorithms such as Apriori and Frequent Pattern Growth (FP-Growth) were evaluated using support, confidence, lift, leverage and conviction values.
Index Terms: Apriori Algorithm, Association Rule Mining, Data Mining Algorithms, Density Based Clustering Algorithm, Hierarchical Clustering, K-means Algorithm, WEKA Tools.
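
The rule-quality measures named in the abstract (support, confidence, lift, leverage and conviction) can be illustrated with a minimal Python sketch. It uses the common textbook definitions for a rule A -> C; the WEKA implementation used in the study may differ in detail. The function name rule_metrics, the parameter labels and the five sample records are hypothetical and are not taken from the study's data. Each record is assumed to be the set of parameters that exceeded their permissible limits in one water sample.

```python
from math import inf


def rule_metrics(transactions, antecedent, consequent):
    """Return support, confidence, lift, leverage and conviction for A -> C.

    `transactions` is a list of sets; each set holds the parameters that
    exceeded their permissible limit in one sample (an assumed encoding).
    """
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")

    # Count the samples that contain A, C, and both A and C.
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_ac = sum(1 for t in transactions if (a | c) <= t)
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    supp_a, supp_c, supp_ac = n_a / n, n_c / n, n_ac / n
    confidence = n_ac / n_a                    # P(C | A)
    lift = confidence / supp_c                 # above 1: positive association
    leverage = supp_ac - supp_a * supp_c       # 0 under independence
    # Conviction is unbounded when the rule is never violated.
    conviction = inf if n_ac == n_a else (1 - supp_c) / (1 - confidence)

    return {
        "support": supp_ac,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical exceedance records, for illustration only.
samples = [
    {"TDS", "EC", "Hardness"},
    {"TDS", "EC"},
    {"TDS", "EC", "Hardness", "Mg"},
    {"F"},
    {"EC"},
]

if __name__ == "__main__":
    for name, value in rule_metrics(samples, {"EC"}, {"TDS"}).items():
        print(f"{name}: {value:.3f}")
```

Confidence is computed from raw counts rather than from the support ratios, which avoids a small floating-point error in the division. A rule that is never violated in the data is reported with infinite conviction instead of raising a division-by-zero error.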

Scope of the Article:
Data Mining