Performance Tuning of K-Mean Clustering Algorithm a Step towards Efficient DSS

Ayman E. Khedr; Ahmed I. El Seddawy; Amira M. Idrees

Abstract

This research is the first step in building an efficient Decision Support System (DSS) that employs Data Mining (DM) techniques for prediction, classification, clustering, and association rules. This step focuses on finding groups in the dataset that differ strongly from one another while the members within each group are very similar to each other; therefore one DM task, clustering, is applied. The main objective of the proposed research is to enhance the performance of one of the most popular clustering algorithms, K-mean, so that it produces near-optimal decisions for telco churn prediction and retention problems. K-mean is chosen because of its performance in clustering massive data sets. The final clustering result of the K-mean algorithm depends greatly on the correctness of the initial centroids, which are selected randomly. This research will be followed by a series of studies targeting the main objective, an efficient DSS to be applied to customer banking data. In this research, a new method is proposed for finding better initial centroids, providing an efficient way of assigning the data points to suitable clusters with reduced time complexity. The proposed algorithm was successfully developed and applied to customer banking data, and the evaluation results are presented.
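The dependence of K-mean on its starting centroids can be illustrated with a small sketch. The abstract does not describe the proposed initialization method, so the code below substitutes a generic farthest-point heuristic purely as an assumption; the function names (`sq_dist`, `farthest_point_init`, `kmeans`) and the toy data are hypothetical and are not taken from the paper.

```python
# Illustrative only: a farthest-point seeding heuristic stands in for the
# paper's (unspecified) method of choosing better initial centroids.

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding: start from the point nearest the data mean,
    then repeatedly add the point farthest from the centroids chosen so far."""
    dim = len(points[0])
    mean = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
    centroids = [min(points, key=lambda p: sq_dist(p, mean))]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the supplied centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = []
        for c, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(c))
                ))
            else:
                updated.append(c)
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
centroids, labels = kmeans(points, farthest_point_init(points, 3))
```

Because the seeding is deterministic, repeated runs on the same data give the same clusters, whereas random seeding can converge to different partitions from run to run.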

Keywords

Data Mining, Classification, K-Mean, Business Information, Data Envelopment Analysis, Artificial Neural Network, Rough Set Theory

Cite this article as

A. E. Khedr, A. I. E. Seddawy, A. M. Idrees, "Performance Tuning of K-Mean Clustering Algorithm a Step towards Efficient DSS", International Journal of Innovative Research in Computer Science and Technology (IJIRCST), Vol-2, no.6, pp.111-118, 2014. Available from:

Corresponding Author

Ayman E. Khedr

I.S. Department, Helwan University, Egypt. E-mail: Ayman_Khedr@helwan.edu.eg
