Hybrid Modeling KMeans – Genetic Algorithms in the Health Care Data

  • Tessy Badriyah, Electronic Engineering Polytechnic Institute of Surabaya, Indonesia


K-Means is one of the most widely used clustering algorithms because of its good computational performance. However, K-Means is very sensitive to its initial points, which are selected at random, and therefore it does not always produce an optimal solution. A genetic algorithm approach can be applied to address this problem. In this research we examine the potential of a hybrid GA-KMeans approach, with a focus on health care data. We propose a new technique that combines K-Means clustering and genetic algorithms, called the “Hybrid K-Means Genetic Algorithms” (HKGA). HKGA combines the power of genetic algorithms with the efficiency of K-Means clustering. We compare our results with those of conventional algorithms and with other published research. Our results demonstrate that HKGA achieves very good results and, in some cases, is superior to other methods.
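
The general idea of pairing a genetic algorithm with K-Means can be sketched as follows. This is only a minimal illustration and not the HKGA implementation from the paper: the abstract does not specify the chromosome encoding, fitness function, or genetic operators, so everything here is an assumption. In this sketch each chromosome is a set of k centroids, fitness is the within-cluster sum of squared errors, and the population is refined with one K-Means step per generation. The function names (`sse`, `kmeans_step`, `hybrid_ga_kmeans`) and all parameter values are hypothetical.

```python
import numpy as np

def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def kmeans_step(X, centroids):
    """One Lloyd iteration: assign points, then recompute centroids."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    updated = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members) > 0:  # an empty cluster keeps its old centroid
            updated[j] = members.mean(axis=0)
    return updated

def hybrid_ga_kmeans(X, k, pop_size=20, generations=30,
                     mutation_rate=0.1, mutation_scale=0.1, seed=0):
    """Illustrative GA + K-Means hybrid (assumed design, not the paper's)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumed encoding: a chromosome is k data points used as centroids.
    pop = [X[rng.choice(n, size=k, replace=False)].astype(float)
           for _ in range(pop_size)]
    best, best_cost = None, np.inf
    for _ in range(generations):
        # Local refinement: one K-Means step per chromosome.
        pop = [kmeans_step(X, c) for c in pop]
        costs = np.array([sse(X, c) for c in pop])
        i = costs.argmin()
        if costs[i] < best_cost:
            best, best_cost = pop[i].copy(), costs[i]

        def pick():
            # Binary tournament selection: the lower-SSE chromosome wins.
            a, b = rng.integers(pop_size, size=2)
            return pop[a] if costs[a] <= costs[b] else pop[b]

        children = [best.copy()]  # elitism: carry the best solution forward
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            # Uniform crossover that swaps whole centroids between parents.
            mask = rng.random(k) < 0.5
            child = np.where(mask[:, None], p1, p2)
            # Gaussian mutation applied to a random subset of centroids.
            noise = rng.normal(0.0, mutation_scale * X.std(axis=0),
                               size=child.shape)
            child = child + noise * (rng.random((k, 1)) < mutation_rate)
            children.append(child)
        pop = children
    # Final polish: a few more K-Means steps from the best chromosome.
    for _ in range(10):
        best = kmeans_step(X, best)
    return best, sse(X, best)
```

Calling `hybrid_ga_kmeans(X, k=3)` on an `(n_samples, n_features)` NumPy array returns the best centroids found and their sum of squared errors. Because several candidate initializations compete and the best one is always carried forward, the search is less dependent on any single random starting point than plain K-Means, which is the motivation stated in the abstract.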

Keywords: Machine Learning, K-Means, Genetic Algorithms, Hybrid K-Means Genetic Algorithms (HKGA).





How to Cite
Badriyah, T. (2013). Hybrid Modeling KMeans – Genetic Algorithms in the Health Care Data. EMITTER International Journal of Engineering Technology, 2(1), 63-74. https://doi.org/10.24003/emitter.v2i1.18