A Method for Finding the Appropriate Number of Clusters

Huan Doan and Dinh Nguyen

Department of Information System, University of Information Technology, Vietnam

Abstract: A drawback of almost all partition-based clustering algorithms is that the number of clusters must be specified in advance. Identifying the true number of clusters beforehand is a difficult problem. Several studies have addressed this issue, but no method is perfect in every case. This paper proposes a method that finds the appropriate number of clusters during the clustering process by constructing an index that indicates the appropriate number of clusters. The index is built from an intra-cluster coefficient and an inter-cluster coefficient. The intra-cluster coefficient reflects the distortion within a cluster, and the inter-cluster coefficient reflects the distance among clusters. Both coefficients are computed only from the extremely marginal objects of the clusters. The search for the extremely marginal objects and the construction of the index are integrated into a weighted Fuzzy C-Means (FCM) algorithm, so the index is computed while the weighted FCM runs. The extended weighted FCM algorithm that incorporates this index is called Fuzzy C-Means-Extended (FCM-E). FCM-E not only seeks the clusters but also finds the appropriate number of clusters. The authors evaluate FCM-E on several data sets from the University of California, Irvine (UCI): Iris, Wine, Breast Cancer Wisconsin, and Glass, and compare its results with those of other methods. The results obtained by the proposed method are encouraging.

Keywords: Method for finding the number of clusters, appropriate number of clusters, fuzzy c-means, clustering algorithm.

Received December 18, 2014; accepted March 3, 2016
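
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal sketch of the standard, unweighted fuzzy c-means algorithm only. It is not the weighted FCM or the FCM-E described above, and it does not compute the proposed index, whose definition is not given in the abstract. The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for illustration. Note that the number of clusters `c` has to be supplied up front, which is the limitation the paper addresses.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard (unweighted) fuzzy c-means; returns (centers, memberships).

    Illustrative baseline only, not the paper's FCM-E.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership matrix whose rows sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Each center is the membership-weighted mean of the points.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center, shape (n, c).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy data (an assumption for illustration): two well-separated 2-D blobs.
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([
    demo_rng.normal(0.0, 0.1, (20, 2)),
    demo_rng.normal(5.0, 0.1, (20, 2)),
])
centers_demo, U_demo = fuzzy_c_means(X_demo, c=2)
labels_demo = U_demo.argmax(axis=1)  # hard labels from fuzzy memberships
```

Running the function for several candidate values of `c` and comparing a validity index across them is the usual workaround. According to the abstract, FCM-E instead computes its index while the clustering itself is running.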
