Self-Organizing Map vs Initial Centroid Selection Optimization to Enhance K-Means with Genetic Algorithm to Cluster Transcribed Broadcast News Documents

Ahmed Maghawry¹, Yasser Omar¹, and Amr Badr²

¹Department of Computer Science, Arab Academy for Science and Technology, Egypt

²Department of Computer Science, Cairo University, Egypt

Abstract: This research employs a combination of artificial intelligence techniques to improve the clustering of transcribed text documents obtained from audio sources. Many clustering techniques suffer from drawbacks that can drive the algorithm toward suboptimal solutions; handling these drawbacks is essential to obtaining better clustering results. The main goal of our research is to enhance automatic topic clustering of transcribed speech documents by comparing two approaches on the same data set. The first is the K-means algorithm with our Initial Centroid Selection Optimization (ICSO) [16], genetic algorithm optimization, and the Chi-square similarity measure. The second is a self-organizing map. The two techniques are compared in terms of accuracy. The evaluation showed that K-means with ICSO and the genetic algorithm achieved the highest average accuracy.

Keywords: Clustering, k-means, self-organizing maps, genetic algorithm, speech transcripts, centroid selection.

Received May 21, 2017; accepted July 10, 2018
https://doi.org/10.34028/iajit/17/3/5