Identification of Promoter Region in Genomic DNA Using Cellular Automata Based Text Clustering
Kiran Sree1 and Ramesh Babu2
1Department of Computer Science, Jawaharlal Nehru Technological University, India
2Department of Computer Science, Acharya Nagarjuna University, India
Abstract: Identifying promoter regions plays a vital role in understanding human genes. This paper presents a new cellular automata based text clustering algorithm for identifying promoter regions in genomic DNA. Experimental results confirm the applicability of the algorithm to this task. We also note a 12 percent increase in the accuracy of finding promoter regions for DNA sequences of shorter length. The algorithm was also trained to identify promoter regions in mixed and overlapping DNA sequences. However, the algorithm fails to identify promoter regions of length greater than 54. The algorithm will also be used to predict RNA structure.
Keywords: Text clustering, DNA sequence, cellular automata.
Received May 10, 2008; accepted October 1, 2008