Referential DNA Data Compression using
Hadoop Map Reduce Framework
Raju Bhukya and Sumit Deshmuk
Department of Computer Science and
Engineering, National Institute of Technology, India
Abstract: The indispensable knowledge of Deoxyribonucleic
Acid (DNA) sequences and the sharply falling cost of DNA sequencing techniques
have attracted numerous researchers to the field of genetics. These sequences
are becoming available at an exponential rate, swelling the size of
molecular biology databases and making large disk arrays and compute clusters
inevitable for analysis. In this paper, we propose
referential DNA data compression using the Hadoop MapReduce framework to process
huge amounts of genetic data in a distributed environment on high-performance
compute clusters. Our method achieves a better balance between
compression ratio and the time required for DNA data compression
than other referential DNA data compression methods.
Keywords: Compression, map reduce sequences, DNA sequences.
Received August 12, 2017; accepted April 17, 2018
https://doi.org/10.34028/iajit/17/2/8