VParC: A Compression Scheme for Numeric Data in Column-oriented Databases

Ke Yan1, Hong Zhu1 and Kevin Lü2

1School of Computer Science and Technology, Huazhong University of Science and Technology, China

2Brunel University, UK

Abstract: Compression is one of the most important techniques in data management and is commonly used to improve query efficiency in databases. However, existing compression algorithms for numeric data in column-oriented databases have several limitations. First, each algorithm suits only columns with certain data distributions, not all kinds of columns; second, a column with an irregular distribution is hard to compress; third, a column compressed with heavyweight methods cannot be operated on before decompression, which makes queries inefficient. Based on the observation that a column is more likely to exhibit sub-regularity than global regularity, we developed a compression scheme called Vertically Partitioning Compression (VParC). This method is suitable for columns with different data distributions and, in some cases, even for irregular columns. More importantly, data compressed by VParC can be operated on directly, without prior decompression. This paper presents the details of the compression and query evaluation approaches, and our experimental results demonstrate the promising features of VParC.

 

Keywords: Column-stores, data management, compression, query processing, analytical workload.

Received August 28, 2013; accepted April 21, 2014
