TITLE:
Lossless Compression of SKA Data Sets
AUTHORS:
Karthik Rajeswaran, Simon Winberg
KEYWORDS:
Square Kilometre Array; Lossless; Compression; HDF5
JOURNAL NAME:
Communications and Network, Vol.5 No.4, November 28, 2013
ABSTRACT:
As astronomical data archives continue to grow at an enormous rate, both the
providers and the end users of astronomical data sets stand to benefit from
effective data compression techniques. This paper explores different lossless
data compression techniques and aims to find an optimal algorithm for
compressing astronomical data obtained by the Square Kilometre Array (SKA),
which are new and unique in the field of radio astronomy. The compression was
required to be lossless and to be performed while the data are being read. The
project was carried out in conjunction with the SKA South Africa office. Data
compression reduces the time taken and the bandwidth used when transferring
files, and it can also reduce the cost of data storage. The SKA uses the
Hierarchical Data Format (HDF5) to store the data collected from the radio
telescopes; the data sets used in this study range from 29 MB to 9 GB in size.
The compression techniques investigated in this study include SZIP, GZIP, the
LZF filter, LZ4 and the Fully Adaptive Prediction Error Coder (FAPEC). The
algorithms and methods used to perform the compression tests are discussed, and
the results from the three phases of testing are presented, followed by a brief
discussion of those results.
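
The two requirements stated in the abstract, that compression be lossless and that it happen while the data are being read, can be illustrated with a minimal sketch. It is not the method used in the paper. It assumes a gzip-format DEFLATE stream from Python's standard zlib module as a stand-in for the GZIP technique, and the input is a synthetic array of 32-bit integers, not SKA data. The function name compress_stream, the chunk size and the compression level are all hypothetical choices made for illustration.

```python
import io
import struct
import zlib


def compress_stream(reader, chunk_size=64 * 1024, level=6):
    """Compress data chunk by chunk as it is read; return gzip-format bytes.

    `reader` is any object with a read(n) method (a file, a socket wrapper,
    or an in-memory buffer). Name and defaults are illustrative assumptions.
    """
    # wbits=31 asks zlib to wrap the DEFLATE stream in a gzip container.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    pieces = []
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        # Each chunk is compressed as soon as it arrives.
        pieces.append(compressor.compress(chunk))
    pieces.append(compressor.flush())
    return b"".join(pieces)


# Synthetic stand-in for telescope samples (assumption: NOT real SKA data).
samples = [1000 + (i % 50) for i in range(100_000)]
raw = struct.pack(f"<{len(samples)}i", *samples)

compressed = compress_stream(io.BytesIO(raw))
restored = zlib.decompress(compressed, 31)

# Lossless means the round trip reproduces the input byte for byte.
assert restored == raw

ratio = len(raw) / len(compressed)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes, ratio: {ratio:.1f}")
```

The ratio printed here reflects only the highly regular synthetic input and says nothing about the ratios achievable on real SKA HDF5 files.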