[fitsbits] Potential new compression method for FITS tables

Fri Oct 29 19:46:51 EDT 2010

On Oct 28, 2010, at 1:42 PM, William Pence wrote:

> For the past few months, several of us (Rob Seaman, Rick White, and myself) have been experimenting with a new compression method for FITS binary tables that appears to be significantly more effective than the usual method of simply compressing the whole FITS file with gzip.  We have produced a document, available at http://fits.gsfc.nasa.gov/tiletable.pdf that describes this proposed convention in more detail;

"Tables" are a broad class of data structure.  So far we have focused on general tabular structures such as catalogs that are suitable fodder for lossless compression algorithms.

Another class of table is a spectrum, which might, for instance, associate a flux vector with vectors for wavelength and variance.  As with imaging data, this may well be appropriate for compression via a lossy technique.  The initial draft of the convention doesn't focus on such cases, but it's worth gauging community interest.

On the other hand, the GZIP_2 algorithm that shuffles the input data vector may well be useful for the case of lossless compression of floating-point images in the prior tiled-image convention.  Reduced data products such as calibrated spectra or image pipeline output can benefit from the (technically) lossy compression ("efficient representation") of floating-point pixels.  However, some cameras, especially in the infrared, produce *raw* floating-point images as the result of internal coadditions or sampling techniques via non-destructive reads.  Raw data of any type is typically required to be preserved losslessly.
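
To make the shuffling idea concrete, here is a minimal Python sketch of a byte shuffle followed by DEFLATE compression, applied to big-endian 32-bit floats.  It is an illustration only, not the convention's reference implementation: the function names and the sample pixel values are hypothetical, and it uses the zlib container rather than a gzip stream (both use DEFLATE underneath).  The point is simply that the round trip is bit-for-bit lossless, which is what raw data requires.

```python
import struct
import zlib


def shuffle_bytes(raw: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element first, then byte 1, and so on."""
    if len(raw) % itemsize:
        raise ValueError("buffer length must be a multiple of itemsize")
    return b"".join(raw[i::itemsize] for i in range(itemsize))


def unshuffle_bytes(shuffled: bytes, itemsize: int) -> bytes:
    """Invert shuffle_bytes, restoring the original element order."""
    if len(shuffled) % itemsize:
        raise ValueError("buffer length must be a multiple of itemsize")
    n = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for i in range(itemsize):
        # Plane i holds the i-th byte of each of the n elements.
        out[i::itemsize] = shuffled[i * n:(i + 1) * n]
    return bytes(out)


def compress_shuffled(raw: bytes, itemsize: int) -> bytes:
    """Shuffle, then DEFLATE (hypothetical helper, zlib container)."""
    return zlib.compress(shuffle_bytes(raw, itemsize))


def decompress_shuffled(blob: bytes, itemsize: int) -> bytes:
    """Inflate, then undo the shuffle."""
    return unshuffle_bytes(zlib.decompress(blob), itemsize)


if __name__ == "__main__":
    # Made-up "raw" float pixels: similar magnitudes, varying low-order bits.
    pixels = [1000.0 + 0.25 * (i % 16) for i in range(4096)]
    raw = struct.pack(">4096f", *pixels)  # big-endian, as in FITS

    blob = compress_shuffled(raw, 4)
    restored = decompress_shuffled(blob, 4)

    print("raw bytes:          ", len(raw))
    print("deflate only:       ", len(zlib.compress(raw)))
    print("shuffle + deflate:  ", len(blob))
    print("lossless round trip:", restored == raw)
```

Whether shuffling actually helps depends on the data; the sizes printed above are for one artificial case and say nothing about real detector output.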

One size does not fit all.

Rob Seaman
NOAO