[fitsbits] image compression

William Pence pence at tetra.gsfc.nasa.gov
Wed Jan 30 13:27:59 EST 2002


                         CFITSIO v2.401

This announcement describes the new image compression features that
are supported in the latest 2.401 version of the CFITSIO library for
reading and writing FITS format data files.

  ===============================================================

SUMMARY:

CFITSIO now transparently supports 2 types of image compression:

1)  The entire FITS file may be externally compressed with the gzip or
Unix compress algorithm, producing a *.gz or *.Z file, respectively.
When reading compressed files of this type, CFITSIO first uncompresses
the entire file into memory before performing the requested read
operations.  Output files will be written directly in the gzip
compressed format if the user-specified filename ends with `.gz'.  In
this case, CFITSIO initially writes the uncompressed file in memory and
then compresses it and writes it to disk when the FITS file is closed,
thus saving user disk space.

2) CFITSIO also supports a newer image compression format in which the
image is divided into a grid of rectangular tiles, and each tile of
pixels is individually compressed.  The compressed tiles are stored in
rows of a variable length array column in a FITS binary table, but
CFITSIO recognizes that the binary table extension contains an image
and treats it as if it were an IMAGE extension.  This tile-compressed
format is especially well suited for compressing very large images
because a) the FITS header keywords remain uncompressed for rapid read
access, and because b) it is possible to extract and uncompress
sections of the image without having to uncompress the entire image.
This format is also much more effective in compressing floating point
images (using a lossy compression algorithm) than simply compressing
the image using gzip or compress.

Most existing programs that use CFITSIO to read and write FITS images
will inherit these new image compression capabilities when compiled
and relinked with this new version of CFITSIO.

A small demonstration program called 'imcopy' is included with CFITSIO
that can be used to compress (or uncompress) any FITS image.  This
program can be used to experiment with the various compression options
on existing FITS images.

  ===============================================================

FURTHER DETAILS:

The CFITSIO library is available from:
      http://heasarc.gsfc.nasa.gov/fitsio/

The new tile-compressed image format was developed by White,
Greenfield, Pence, and Tody and was presented at the October 1999 ADASS
meeting.  A detailed description of this data format is available at:

    http://heasarc.gsfc.nasa.gov/docs/software/fitsio/
           compression/compress_image.html

The N-dimensional FITS image can be divided into any desired
rectangular grid of compression tiles.  By default the tiles are chosen
to correspond to the rows of the image, each containing NAXIS1 pixels.
For example, an 800 x 800 x 4 pixel data cube would be divided into
3200 tiles containing 800 pixels each by default.  Alternatively, this
data cube could be divided into 256 tiles that are each 100 x 100 x 1
pixels in size, or 4 tiles containing 800 x 800 x 1 pixels, or a single
tile containing the entire data cube.
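
The tile counts quoted above can be checked with a few lines of C.  This
is only an illustrative sketch: the function name count_tiles is
hypothetical and is not part of CFITSIO, and the treatment of partial
tiles at the image edge (counted as whole tiles) is an assumption of the
sketch, not something stated in this announcement.

```c
#include <assert.h>

/* Count the tiles needed to cover an N-dimensional image.
 * naxes[i] is the image length along axis i; tile[i] is the tile
 * length along the same axis.  Partial tiles at the image edge are
 * counted as whole tiles (ceiling division).  That edge handling is
 * an assumption of this sketch. */
long count_tiles(int naxis, const long *naxes, const long *tile)
{
    long ntiles = 1;

    assert(naxis > 0);
    for (int i = 0; i < naxis; i++) {
        assert(naxes[i] > 0 && tile[i] > 0);
        ntiles *= (naxes[i] + tile[i] - 1) / tile[i];
    }
    return ntiles;
}
```

Called with naxes = {800, 800, 4}, the row-by-row tiling {800, 1, 1}
gives 3200 tiles and the {100, 100, 1} tiling gives 256, matching the
figures in the paragraph above.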

Currently, 3 image compression algorithms are supported:  Rice, GZIP,
and PLIO.  Rice and GZIP are general purpose algorithms that can be
used to compress almost any image.  The PLIO algorithm is more
specialized and was developed for use within IRAF to store pixel data
quality masks. Support for other image compression algorithms may be
added in the future.

The 3 supported image compression algorithms are all 'loss-less' when
applied to integer FITS images;  the pixel values are preserved exactly
with no loss of information during the compression and uncompression
process. Floating point FITS images (which have BITPIX = -32 or -64)
are first quantized into scaled integer pixel values before being
compressed.  This technique produces much higher compression factors
than simply using GZIP to compress the image, but it also means that
the original floating point pixel values may not be precisely returned
when the image is uncompressed.  When done properly, this only discards
the 'noise' from the floating point values without losing any
significant information.  The amount of noise that is discarded can be
controlled by the 'noise_bits' compression parameter.
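
The general idea of quantizing floating point pixels into scaled
integers can be sketched as follows.  This is not the CFITSIO
implementation: the function names are hypothetical, and the rule used
here for the quantization step (a noise estimate divided by
2^noise_bits) is an assumption chosen only so that a smaller noise_bits
value discards more information.  The exact scaling CFITSIO applies is
not given in this announcement.

```c
#include <assert.h>

/* Hypothetical rule for this sketch only:
 * step = noise_sigma / 2^noise_bits.
 * A smaller noise_bits gives a coarser step. */
double quant_step(double noise_sigma, int noise_bits)
{
    double step = noise_sigma;

    assert(noise_sigma > 0.0 && noise_bits >= 0);
    for (int i = 0; i < noise_bits; i++)
        step /= 2.0;
    return step;
}

/* Map a floating point pixel to a scaled integer, rounding to the
 * nearest integer multiple of step above the zero point. */
long quantize(double value, double zero, double step)
{
    double scaled = (value - zero) / step;

    return (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Recover an approximate floating point value from the integer. */
double restore(long q, double zero, double step)
{
    return zero + (double)q * step;
}
```

With this scheme the restored value differs from the original by at
most half a step, which is why the original pixel values are not
returned exactly.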

No special action is required to read tile-compressed FITS images
because all the CFITSIO routines that read normal uncompressed FITS
images can also read images in the tile-compressed format;  CFITSIO
essentially treats the binary table that contains the compressed tiles
as if it were an IMAGE extension.

When creating (writing) a new image with CFITSIO, a normal uncompressed
FITS primary array or IMAGE extension will be written unless the
tile-compressed format has been specified in 1 of 2 possible ways:

1)  At run time, when specifying the name of the output FITS file to be
created, the user can indicate that images should be written in
tile-compressed format by enclosing the compression parameters in
square brackets following the root disk file name.  Here are a couple
of examples of the syntax for specifying tile-compressed output images:

    myfile.fit[compress]    - use the default compression algorithm (Rice)
                              and the default tile size (row by row)

    myfile.fit[compress GZIP 100,100]   - use GZIP compression and 
                                          100 x 100 pixel tile size

2)  Before calling the CFITSIO routine to write the image header
keywords (e.g., fits_create_image) the programmer can call a CFITSIO
routine to specify the compression algorithm and the tiling pattern
that is to be used when writing the image.
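
For the run-time method (option 1), a program may want to assemble the
bracketed output name itself.  The sketch below shows one way to do
that with snprintf.  The helper build_compressed_name is hypothetical
and not part of CFITSIO; it only formats a string in the syntax shown
in this announcement (with an optional "; N" noise_bits suffix), which
the caller would then supply as the output file name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "root[compress ALG X,Y]" into out.  If noise_bits is
 * non-negative, "; noise_bits" is appended inside the brackets.
 * Returns 0 on success, or -1 if the result does not fit in out.
 * Hypothetical helper; it performs no CFITSIO calls. */
int build_compressed_name(char *out, size_t outlen, const char *root,
                          const char *alg, long tile_x, long tile_y,
                          int noise_bits)
{
    int n;

    assert(out != NULL && root != NULL && alg != NULL);
    assert(strlen(alg) > 0 && tile_x > 0 && tile_y > 0);

    if (noise_bits >= 0)
        n = snprintf(out, outlen, "%s[compress %s %ld,%ld; %d]",
                     root, alg, tile_x, tile_y, noise_bits);
    else
        n = snprintf(out, outlen, "%s[compress %s %ld,%ld]",
                     root, alg, tile_x, tile_y);

    /* snprintf reports the length it wanted; reject truncation. */
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Passing "myfile.fit", "GZIP", 100, 100 and a negative noise_bits yields
the second example string shown under option 1.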

  ===============================================================

How to use the imcopy demonstration program (on Unix):

Unpack the CFITSIO distribution .tar file, then build the CFITSIO
library and the imcopy program with:

>  ./configure
>  make
>  make imcopy

The imcopy program can be used to compress any existing FITS image as
shown in these examples:

1)  imcopy infile.fit 'outfile.fit[compress]' 

       This will use the default compression algorithm (Rice) and the
       default tile size (row by row).

2)  imcopy infile.fit 'outfile.fit[compress GZIP]' 

       This will use the GZIP compression algorithm and the default
       tile size (row by row).  The allowed compression algorithms are
       Rice, GZIP, and PLIO.  Only the first letter of the algorithm
       name needs to be specified.

3)  imcopy infile.fit 'outfile.fit[compress G 100,100]' 

       This will use the GZIP compression algorithm and 100 X 100 pixel
       tiles.

4)  imcopy infile.fit 'outfile.fit[compress R 100,100; 4]' 

       This will use the Rice compression algorithm, 100 X 100 pixel
       tiles, and noise_bits = 4 (assuming the input image has a
       floating point data type).  Decreasing the value of noise_bits
       will improve the overall compression efficiency at the expense
       of losing more information.

5)  imcopy infile.fit outfile.fit

       If the input file is in tile-compressed format, then it will be
       uncompressed to the output file.  Otherwise, it simply copies
       the input image to the output image.

6)  imcopy 'infile.fit[1001:1500,2001:2500]'  outfile.fit

       This extracts a 500 X 500 pixel section out of the much larger
       input image (which may be in tile-compressed format).  The
       output is a normal uncompressed FITS image.

7)  imcopy 'infile.fit[1001:1500,2001:2500]'  outfile.fit.gz

       Same as above, except the output file is externally compressed
       using the gzip algorithm.
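
The size of the extracted section in examples 6 and 7 follows from the
pixel ranges being inclusive.  The sketch below works this out for the
simple two-dimensional form used in those examples.  The function
section_size is hypothetical and not part of CFITSIO, and it handles
only the "[x1:x2,y1:y2]" form with increasing ranges.

```c
#include <assert.h>
#include <stdio.h>

/* Parse a 2-D section specifier such as "[1001:1500,2001:2500]" and
 * return the section dimensions in *nx and *ny.  Ranges are treated
 * as inclusive, so 1001:1500 spans 500 pixels.
 * Returns 0 on success, -1 if the text does not match the simple
 * form or a range is decreasing.  Hypothetical helper. */
int section_size(const char *spec, long *nx, long *ny)
{
    long x1, x2, y1, y2;

    assert(spec != NULL && nx != NULL && ny != NULL);

    if (sscanf(spec, "[%ld:%ld,%ld:%ld]", &x1, &x2, &y1, &y2) != 4)
        return -1;
    if (x2 < x1 || y2 < y1)
        return -1;

    *nx = x2 - x1 + 1;
    *ny = y2 - y1 + 1;
    return 0;
}
```

For "[1001:1500,2001:2500]" this gives nx = 500 and ny = 500, the
500 X 500 pixel section described in example 6.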

-- 
____________________________________________________________________
Dr. William Pence                          pence at tetra.gsfc.nasa.gov
NASA/GSFC Code 662         HEASARC         +1-301-286-4599 (voice)     
Greenbelt MD 20771                         +1-301-286-1684 (fax)


