[fitsbits] Associating ancillary data with primary HDU

Rob Seaman seaman at noao.edu
Fri May 9 18:08:05 EDT 2014


Hi Bill,

> I've been asked if there are any specific conventions for associating ancillary 
> data with primary data arrays.  The specific application is one where the 
> exposure time differs from pixel to pixel (something that can be done with 
> Active Pixel Sensors), but which could easily apply to other parameters which 
> vary between pixels.
> 
> The simplest and most obvious approach would be to store the actual data in the 
> primary HDU, and then store the exposure times in an extension with the same 
> dimensionality.

As others have suggested, you might consider a dataless primary HDU, though the original Image extension proposal had this whimsical language:

>> {\tt NAXIS = 0} is not recommended since it would not make sense to extend a non-existing image with another image.

At NOAO and elsewhere the opposite choice has been made since day one, as it provides a quite natural hierarchical structure for an MEF.

(I’d be interested in R. Thompson’s comment on this ;-)
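For instance, a dataless primary HDU followed by matched image extensions might be laid out like this (array dimensions and EXTNAME values are illustrative, not from the message above):

```
SIMPLE  =                    T / conforms to FITS standard
BITPIX  =                    8
NAXIS   =                    0 / dataless primary HDU
EXTEND  =                    T
END

XTENSION= 'IMAGE   '           / science pixels
BITPIX  =                   16 / short integers
NAXIS   =                    2
NAXIS1  =                 2048
NAXIS2  =                 2048
EXTNAME = 'SCI     '
END

XTENSION= 'IMAGE   '           / per-pixel exposure times
BITPIX  =                  -32 / single-precision floating point
NAXIS   =                    2
NAXIS1  =                 2048
NAXIS2  =                 2048
EXTNAME = 'EXPTIME '
END
```

Any further per-pixel arrays (masks, variance) would simply be additional IMAGE extensions of the same dimensionality.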

> Are there any additional conventions that are appropriate to this situation?

This file structure maps well to tile compression (though binary tables could be used as well with the current FPACK).  You have short integers for the pixels and single-precision floating point for the exposure times, so two-thirds of each uncompressed file will be timekeeping metadata (or at least the independent variable, if not deemed meta).  I see no reason the default lossy quantization setting of q = 4 wouldn’t be appropriate for the exposure times.  This corresponds to a compression ratio R ~= 6, which for a typical pixel noise model would drop the metadata overhead to about 40% of the compressed file size. **  Or, if the requirement on the precision of the exposure times is less stringent, the extension could be compressed much more aggressively than that.
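The back-of-the-envelope arithmetic can be checked as follows; note that the Rice ratio of roughly 2:1 on the 16-bit pixels is an assumed typical value for a realistic noise model, not a figure from the message above:

```python
# Per-pixel storage budget: a short-int science array plus a
# float32 exposure-time array (sizes in bytes per pixel).
pixel_bytes = 2.0      # BITPIX = 16 science pixels
exptime_bytes = 4.0    # BITPIX = -32 exposure times

# Uncompressed: exposure times are two-thirds of the file.
uncompressed_fraction = exptime_bytes / (pixel_bytes + exptime_bytes)
print(f"uncompressed exposure-time fraction: {uncompressed_fraction:.2f}")

# Compressed: assume Rice gives roughly 2:1 on the integer pixels
# (an assumption for a typical noise model), while quantizing the
# floats at q = 4 gives roughly 6:1 (R ~= 6).
rice_ratio = 2.0       # assumed typical Rice ratio for 16-bit pixels
quantized_ratio = 6.0  # R ~= 6 at the default q = 4

compressed_pixel = pixel_bytes / rice_ratio
compressed_exptime = exptime_bytes / quantized_ratio
compressed_fraction = compressed_exptime / (compressed_pixel + compressed_exptime)
print(f"compressed exposure-time fraction: {compressed_fraction:.2f}")
```

With those assumptions the exposure-time overhead falls from two-thirds of the uncompressed file to 0.67/1.67 = 40% of the compressed file, matching the estimate above.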

As you suggest, similar reasoning applies to per-pixel masks or variance arrays, etc.

Rob

** On the other hand, if the project insists on lossless compression for the floating-point extension, the disparity becomes worse for the compressed data, since the integer pixels will benefit from Rice compression while the floating-point exposure times will default to tile-compressed gzip, which is less efficient in both time and space.



