[fitsbits] 64-bit integers
William Pence
William.D.Pence at nasa.gov
Wed May 11 17:16:30 EDT 2005
Thierry Forveille wrote:
> LC's No-Spam Newsreading account writes:
> > ----------------------------------------------------------------------
> > 4) do we need 64-bit array descriptor pointers 1Qt
> > ----------------------------------------------------------------------
> >
> > Here again I'm quite perplexed. A 64-bit pointer seems something
> > designed to address positions in large datasets (possibly too large to
> > be practically "transported" using FITS files ... after all T is for
> > transport).
> >
> Well, 32 bits only gets you to 2 Gb (at most 4 Gb if using tricks to
> get back the sign bit), which nowadays is not a spectacularly large
> dataset. We regularly transfer files (mostly images) that are about
> that big over the network. For now we try to stay under 2 Gb because
> a few of our customers still have filesystems that don't deal with
> bigger files, but that's definitely a restriction which we'd like
> to lift fairly soon.

Currently, the FITS Standard places almost no limit on the allowed size of a
FITS file; the main reason some groups try to limit file sizes to less
than 2.1 GB (2**31 bytes) is that some operating systems do not support
larger files very well (which is still true to some extent). It is perfectly
legal now to create FITS files with integer keywords larger than the range of
a 32-bit integer. In particular, there is no limit on the size of the NAXISn
keywords, so images with more than 2**32 pixels in any dimension, or tables
with more than this many rows, are allowed. The only place where the current
FITS standard restricts the size of a FITS file is in the size of the binary
table heap, because the 'P' variable length array descriptors only contain
32-bit addresses. The proposed 'Q' descriptor, with 64-bit addressing,
would allow a much larger heap to be used. Otherwise, the proposal to add
support for the 64-bit integer data type in FITS images and tables has
little to do with the size of the FITS file itself.

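To make the heap limits concrete, here is a rough sketch of the arithmetic
in Python. The function and constant names are made up for illustration and
are not taken from CFITSIO or the proposal; the only assumption is that a
descriptor's heap offset is a plain 32-bit or 64-bit integer.

```python
# Illustrative only: largest heap byte offset that a descriptor of a
# given width can hold. Names here are hypothetical, not from any library.
def max_heap_offset(bits: int, signed: bool) -> int:
    """Return the largest value representable in an integer of `bits` bits."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


P_SIGNED_MAX = max_heap_offset(32, signed=True)     # about 2.1 GB
P_UNSIGNED_MAX = max_heap_offset(32, signed=False)  # about 4.2 GB
Q_SIGNED_MAX = max_heap_offset(64, signed=True)     # far beyond current files

for label, limit in [
    ("P, signed 32-bit", P_SIGNED_MAX),
    ("P, unsigned 32-bit", P_UNSIGNED_MAX),
    ("Q, signed 64-bit", Q_SIGNED_MAX),
]:
    print(f"{label:20s} max offset = {limit} bytes")
```

Even if the 64-bit value is read as signed, the 'Q' limit is many orders of
magnitude above either 32-bit case.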
On a related topic, Lucio Chiappetti observed that the proposal (see
http://fits.gsfc.nasa.gov/fits_64bit.html) changes the definition of the 'P'
and 'Q' descriptor values from 'signed integers' to 'unsigned integers'.
Here is the history behind this change:

1. In the original binary table definition paper (and in Appendix B of
the FITS Standard), the descriptors were only defined to be '32-bit
integers' without specifying the signedness.
2. When the FITS community voted earlier this year to incorporate the
description of variable length arrays into the official FITS standard, it
was decided that we should resolve this ambiguity and explicitly state
whether the descriptor is signed or unsigned. We conservatively decided to
define the descriptors as 'signed integers' because a) there were no FITS
implementations that supported unsigned descriptor values at that time, and
b) there was no precedent for directly supporting unsigned integers anywhere
in a FITS file. This decision was made even though it cuts the maximum
allowed size of the heap in half (from about 4.2 GB down to 2.1 GB).
3. A couple of months ago, CFITSIO was modified to demonstrate the
feasibility of supporting unsigned integer descriptors. This turned out to
be trivial to implement. It provides a practical demonstration that there
is no difficulty in supporting the larger heap size. Since there was never
any practical use for a negative descriptor value, it seems desirable
to reverse the decision made 6 months ago and now define the descriptors to
be unsigned integers, thus doubling the maximum size of the heap. This
change cannot affect or invalidate any existing FITS files, so it does not
violate the 'once FITS always FITS' rule.
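
The claim that existing files are unaffected can be checked with a small
sketch. It assumes a 'P' descriptor is stored as two big-endian 32-bit
integers (element count, then byte offset into the heap); the helper names
are hypothetical and this is not how CFITSIO itself is written.

```python
import struct


def pack_p_descriptor(nelem: int, offset: int) -> bytes:
    """Encode a hypothetical 'P' descriptor as two big-endian 32-bit words."""
    return struct.pack(">II", nelem, offset)


def read_signed(raw: bytes) -> tuple:
    """Decode the 8 descriptor bytes as signed 32-bit integers."""
    return struct.unpack(">ii", raw)


def read_unsigned(raw: bytes) -> tuple:
    """Decode the same 8 bytes as unsigned 32-bit integers."""
    return struct.unpack(">II", raw)


# An offset below 2**31, as in any file written under the signed rule,
# decodes to the same values under either interpretation.
old_style = pack_p_descriptor(100, 2_000_000_000)
print(read_signed(old_style), read_unsigned(old_style))

# An offset above 2**31 is only meaningful when read as unsigned;
# a signed reader would see a negative, unusable value.
new_style = pack_p_descriptor(100, 3_000_000_000)
print(read_signed(new_style), read_unsigned(new_style))
```

Every non-negative signed value has the same bit pattern as the
corresponding unsigned value, so only descriptors that were previously
undefined (negative) acquire a new meaning.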
Bill Pence
--
____________________________________________________________________
Dr. William Pence William.D.Pence at nasa.gov
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)