[fitsbits] 64-bit integers

Tue May 10 19:37:44 EDT 2005

LC's No-Spam Newsreading account writes:
 > Support to e.g. unsigned quantities is less uniform (e.g. possible but 
 > not native in java, not supported in Fortran), so I suggest we start 
 > dropping consideration of unsigned quantities.
 > 
Agreed, but as far as I know there is no current suggestion to introduce 
unsigned integers. We don't support them for 16- and 32-bit integers, so
64-bit integers should follow suit. The unsigned issue does come up every few years,
but every time the conclusion has been that the very minor efficiency gain
(compared with signed+offset) is not worth the complication. If desired
by some we could have that discussion again, but I see that as a separate 
issue (on which I have strong opinions :-)).
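
To make "signed+offset" concrete, here is a minimal Python sketch of the
usual 16-bit case. It assumes BSCALE = 1 and BZERO = 2**15, the offset
conventionally used for unsigned 16-bit data; the function names are
made up for illustration and are not from any FITS library.

```python
import struct

BZERO_U16 = 2**15  # assumed offset for unsigned 16-bit data (BSCALE = 1)

def encode_u16(value, bzero=BZERO_U16):
    """Store an unsigned 16-bit value as a big-endian signed 16-bit integer."""
    return struct.pack(">h", value - bzero)

def decode_u16(raw, bzero=BZERO_U16):
    """Recover the unsigned value: physical = stored + BZERO."""
    (stored,) = struct.unpack(">h", raw)
    return stored + bzero

# Round trip over the extremes of the unsigned range
for v in (0, 1, 32768, 65535):
    assert decode_u16(encode_u16(v)) == v
```

The same trick would apply to 64 bits with an offset of 2**63, at the
cost of one addition per value on read and write.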

 > Said that, I'm a keen supporter of Ockham's razor (in the particular 
 > forms "data types non sunt multiplicanda praeter necessitatem" :-), and 
 > so we should consider if the request of 64-bit integers do not go 
 > "beyond necessity".
 > 
Strongly agreed.

 > An INTEGER*8 allows to store a number (in the 2**+/-63 range) with 19 
 > digits of precision. This should be compared with 10 digits for an 
 > INTEGER*4 (in the 2**-31 range) or 15 digits offered by a REAL*8 (in a 
 > broader range but not with complete precision).
 > 
 > Really I fail to see a case for which one'd *really* need such absolute 
 > precision for a count image.
 > 
Note that with BZERO and BSCALE, BITPIX=64 need not be a count image. The
additional phase space does not bring any more use cases to my mind,
though ;-)
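
The digit counts quoted above can be checked directly. This is only a
quick sketch in Python: it uses Python's arbitrary-precision integers
to stand in for INTEGER*4 and INTEGER*8, and assumes REAL*8 is an IEEE
double, which is what Python's float is on common platforms.

```python
import sys

i4_max = 2**31 - 1   # largest signed 32-bit integer
i8_max = 2**63 - 1   # largest signed 64-bit integer

digits_i4 = len(str(i4_max))     # decimal digits in the INTEGER*4 maximum
digits_i8 = len(str(i8_max))     # decimal digits in the INTEGER*8 maximum
digits_r8 = sys.float_info.dig   # decimal digits a double always preserves

# Above 2**53 a double can no longer represent every integer:
# 2**53 + 1 rounds back to 2**53.
collapses = float(2**53 + 1) == float(2**53)

print(digits_i4, digits_i8, digits_r8, collapses)
```

So an integer count larger than 2**53 (about 9e15) is the point where
REAL*8 stops being exact and only a 64-bit integer would do.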

 > ----------------------------------------------------------------------
 > 4) do we need 64-bit array descriptor pointers 1Qt
 > ----------------------------------------------------------------------
 > 
 > Here again I'm quite perplexed. A 64-bit pointer seems something 
 > designed to address positions in large datasets (possibly too large to 
 > be practically "transported" using FITS files ... after all T is for 
 > transport).
 > 
Well, 32 bits only gets you to 2 GB (at most 4 GB if using tricks to 
get back the sign bit), which nowadays is not a spectacularly large 
dataset. We regularly transfer files (mostly images) that are about
that big over the network. For now we try to stay under 2 GB because
a few of our customers still have filesystems that don't deal with
bigger files, but that's definitely a restriction which we'd like
to lift fairly soon. 

As Harro mentioned, VLBI tables easily go over that limit. ALMA
certainly will as well. In a different context, a catalog with 200
bytes per row (for, say, a 5-colour data set) can only go to about
10 million rows, way below the galaxy counts in surveys like the SDSS
or the CFHTLS. Catalogs can obviously be split into smaller pieces and
put back together, but that's a management pain.
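
For reference, the arithmetic behind the row limit can be sketched in a
few lines of Python. The simplifying assumption here is that the only
constraint is a byte offset that must fit in the given number of bits;
the function name is invented for this example.

```python
ROW_BYTES = 200  # row width from the catalog example above

def max_rows(offset_bits, row_bytes=ROW_BYTES, signed=True):
    """Number of whole rows addressable with an offset of the given width."""
    usable_bits = offset_bits - 1 if signed else offset_bits
    return 2**usable_bits // row_bytes

rows_32_signed = max_rows(32)                  # 2 GB limit
rows_32_unsigned = max_rows(32, signed=False)  # 4 GB with the sign-bit trick
rows_64_signed = max_rows(64)

print(rows_32_signed, rows_32_unsigned, rows_64_signed)
```

With 32 bits this gives roughly 10.7 million rows (21.5 million with
the sign-bit trick), while 64 bits removes the limit for any catalog we
are likely to see.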

 > Also I noticed that 8.3.5 originally described array descriptors as 
 > "integers".
 > 
 > In the recent (IAUFWG approved) change it has been added a specification 
 > of "signed integers".
 > 
 > Now the proposed new change to support Q pointers reverses this wording 
 > and says "unsigned integers".
 > 
 > I would like to see a motivation of this reversal.
 > 
 > Also noting that unsigned quantities are otherwise only indirectly 
 > supported in FITS for all (old or new) integer quantities.
 > 
I would assume that the mention of unsigned integers for Q pointers
is a typo/mistake. Otherwise I'd strongly agree with you.

 > I've done some timing analysis myself, but not really fast timing, and 
 > in such cases I always managed to do things like : reading in Fortran a 
 > portion of spacecraft clock into an I*4, cast it into a REAL*8 using a 
 > C-wrapper (around udouble), scaling it by a 2**LSB resolution, 
 > adding/subtracting an offset and working with a REAL*8 quantity in 
 > seconds or milliseconds elapsed from a fixed time (usually 0 UT of the 
 > day when the observation began)
 > 
I am not a timer either, but my impression from previous discussions
is that REAL*8 does not have sufficient resolution to keep time over
several years at the precision needed for long-term phasing. The alternative
to I*8 is to store time as a pair of I*4 (LSB and HSB of the clock, or ticks 
since midnight and day number, or whatever...), but that ends up very specific
to individual instruments or datasets.
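
To put a rough number on the REAL*8 resolution, here is a small Python
sketch (it needs Python 3.9 or later for math.ulp). It assumes time is
kept as seconds elapsed since a fixed epoch in an IEEE double; the
10-year span and the helper names are only illustrative. It also shows
how a pair of 32-bit clock words could be combined into one 64-bit
integer, assuming both words are unsigned.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400

def double_resolution(elapsed_seconds):
    """Spacing between adjacent doubles at this elapsed time, in seconds."""
    return math.ulp(elapsed_seconds)

def combine_clock(hsb, lsb):
    """Merge high and low 32-bit clock words into one 64-bit tick count."""
    return ((hsb & 0xFFFFFFFF) << 32) | (lsb & 0xFFFFFFFF)

res_1_day = double_resolution(86400.0)
res_10_years = double_resolution(10 * SECONDS_PER_YEAR)

print(res_1_day, res_10_years)
print(combine_clock(0x12345678, 0x9ABCDEF0))
```

After ten years the spacing is 2**-24 s, i.e. about 60 ns, compared
with about 15 ps within the first day. Whether that is good enough
depends on the timing application; a 64-bit tick count has the same
resolution at any elapsed time.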