[fitsbits] FITS 'P' descriptors: signed or unsigned?
Doug Tody
dtody at nrao.edu
Wed Jun 15 18:41:16 EDT 2005
Hi Bill -
Unless someone can come up with a compelling reason why this causes a
technical problem I would support it (using unsigned for 32-bit pointers).
The 2 GB data size limit is getting to be a major problem which we have
to deal with. The real solution is 64-bit support, but a factor of 2 for
something like this makes a big difference. The only issue I can see is
that older programs not expecting unsigned would interpret such offsets
as having a negative value and probably reject the file. In the worst
case (software fails to check for a negative value) a pointer error could
occur and invalid data could be returned.
- Doug
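
The failure mode described above (an older reader seeing a large unsigned
offset as negative) can be illustrated with a minimal sketch. It assumes
only that FITS stores integers big-endian; the function name decode_offset
and the 3 GB example value are hypothetical and not taken from any FITS
library.

```python
import struct


def decode_offset(raw: bytes) -> tuple[int, int]:
    """Return (signed, unsigned) readings of one big-endian 32-bit field."""
    (signed,) = struct.unpack(">i", raw)    # how an older, signed-only reader sees it
    (unsigned,) = struct.unpack(">I", raw)  # how an unsigned-aware reader sees it
    return signed, unsigned


# Hypothetical heap offset of 3 GiB: too large for a signed 32-bit field.
raw = struct.pack(">I", 3 * 1024**3)
signed, unsigned = decode_offset(raw)

# A reader that validates offsets would reject the file at this point;
# one that does not would go on to use a bad pointer.
offset_is_valid_for_old_reader = signed >= 0
```

Here the same four bytes read as 3221225472 when unsigned and as a negative
number when signed, which is why a sign check in older software matters.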
On Wed, 15 Jun 2005, William Pence wrote:
> This note concerns a relatively small technical issue in the larger proposal
> to add 64-bit integer support to FITS:
>
> At issue is whether to reverse the recent decision to define the 'P'
> variable-length array descriptors in FITS binary tables to be a pair of
> 'signed 32-bit integers', and make them 'unsigned 32-bit integers' instead.
> The first integer gives the number of elements in the array, and the 2nd
> integer gives the byte offset in the heap to the first element of the array.
> The practical consequence of this change is that it will double the
> allowed heap size from about 2.1 GB to about 4.2 GB.
>
> This is not just a theoretical issue because there are existing applications
> that can easily produce FITS files with binary table heaps larger than 2.1
> GB (e.g., using the 'tiled' image compression convention where the
> compressed rows of an image are stored in a variable length array table
> column). Allowing this extra factor of 2 in size will benefit software
> applications that would otherwise need to be rewritten to use the proposed
> 'Q' 64-bit descriptors (assuming that the 'Q' type is eventually approved by
> the FITS committees). There are no technical reasons not to support
> unsigned descriptor values (e.g., it is impossible to have negative
> descriptors). Forcing the descriptors to be signed 32-bit integers
> artificially cuts in half the potential size of the heap.
>
> The main argument for keeping the descriptors as signed integers is that
> FITS has never supported unsigned integers as a raw data type (although it
> does support unsigned integers by applying an offset to the FITS signed
> integer values). Thus, it is argued, the definition of FITS remains more
> 'pure' if we don't introduce unsigned integers in this case. There is
> however a real distinction between the array descriptor values and the other
> FITS table column data types because the descriptor values themselves are
> almost never directly accessible at the application software level.
> Instead, the descriptor values are only used by the low-level FITS interface
> software routines, when accessing the arrays that the descriptor points to.
>
> I don't consider this to be a major issue, but given a choice, I think the
> practical advantages of doubling the allowed size of the heap outweigh the
> more intangible 'purity of FITS' issue.
>
> How do others feel about this issue? Is there a clear consensus one way or
> the other? Should the FITS committees be explicitly asked to vote on a
> preference?
>
> This issue does not affect the proposed 'Q' 64-bit descriptors, because
> even signed 64-bit integers provide vastly more address space than could
> conceivably be used by any applications in the foreseeable future.
> Presumably the sign of the 'Q' descriptors should be defined to be the same
> as whatever is decided for the 'P' descriptors.
>
> As a final note, to put this in historical perspective, the original FITS
> binary table definition paper did not specify the sign of the descriptor
> integers. It was only when the variable-length array convention was
> approved by the FITS committees earlier this year that the wording was made
> more rigorous to define the sign. The reason for choosing 'signed' rather
> than 'unsigned' was mainly because at the time there did not exist any
> software implementations that supported unsigned descriptor values.
> Subsequently, some FITS libraries (e.g., CFITSIO) have been enhanced to
> support unsigned descriptor values. If we make this change now, it will
> reverse a decision that was only finally approved in April 2005. Also, it
> will not invalidate any existing FITS files, because the positive, signed
> descriptor values can always be treated as unsigned integers.
>
> Bill Pence
>
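
As a rough sketch of the descriptor layout in the quoted note: a 'P'
descriptor is a pair of 32-bit integers, the element count followed by the
byte offset into the heap. The helper read_p_descriptor and the sample
values below are hypothetical, and big-endian storage is assumed as
elsewhere in FITS. The sketch shows the two size limits and the
backward-compatibility point, namely that any non-negative signed
descriptor decodes to the same values when read as unsigned.

```python
import struct

SIGNED_MAX = 2**31 - 1    # 2147483647 bytes, about 2.1 GB
UNSIGNED_MAX = 2**32 - 1  # 4294967295 bytes, about 4.2 GB


def read_p_descriptor(raw: bytes, unsigned: bool = True) -> tuple[int, int]:
    """Decode an 8-byte 'P' descriptor into (element count, heap byte offset)."""
    fmt = ">II" if unsigned else ">ii"
    nelem, offset = struct.unpack(fmt, raw)
    return nelem, offset


# A descriptor written under the signed definition (hypothetical values).
old = struct.pack(">ii", 1000, 2_000_000_000)
same_either_way = read_p_descriptor(old, unsigned=True) == read_p_descriptor(
    old, unsigned=False
)

# A descriptor that only the unsigned definition can express.
big = struct.pack(">II", 1, 3_000_000_000)
big_unsigned = read_p_descriptor(big, unsigned=True)
big_signed = read_p_descriptor(big, unsigned=False)
```

Existing files decode identically under both definitions; only descriptors
with offsets above SIGNED_MAX differ, and those could not be written under
the signed definition in the first place.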