Comments on NOST 100-1.2

Mon Jul 20 10:39:31 EDT 1998

Unless I am unusually dense (which always remains a possibility on
Monday mornings), the attached two paragraphs from Bill Pence's
message seem to be at odds with each other and do not really address
the issue John Davis was raising.

If John is right in his interpretation of what cfitsio does, things
are far from transparent to the programmer and we have a serious
problem - a problem that cannot be dismissed with "it's your
compiler's fault".  I tried his little test on a Sparc, using gcc
and, not surprisingly, it failed in the same way.  Hence, using gcc on
a 32-bit machine (hardly an uncommon combination) will get one into
trouble.

It means that anybody using the unsigned integer facilities in cfitsio
is at risk: all purportedly unsigned longs will come out one too small
on a common platform-compiler combination.  The fact that we have not
heard any complaints probably has to do with the rather recent
introduction of this feature which undoubtedly has not made it into a
lot of application code.

It would seem prudent, though less elegant, to test whether
BZERO/TZERO represents the signed/unsigned hack (yes, it _is_ a hack)
and then simply flip the most significant bit.

Could Bill comment on this?  Do we have a serious problem or do we
misunderstand the way things are handled in cfitsio?

Thanks,

  - Arnold Rots

William Pence wrote:
> The more important issue for programmers, I believe, is that standard
> FITS software like the CFITSIO library should provide a transparent
> interface for reading and writing unsigned integers in FITS files.
> CFITSIO provides a whole family of routines for reading or writing data
> of any supported datatype, including unsigned short, unsigned int, and
> unsigned long (in the C language interface) to FITS images or tables. 
> When using CFITSIO, the programmer does not need to be concerned in any
> way with how the data values are internally represented in the FITS
> file.  All the business with setting or reading the BZERO keyword value,
> etc. is all handled internally by the interface routines.
> 
> Finally, John Davis raised the objection that the BZERO keyword =
> 2147483648 that is used to represent unsigned 32-bit integers in FITS is
> not a valid 32-bit signed integer with some particular compilers (it is
> a valid signed integer value with many other compilers, however).  This
> is not a problem with the FITS format itself, but instead just
> illustrates that the implementation of FITS readers and writers on any
> given platform must be able to deal with the limitations of that
> platform.  Similar sorts of issues have to be addressed when
> reading/writing FITS files in Fortran, which doesn't even support an
> unsigned integer datatype at all.

John E. Davis wrote:
> On 12 Jul 1998 10:43:53 -0400, Thierry Forveille <Thierry.Forveille at obs.ujf-grenoble.fr>
> wrote:
> >I must say I have never seen any convincing argument for supporting
> >unsigned values. The present "hack" as you call it offers all the
> >functionalities for no cost, while adding unsigned as an additional
> >supported format would cost significant extra code to everybody for no
> >additional value. To me Occam's razor says it should stay out...
> 
> Please correct me if I am wrong, but my main objection to this
> convention is that it is inherently non-portable because the BZERO
> value that must be used is outside the range of the integer data type.
> For example, to represent an unsigned 32bit integer, BZERO must be set
> to 2147483648 in the fits file, which is outside the range of a 32 bit
> signed integer.  Why is this a problem?  It is a problem because
> parsing it as a 32 bit integer may result in its truncation.
> Specifically, CFITSIO uses the strtol function call to parse these
> values and this function will truncate 2147483648 to 2147483647.
> 
> To see this, consider:
> 
> #include <stdio.h>
> #include <stdlib.h>
> 
> int main ()
> {
>    long i;
>    char *s;
> 
>    s = "2147483648";
>    i = strtol (s, NULL, 10);
>    fprintf (stdout, "%s = %ld\n", s, i);
> 
>    s = "2147483647";
>    i = strtol (s, NULL, 10);
>    fprintf (stdout, "%s = %ld\n", s, i);
>    
>    return 0;
> }
> 
>      
> Compiled with gcc on my Linux system, this produces:
> 
>    2147483648 = 2147483647
>    2147483647 = 2147483647
> 
> The man page for strtol indicates:
> 
> RETURN VALUE
>        The  strtol()  function  returns the result of the conversion,
>        unless the value would underflow or overflow.  If an underflow
>        occurs, strtol() returns LONG_MIN.  If an overflow occurs,
>        strtol() returns LONG_MAX.  In both cases, errno is set to ERANGE.
> 
> It is possible to avoid strtol with a different function, but one would
> still need to perform integer multiplications which would result in
> overflow.  About the only portable alternative would be to use double
> precision arithmetic but even that may not be sufficient if you want
> to use this convention to represent 64 bit unsigned values.
> 
> So, without real support for unsigned types, I maintain that this
> convention is flawed.
> 
> Thanks,
> --John
> 
> 

--------------------------------------------------------------------------
Arnold H. Rots                                         AXAF Science Center
Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
60 Garden Street, MS 81                              fax:  +1 617 495 7356
Cambridge, MA 02138                             arots at head-cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------