[fitsbits] Output array type when BZERO is an integer {External}

Barrett, Paul pebarrett at email.gwu.edu
Mon Mar 11 14:11:35 EDT 2024


I asked this question because I had a similar problem with radio
astronomy's Measurement Set (MS) data format. I spent a lot of time trying
to decipher the C++ code in order to understand the format, because there
was no document that formalized the standard. The standard is the code. In
several cases, after deciphering a hundred or so lines of C++ code, I found
that I could provide the same functionality in a single line of code. I
eventually gave up. So my point is that code should
not be the standard: a clear and concise document should be. If you want
people to use it, then it is incumbent on the proponents to produce such a
document. Just as it is incumbent on me as a proponent of Julia to provide
the necessary software to do astronomy. I am sorry to say that I am not a
proponent of FITS, so I don't believe that it is incumbent on me to produce
such a document. I'm writing FITS.jl out of necessity so that I and others
can do science. Personally, I would prefer a more modern data format.

FITS.jl will eventually support tile compression, assuming that it is
documented in the standard and I can understand it.

Here is another suggestion for improving the document. Group keywords, both
required and optional, as they might appear in a heterogeneous or composite
data type (e.g., a C struct). All modern languages have such data types.
Discussing the keywords as a group, and showing how their presence, absence,
and values affect behaviour, would make the standard easier to understand
and give hints to the developer. This is what I do when reading through the
standard: I try to determine which keywords belong together in a composite
type and how keywords modify the behaviour of that type. Presenting them
this way would make it easier for any computer scientist to implement new
code, because that is the way they are trained to think.
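
As a rough illustration of this suggestion, the scaling keywords could be
grouped as below. This is a minimal sketch in Python, chosen only for
illustration; the name ImageScaling and the rule that any explicitly present
BZERO or BSCALE triggers scaling are assumptions of the sketch, not
statements from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageScaling:
    """Keywords that together describe how stored array values are scaled."""

    bitpix: int                     # required keyword
    bzero: Optional[float] = None   # optional; None means absent from the header
    bscale: Optional[float] = None  # optional; None means absent from the header

    @property
    def is_scaled(self) -> bool:
        # Assumed rule for this sketch: any explicit BZERO/BSCALE counts as scaling.
        return self.bzero is not None or self.bscale is not None

    def physical(self, stored: int):
        """Apply physical = BZERO + BSCALE * stored, using the default values."""
        if not self.is_scaled:
            return stored  # keep the stored integer untouched
        bzero = 0.0 if self.bzero is None else self.bzero
        bscale = 1.0 if self.bscale is None else self.bscale
        return bzero + bscale * stored
```

With this grouping, ImageScaling(bitpix=16).physical(7) stays an integer,
while ImageScaling(bitpix=16, bzero=0.0, bscale=1.0).physical(7) becomes a
float, which makes the presence-versus-absence question visible in one place.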

 -- Paul

On Mon, Mar 11, 2024 at 1:01 PM Dubois-Felsmann, Gregory P. <
gpdf at ipac.caltech.edu> wrote:

> Hi, Paul,
>
> I agree entirely with you that these matters should be formally
> clarified.  I don't have a lot of experience turning the crank of the
> FITS-standard engine, however, so I'd be hard-pressed to give you an
> estimate on how long this might take.  I do note that the history on
> https://fits.gsfc.nasa.gov/fits_standard.html indicates that the last
> time a point release was created was 19 years ago, and there's no recent
> history of "errata" or similar clarifications.
>
> I may be over-interpreting what either you or Rob said in this thread, but
> I didn't think Rob was suggesting in "other languages and libraries should
> start with the CFITSIO source for appropriate usage" that you wrap CFITSIO
> in Julia, but rather that you use it as a de-facto guide to the resolution
> of ambiguities.  I can't whole-heartedly endorse that, because I don't
> think the pressure should be taken off the standard to evolve to become
> more precise, but it is a realistic suggestion for you to use in order to
> keep working on your implementation: CFITSIO is very heavily used and I
> would judge it to be relatively unlikely that the community would choose a
> clarification to the standard that was inconsistent with its behavior in
> any mainstream situation.
>
> Gregory
>
> P.S.  I certainly agree with Rob in strongly encouraging you to implement
> tile compression.  Both of my projects (Rubin and SPHEREx) will generate
> public data products with compressed image extensions.
>
> --
> Gregory Dubois-Felsmann | Senior Staff Scientist | Caltech/IPAC
> Science Platform Scientist, Vera C. Rubin Observatory
> Pipeline System Designer, NASA SPHEREx mission
> Mail Code MR 100-22 | Pasadena, CA 91125-2200 | gpdf at ipac.caltech.edu
>
>
>
> ________________________________________
> From: Barrett, Paul <pebarrett at email.gwu.edu>
> Sent: Monday, March 11, 2024 09:34
> To: Dubois-Felsmann, Gregory P.
> Cc: fitsbits at listmgr.nrao.edu
> Subject: Re: [fitsbits] Output array type when BZERO is an integer
> {External}
>
> Greg,
>
> Thanks for clarifying the impact of this issue. You have clarified several
> points. It appears that my understanding of the document is different from
> what you have described for some keyword cases.
>
> Explicitly specifying the type of the output array, or its minimum type,
> in the document would make it much easier to implement libraries for new
> languages. Specifying the behaviour when a keyword is present or absent
> would also make implementation easier.
>
> Julia is a very concise, yet high performance, programming language, so it
> doesn't require a lot of code to implement the FITS standard. Because of
> this, I spend more time trying to understand the FITS standard than I do
> writing the code. This should not be the case. There are two reasons for
> having a native Julia FITS package. First, Julia has an excellent package
> manager, which makes it easy to install and maintain libraries or packages.
> It can be cumbersome to have to manage and maintain packages that wrap
> libraries in other languages. Second, Julia is a concise, high-performance
> array language, so its performance is comparable to, and likely better
> than, that of C/C++ or FORTRAN, with fewer lines of code. Because
> development of the package is in its early stages, I have not benchmarked
> it against CFITSIO, but I would not be surprised to see that it is faster
> in most cases.
>
>  -- Paul
>
>
>
> On Sun, Mar 10, 2024 at 5:41 PM Dubois-Felsmann, Gregory P. <
> gpdf at ipac.caltech.edu> wrote:
> This is a bit of a logical/legal hole in the FITS standard, for a couple
> of reasons given below.  I agree with Paul that the best solution would be
> to issue a clarification.
>
> I think it's an excellent moment for getting this right, when implementing
> a new client library from scratch instead of adiabatically as most of our
> others have been.
>
> These questions also arise in the behavior of client applications -- for
> instance, IPAC Firefly, which I have some responsibility for -- so this
> isn't purely an issue for theoretical discussion.
>
> 1) Lack of clarity about the interpretation of missing headers and when
> the scaling should be applied at all
>
> The FITS 4.00 standard specifies that BZERO and BSCALE both "shall" be
> floating point values, with defaults of "0.0" and "1.0", respectively --
> and there's no discussion of the absence of BZERO and BSCALE being treated
> as a special case.  Thus, if you take this completely literally, it means
> that the inherently floating-point scaling operation is *always* performed
> (even when it's mathematically a no-op) and that the result should
> therefore *always* be a floating-point array.  That is obviously not the
> spirit of the standard!  It has to be possible to deliver an integer array
> as the output, which means we need a specification of how to trigger that.
>
> Even then we are left to ask the question: if the absence of BZERO and
> BSCALE is supposed to trigger returning the array as its specified integer
> format, what is *explicit* specification of BZERO = 0.0 and BSCALE = 1.0
> supposed to do?  The most legalistic reading of the standard is that the
> absence of a keyword and the explicit presence of the default value for
> that keyword are supposed to be treated the same way, but then we have to
> define what it means for the explicitly specified value to be the same as
> the default.  The text says "1.0"; which of the following are equivalent:
> "1", "1.", "+1.0D0", "1.000000001" (equal to 1 in float32),
> "1.0000000000000000000000001", "0.99999999999999999999999999"?
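
To make the question concrete, the candidate strings can be parsed and
compared at both precisions. This is a minimal sketch in Python, used only
for illustration; the helper names parse_fits_real and as_float32 are
hypothetical, and the mapping of a Fortran-style "D" exponent to "E" is a
choice made for this sketch.

```python
import struct


def parse_fits_real(text: str) -> float:
    """Parse a FITS-style real literal into a float64.

    Python's float() rejects a Fortran-style 'D' exponent,
    so it is mapped to 'E' first.
    """
    return float(text.strip().upper().replace("D", "E"))


def as_float32(value: float) -> float:
    """Round a float64 to the nearest float32 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


candidates = [
    "1",
    "1.",
    "+1.0D0",
    "1.000000001",
    "1.0000000000000000000000001",
    "0.99999999999999999999999999",
]

# For each string: (equal to 1 as float64, equal to 1 as float32)
results = {}
for text in candidates:
    value = parse_fits_real(text)
    results[text] = (value == 1.0, as_float32(value) == 1.0)

for text, (eq64, eq32) in results.items():
    print(f"{text:<30} float64: {eq64}  float32: {eq32}")
```

Under these assumptions every candidate compares equal to 1 in float32, and
only "1.000000001" differs from 1 in float64, so the answer depends on the
precision the library happens to parse into.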
>
> If, as I do, you feel queasy about testing for floating-point equality in
> this situation, and you think "OK, the rule we should publish is:
>
> 'if either BZERO or BSCALE is present in the header, even if they are
> exactly 0.0 and 1.0 respectively, return the array in a floating-point
> format in which all possible values of the input are distinct in the
> output, if possible'"
>
> (meaning that BITPIX 32 would yield a float64 output, and BITPIX 64 would
> have to be annotated as an exception, since we can't rely on a float128
> being available), you have something that sounds defensible, but you still
> have another problem:
>
> 2) The unsigned-integer (and signed byte) special case
>
> The standard also recommends the use of special values of BZERO to allow
> representation of unsigned 16/32/64-bit integers (and also signed 8-bit
> integers):
>
> "... the BZERO keyword is also used when storing unsigned-integer values
> in the FITS array. In this special case the BSCALE keyword *shall* have the
> default value of 1.0, and the BZERO keyword *shall* have one of the integer
> values shown in Table 11."
>
> Again, the spirit of this is obviously that if BSCALE is present with the
> value 1.0, and BITPIX is 32, say, and BZERO is 2147483648, the returned
> data should have type uint32, not some floating-point type.  What is a
> client library supposed to do if BZERO is 2147483648.0?  The same?  What if
> it's 2147483648.0000000000000000000000000001?  (In other words, is it OK if
> the client library reads the RHS of the BZERO header into an internal
> float64 first, before interpreting it, or is it supposed to handle the RHS
> of the BZERO header as a string and compare it only to exactly the value in
> Table 11?)
>
> Note that in this case the standard actually appears to say point blank
> that a BSCALE of 1.0 *shall* be supplied; it certainly doesn't say, e.g.,
> "the BSCALE keyword shall be omitted".
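
The three readings of "has one of the values in Table 11" can be compared
side by side. This is a minimal sketch in Python, for illustration only; the
function names are hypothetical, the header text is assumed to be a plain
decimal literal, and apart from the BITPIX 32 entry mentioned above the table
values are quoted from memory and should be checked against Table 11.

```python
from decimal import Decimal

# BZERO values for the unsigned-integer convention, keyed by BITPIX.
# Only the BITPIX 32 entry appears in this thread; verify the rest
# against Table 11 of the standard.
TABLE_11 = {
    8: "-128",  # signed byte stored in an unsigned 8-bit array
    16: "32768",
    32: "2147483648",
    64: "9223372036854775808",
}


def matches_as_string(bitpix: int, bzero_text: str) -> bool:
    """Strictest reading: the header text must equal the table entry exactly."""
    return bzero_text.strip() == TABLE_11[bitpix]


def matches_as_float64(bitpix: int, bzero_text: str) -> bool:
    """Loosest reading: parse into a float64 first, then compare."""
    return float(bzero_text) == float(TABLE_11[bitpix])


def matches_exactly(bitpix: int, bzero_text: str) -> bool:
    """Middle ground: compare numerically, with no rounding of the text."""
    return Decimal(bzero_text.strip()) == Decimal(TABLE_11[bitpix])
```

In this sketch "2147483648.0" fails only the string comparison, while
"2147483648.0000000000000000000000000001" passes only the float64 one. The
float64 reading is also lossy for BITPIX 64: "9223372036854775807" rounds to
the same float64 as the table entry and would be accepted.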
>
> NB: I have not tried to search the fitsbits archive -- I would not be at
> all surprised if this had come up before.
>
> Gregory
>
> ________________________________________
> From: fitsbits <fitsbits-bounces at listmgr.nrao.edu> on behalf of
> Barrett, Paul via fitsbits <fitsbits at listmgr.nrao.edu>
> Sent: Saturday, March 9, 2024 11:20
> To: fitsbits at listmgr.nrao.edu
> Subject: [fitsbits] Output array type when BZERO is an integer {External}
>
> I'm writing a FITS package for the Julia programming language. I have a
> question about the output type of the image when BZERO is an integer value.
> The documentation implies that the output image should be a floating point
> type because the BSCALE value is a float. Is this correct? If yes, then I
> recommend stating this explicitly in the FITS standard documentation. I
> also recommend suggesting the appropriate output type depending on the
> input type, e.g., UInt8 => Float32, Int16 => Float32, etc.
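
One way to derive such a mapping is from the significand width of each
floating-point format, since a format with p significand bits represents
every integer of magnitude up to 2^p exactly. This is a minimal sketch in
Python, for illustration only; the function name smallest_exact_float is
hypothetical, and the sketch covers only the stored integers, not the result
of applying an arbitrary BZERO or BSCALE.

```python
# Significand precision in bits (including the implicit leading bit)
# of the IEEE 754 binary32 and binary64 formats.
SIGNIFICAND_BITS = {"float32": 24, "float64": 53}


def smallest_exact_float(bitpix: int):
    """Return the narrowest float type that represents every integer of the
    given bit width exactly, or None if neither float32 nor float64 can."""
    for name, bits in SIGNIFICAND_BITS.items():
        if bitpix <= bits:
            return name
    return None


mapping = {bitpix: smallest_exact_float(bitpix) for bitpix in (8, 16, 32, 64)}
print(mapping)
```

Under this criterion 8-bit and 16-bit integers fit in float32, 32-bit
integers need float64, and 64-bit integers have no exact float32 or float64
representation.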
>
> Thanks,
> Paul
>
>