[fitsbits] Proposed Changes to the FITS Standard
William Pence
pence at milkyway.gsfc.nasa.gov
Fri Aug 17 15:55:47 EDT 2007
Rob Seaman wrote:
> Bill said:
>
>> The "once FITS always FITS" philosophy captures the spirit of FITS,
>> but in practice each new version of the FITS Standard has imposed new
>> requirements that in principle could invalidate existing FITS files.
>> For example, version 2.0 of the FITS Standard introduced a new
>> requirement that the value and comment fields in a keyword MUST be
>> separated by a slash character.
>
> It would be interesting to review past such instances. I don't
> personally recall changes of this mandatory nature. The example
> regarding comments is pretty tame since any reasonable implementation
> would already be ignoring the comments. Do you have another example
> to quote?
Some other new requirements were:
- keyword values are restricted to be a single value, not an array
- logical keyword values must consist of a single T or F followed
only by a space or a slash character
- integer and float keyword values must not contain embedded spaces
- complex keyword values must be enclosed in parentheses
- no other keywords may intervene between the mandatory keywords in
the primary array or extension
- the TFORM keyword values must be upper case (e.g., F5.2, not f5.2)
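A few of the rules listed above can be expressed as simple checks on the raw value field of a keyword record. This is only a rough sketch: the function names and the regular expression are illustrative assumptions, not text from the Standard, and the value is assumed to have already been separated from any trailing comment.

```python
import re

# Assumed pattern for an integer or float value with no embedded spaces.
# Leading and trailing blanks are tolerated; an E or D exponent is optional.
NUMBER = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([ED][+-]?\d+)?\s*")

def is_logical(value):
    """True if the value field is a single T or F, ignoring padding."""
    return value.strip() in ("T", "F")

def is_number(value):
    """True if the value field is one number with no embedded spaces."""
    return NUMBER.fullmatch(value) is not None

def is_complex(value):
    """True if the value field is enclosed in parentheses (contents unchecked)."""
    stripped = value.strip()
    return stripped.startswith("(") and stripped.endswith(")")

def tform_is_upper(value):
    """True if a TFORM value such as F5.2 contains no lower-case letters."""
    return value == value.upper()
```

A reader that wants to accept older files could log a failed check as a warning instead of rejecting the header.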
>> There are only 3 proposed new absolute requirements in this list:
>>
>> 1. Keywords that have a value shall not be repeated in a header.
>
> I have many examples (hundreds of thousands?) of files in which
> keywords are repeated. Rather than the wording in the current
> proposal, I would replace the attempt at a requirement with a strong
> recommendation and a clarification that the final copy of any such
> repeated keyword should take precedence.
Imposing a new requirement on software systems to read the last instance
of the keyword would likely have a lot of negative repercussions.
Current software systems produce different results when reading a FITS
file with duplicate keywords. CFITSIO cyclically scans the header for
the next occurrence of the keyword following the last keyword that was
read or written, so the same application may read a different value
depending on exactly what processing was done beforehand. I'm sure
some other commonly used software systems always return the first
instance of the keyword, while others always return the last
instance. Requiring all software systems to follow the same behavior is
not practical, so the only sure way to prevent users from getting an
incorrect result when analyzing the file is to eliminate duplicate
keywords in the first place. There is less harm if the duplicated
keywords all have the same value, so maybe the wording of this
requirement should be modified to take this into account.
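As a rough illustration of the distinction drawn above, the following sketch scans a list of header cards, reports every valued keyword that appears more than once, and then separates out the cases where the repeated values actually differ. The helper names are hypothetical, the cards are shortened from the full 80 characters for readability, and slashes inside quoted strings are not handled.

```python
from collections import defaultdict

def keyword_of(card):
    """Return the keyword name from columns 1-8 of a header card."""
    return card[:8].strip()

def has_value(card):
    """A card carries a value when columns 9-10 hold '= '."""
    return card[8:10] == "= "

def value_field(card):
    """Return the raw value text, cut at the first slash (simplification)."""
    return card[10:].split("/", 1)[0].strip()

def find_duplicates(cards):
    """Map each repeated valued keyword to the list of its raw values."""
    seen = defaultdict(list)
    for card in cards:
        if has_value(card):
            seen[keyword_of(card)].append(value_field(card))
    return {k: v for k, v in seen.items() if len(v) > 1}

header = [
    "EXPTIME =                 30.0 / first",
    "OBJECT  = 'M31     '",
    "EXPTIME =                 60.0 / second",
    "GAIN    =                  2.0",
    "GAIN    =                  2.0",
]

dups = find_duplicates(header)
# Only duplicates whose values disagree can give different answers to
# a "first instance" reader and a "last instance" reader.
conflicting = {k: v for k, v in dups.items() if len(set(v)) > 1}
```

Here `dups` contains both EXPTIME and GAIN, while `conflicting` contains only EXPTIME, which is the case where different readers would disagree.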
>> 2. PCOUNT and GCOUNT must immediately follow the last NAXISn
>> keyword in all conforming extensions (as is already required
>> in IMAGE, TABLE, and BINTABLE extensions).
>
> I guess I'd like to know if there are any such extensions.
There are: at least some of your FOREIGN extensions have the order of
these 2 keywords reversed.
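A check for this ordering rule could look something like the sketch below. It assumes the keyword names have already been extracted from the extension header in order; the function name is made up for illustration.

```python
def keyword_order_ok(keywords):
    """Check that PCOUNT then GCOUNT directly follow the last NAXISn keyword.

    `keywords` is the ordered list of keyword names in an extension header.
    """
    naxis_positions = [
        i for i, k in enumerate(keywords)
        if k.startswith("NAXIS") and k[5:].isdigit()
    ]
    if naxis_positions:
        last = naxis_positions[-1]
    elif "NAXIS" in keywords:
        # With NAXIS = 0 there are no NAXISn keywords; use NAXIS itself.
        last = keywords.index("NAXIS")
    else:
        return False
    return keywords[last + 1:last + 3] == ["PCOUNT", "GCOUNT"]

good = ["XTENSION", "BITPIX", "NAXIS", "NAXIS1", "NAXIS2",
        "PCOUNT", "GCOUNT", "EXTNAME"]
swapped = ["XTENSION", "BITPIX", "NAXIS", "NAXIS1",
           "GCOUNT", "PCOUNT", "EXTNAME"]
```

With these inputs `keyword_order_ok(good)` is true and `keyword_order_ok(swapped)` is false.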
>
>> 3. Embedded space characters are now forbidden within numeric
>> values in an ASCII Table (e.g. "1 23 4.5" is no longer
>> allowed to represent the decimal value 1234.5)
>
> Again - are there any examples of such usage in the field?
No, as far as we know. If there are any, then it is very likely that
most current software systems do not support embedded spaces in the
value and will silently read an incorrect value, or will exit with an
error. Thus, it seems better to me to outlaw this usage rather than
just not recommend it or deprecate it.
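To make the failure modes concrete, here is a small sketch of three ways a reader might treat the field "1 23 4.5". The three function names are hypothetical and do not describe any particular FITS library.

```python
def parse_strict(field):
    """Parse a numeric field, allowing only leading and trailing blanks.

    Raises ValueError if the field contains embedded spaces.
    """
    return float(field.strip())

def parse_ignoring_blanks(field):
    """Drop every blank before converting, the reading under which
    "1 23 4.5" means 1234.5."""
    return float(field.replace(" ", ""))

def parse_first_token(field):
    """Convert only the first whitespace-separated token, which silently
    yields a wrong value for a field with embedded spaces."""
    return float(field.split()[0])

field = "1 23 4.5"

try:
    strict_result = parse_strict(field)
except ValueError:
    strict_result = None  # the "exit with an error" case

blank_result = parse_ignoring_blanks(field)
token_result = parse_first_token(field)
```

The strict parser rejects the field, the blank-ignoring parser recovers 1234.5, and the tokenizing parser quietly returns 1.0.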
(...)
>
> And should new dragons appear that the community deems must be slain,
> it does indeed appear to this observer that an explicit version
> keyword (whether a comment or not) should be simultaneously required
> to trigger new conformance restrictions.
I don't really see any practical benefit to having a version keyword.
Either the software will support a new requirement, or it won't; the
presence of a version (or DATE) keyword isn't really helpful, except
maybe to a human reading the header.
> The loose wording about pre-existing data is unenforceable since
> there is no requirement (whether or not there ought to be) for a DATE
> keyword to separate old from new. Perhaps the new version tag could
> itself supply a date - in that case, I'd recommend that any revisions
> of the standard should contain explicit references to the date(s)
> that apply for different feature(s).
The proposed new statement ("Existing FITS files that conformed to the
latest version of the standard at the time the files were created are
expressly exempt from any new requirements imposed by subsequent
versions of the standard.") is, I think, mainly intended as a political
statement to reassure institutions that the FITS committees are not
imposing new unfunded mandates that require modifications to existing
FITS archives. I don't see this statement as having much relevance to
the way software is implemented.
Bill Pence
--
____________________________________________________________________
Dr. William Pence pence at milkyway.gsfc.nasa.gov
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)