[fitsbits] Proposed Changes to the FITS Standard

William Pence pence at milkyway.gsfc.nasa.gov
Fri Aug 17 15:55:47 EDT 2007


Rob Seaman wrote:
> Bill said:
> 
>> The "once FITS always FITS" philosophy captures the spirit of FITS, but
>> in practice each new version of the FITS Standard has imposed new
>> requirements that in principle could invalidate existing FITS files.
>> For example, version 2.0 of the FITS Standard introduced a new
>> requirement that the value and comment fields in a keyword MUST be
>> separated by a slash character.
> 
> It would be interesting to review past such instances.  I don't  
> personally recall changes of this mandatory nature.  The example  
> regarding comments is pretty tame since any reasonable implementation  
> would already be ignoring the comments.  Do you have another example  
> to quote?

Some other new requirements were:

- keyword values are restricted to be a single value, not an array
- logical keyword values must consist of a single T or F followed
   only by a space or a slash character
- integer and float keyword values must not contain embedded spaces
- complex keyword values must be enclosed in parentheses
- no other keywords may intervene between the mandatory keywords in
   the primary array or extension
- the TFORM keyword values must be upper case (e.g., F5.2, not f5.2)
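
As a rough illustration, the value-field rules listed above could be 
checked along the following lines.  This is only a sketch: the names 
classify_value and tform_is_upper and the regular expressions are 
assumptions made for this example, not text from the Standard or code 
from any FITS library, and the value field is assumed to have already 
been separated from the comment.

```python
import re

# Hypothetical patterns, for illustration only.
INTEGER = re.compile(r"[+-]?\d+")
FLOAT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([ED][+-]?\d+)?")
COMPLEX = re.compile(r"\(\s*([^\s,]+)\s*,\s*([^\s,]+)\s*\)")


def is_number(text):
    """True if text is one integer or float with no embedded spaces."""
    return bool(INTEGER.fullmatch(text) or FLOAT.fullmatch(text))


def classify_value(field):
    """Classify a keyword value field (comment already removed).

    Raises ValueError for anything that is not a single quoted string,
    logical, integer, float, or parenthesized complex value.
    """
    text = field.strip()
    if text.startswith("'"):
        return "string"
    if text in ("T", "F"):          # a single T or F, nothing else
        return "logical"
    if INTEGER.fullmatch(text):     # no embedded spaces allowed
        return "integer"
    if FLOAT.fullmatch(text):
        return "float"
    m = COMPLEX.fullmatch(text)     # complex values need parentheses
    if m and is_number(m.group(1)) and is_number(m.group(2)):
        return "complex"
    raise ValueError(f"non-conforming value field: {field!r}")


def tform_is_upper(tform):
    """TFORM values such as 'F5.2' must not contain lower-case letters."""
    return tform == tform.upper()
```

With these definitions, a value such as "1 234" or an unparenthesized 
pair such as "1.0, 2.0" is rejected, while "(1.0, 2.0)" is accepted as 
complex.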

>>   There are only 3 proposed new absolute requirements in this list:
>>
>>   1. Keywords that have a value shall not be repeated in a header.
> 
> I have many examples (hundreds of thousands?) of files in which  
> keywords are repeated.  Rather than the wording in the current  
> proposal, I would replace the attempt at a requirement with a strong  
> recommendation and a clarification that the final copy of any such  
> repeated keyword should take precedence.

Imposing a new requirement on software systems to read the last instance 
of the keyword would likely have a lot of negative repercussions. 
Current software systems produce different results when reading a FITS 
file with duplicate keywords.  CFITSIO cyclically scans the header for 
the next occurrence of the keyword following the last keyword that was 
read or written, so the same application may read a different value 
depending on exactly what processing was done beforehand.  I'm sure 
some commonly used software systems will always return the first 
instance of the keyword, while others will always return the last 
instance.  Requiring all software systems to follow the same behavior is 
not practical, so the only sure way to prevent users from getting an 
incorrect result when analyzing the file is to eliminate duplicate 
keywords in the first place.  There is less harm if the duplicated 
keywords all have the same value, so maybe the wording of this 
requirement should be modified to take this into account.
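
To make the ambiguity concrete, here is a toy model of the three lookup 
strategies described above.  The ToyHeader class and its method names 
are hypothetical; the cyclic search only mimics the behavior described 
in this message and is not CFITSIO's code or API.

```python
class ToyHeader:
    """A header modeled as an ordered list of (keyword, value) pairs."""

    def __init__(self, cards):
        self.cards = list(cards)
        self.pos = 0  # index at which the next cyclic search starts

    def read_first(self, key):
        """Always return the first instance of the keyword."""
        for k, v in self.cards:
            if k == key:
                return v
        raise KeyError(key)

    def read_last(self, key):
        """Always return the last instance of the keyword."""
        for k, v in reversed(self.cards):
            if k == key:
                return v
        raise KeyError(key)

    def read_cyclic(self, key):
        """Search forward from the card after the last one read,
        wrapping around to the top of the header if necessary."""
        n = len(self.cards)
        for step in range(n):
            i = (self.pos + step) % n
            if self.cards[i][0] == key:
                self.pos = i + 1
                return self.cards[i][1]
        raise KeyError(key)


def conflicting_duplicates(cards):
    """Keywords that occur more than once with differing values."""
    seen = {}
    for k, v in cards:
        seen.setdefault(k, set()).add(v)
    return sorted(k for k, vals in seen.items() if len(vals) > 1)


cards = [("EXPTIME", 10.0), ("OBJECT", "M31"), ("EXPTIME", 20.0)]
h = ToyHeader(cards)
h.read_cyclic("OBJECT")            # earlier processing moves the position
print(h.read_first("EXPTIME"),     # 10.0
      h.read_last("EXPTIME"),      # 20.0
      h.read_cyclic("EXPTIME"))    # 20.0 here, but 10.0 on a fresh header
```

The conflicting_duplicates helper reports only keywords whose repeated 
instances disagree, which corresponds to the relaxed wording suggested 
above for duplicates that all carry the same value.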

>>   2. PCOUNT and GCOUNT must immediately follow the last NAXISn
>>      keyword in all conforming extensions (as is already required
>>      in IMAGE, TABLE, and BINTABLE extensions).
> 
> I guess I'd like to know if there are any such extensions.  

There are: at least some of your FOREIGN extensions have the order of 
these 2 keywords reversed.
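
A check for this ordering might look like the sketch below.  The 
function name pcount_gcount_in_place is hypothetical, the input is 
assumed to be the list of keyword names in header order, and the 
PCOUNT-then-GCOUNT order is taken from the existing rule for IMAGE, 
TABLE, and BINTABLE extensions.

```python
import re

NAXISN = re.compile(r"NAXIS[1-9]\d*")


def pcount_gcount_in_place(keywords):
    """True if PCOUNT and GCOUNT directly follow the last NAXISn
    keyword (or NAXIS itself when there are no NAXISn keywords)."""
    keywords = list(keywords)
    last_axis = None
    for i, key in enumerate(keywords):
        if key == "NAXIS" or NAXISN.fullmatch(key):
            last_axis = i
    if last_axis is None:
        return False
    return keywords[last_axis + 1:last_axis + 3] == ["PCOUNT", "GCOUNT"]
```

A header with the two keywords reversed, or with any other keyword 
between NAXISn and PCOUNT, makes the function return False.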

> 
>>   3. Embedded space characters are now forbidden within numeric
>>      values in an ASCII Table (e.g.  "1 23 4.5"  is no longer
>>      allowed to represent the decimal value 1234.5)
> 
> Again - are there any examples of such usage in the field?

No, as far as we know.  If there are any, then it is very likely that 
most current software systems do not support embedded spaces in the 
value and will silently read an incorrect value, or will exit with an 
error.  Thus, it seems better to me to outlaw this usage rather than 
just not recommend it or deprecate it.
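
The three outcomes mentioned above (the intended value, a silently 
wrong value, or an error) can be reproduced with a small sketch.  The 
function names are hypothetical and do not correspond to any particular 
FITS reader; Fortran-style D exponents are not handled here.

```python
def parse_ignoring_blanks(field):
    """Older interpretation: blanks inside the field are skipped."""
    return float(field.replace(" ", ""))


def parse_first_token(field):
    """A naive reader that stops at the first blank after the number
    starts, silently truncating the value."""
    return float(field.split()[0])


def parse_strict(field):
    """Reject any blank inside the number; leading and trailing
    blanks around the field are still accepted."""
    text = field.strip()
    if " " in text:
        raise ValueError(f"embedded space in numeric field: {field!r}")
    return float(text)


field = "1 23 4.5"
print(parse_ignoring_blanks(field))  # 1234.5
print(parse_first_token(field))      # 1.0
try:
    parse_strict(field)
except ValueError as err:
    print("rejected:", err)
```

Python's own float() also refuses "1 23 4.5", so a reader built 
directly on it would fall into the "exit with an error" category.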

(...)

> 
> And should new dragons appear that the community deems must be slain,  
> it does indeed appear to this observer that an explicit version  
> keyword (whether a comment or not) should be simultaneously required  
> to trigger new conformance restrictions.  

I don't really see any practical benefit to having a version keyword. 
Either the software will support a new requirement, or it won't; the 
presence of a version (or DATE) keyword isn't really helpful, except 
maybe to a human reading the header.

> The loose wording about
> pre-existing data is unenforceable since there is no requirement (whether
> or not there ought to be) for a DATE keyword to separate old from  
> new.  Perhaps the new version tag could itself supply a date - in  
> that case, I'd recommend that any revisions of the standard should  
> contain explicit references to the date(s) that apply for different  
> feature(s).

The proposed new statement ("Existing FITS files that conformed to the 
latest version of the standard at the time the files were created are 
expressly exempt from any new requirements imposed by subsequent 
versions of the standard.") is, I think, mainly intended as a political 
statement to reassure institutions that the FITS committees are not 
imposing new unfunded mandates that require modifications to existing 
FITS archives.  I don't see this statement as having much relevance to 
the way software is implemented.

Bill Pence
-- 
____________________________________________________________________
Dr. William Pence                       pence at milkyway.gsfc.nasa.gov
NASA/GSFC Code 662       HEASARC        +1-301-286-4599 (voice)
Greenbelt MD 20771                      +1-301-286-1684 (fax)




