[fitsbits] Proposed Changes to the FITS Standard

Tue Aug 21 13:43:26 EDT 2007

Lucio Chiappetti wrote:

>  - a keyword is intended as a named resource to be mainly read by
>    software, maybe into a variable, and then be acted upon (all the
>    mandatory and WCS keywords, those defined by specific conventions,
>    etc.)
>
>  - a keyword just records some information associated to a file, which
>    is intended to be read by a human, but it is hardly relevant to any
>    software (essentially "commentary" keywords).

I'd suggest FITS keywords fall into three categories:

	1) FITS metadata, that is "data about FITS data" - examples start  
with the mandatory keywords, SIMPLE, XTENSION, BITPIX, NAXISn,  
PCOUNT, GCOUNT, but also CHECKSUM and DATASUM, etc.

	2) Science metadata, that is "data about the data represented within  
the HDU or file" - examples are DATE-OBS, EXPTIME, the slew of WCS  
keywords, etc.

	3) Provenance - this may be purely commentary, including COMMENT and
HISTORY, or it may be carried in keywords with values; the point is
that it describes not the file as it is, but rather how it came to
be.  The most obvious example here is DATE.

One can make these distinctions finer grained - for instance INHERIT  
is meta-science-metadata - but it isn't clear how useful that is  
likely to be.
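
As a rough illustration of the three categories above, a classifier might
look like the sketch below. The keyword sets are only the examples named in
this message (deliberately incomplete), the function name categorize is
hypothetical, and treating every unlisted keyword as science metadata is an
assumption made for brevity, not something the standard says.

```python
import re

# Example keywords taken from the message above; these sets are illustrative
# and incomplete, not an authoritative list from the FITS standard.
FITS_METADATA = {"SIMPLE", "XTENSION", "BITPIX", "NAXIS",
                 "PCOUNT", "GCOUNT", "CHECKSUM", "DATASUM"}
PROVENANCE = {"COMMENT", "HISTORY", "DATE"}

# NAXISn: NAXIS followed by a positive integer without leading zeros.
NAXISN = re.compile(r"NAXIS[1-9][0-9]*")


def categorize(keyword):
    """Return a coarse category name for a header keyword (hypothetical helper)."""
    name = keyword.strip().upper()
    if name in FITS_METADATA or NAXISN.fullmatch(name):
        return "fits-metadata"
    if name in PROVENANCE:
        return "provenance"
    # Assumption: anything else (DATE-OBS, EXPTIME, WCS keywords, ...) is
    # treated as science metadata.
    return "science-metadata"
```

The fallback branch is where finer-grained distinctions (INHERIT, say) would
have to go if anyone found them useful.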

>   DUPKWDS = 'none'       assures that the FITS file was written without
>                          any duplicated keywords
>
>   DUPKWDS = 'ignore'     (or 'comments') declares that duplicated
>                          keywords are of commentary nature, so they can
>                          be ignored by s/w or dealt with as HISTORY or
>                          COMMENTS
>
>   DUPKWDS = 'take_first' declare that only the first or last value
>   DUPKWDS = 'take_last'  shall be considered
>
>   DUPKWDS = 'concatenate' declare (string) values wanting to be
>                           concatenated (also numeric arrays ??)
>
> Any other cases possible ?

I suspect most will think we're reaching diminishing returns.  If we  
can't reach consensus on whether the first or last instance should  
take precedence then "indeterminate" it will have to be.  I'm still  
interested to hear of cases where the duplicates are intentional.   
Perhaps these would be addressed better through some other mechanism  
than duplication?
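
For concreteness, here is a minimal sketch of what honoring such a
declaration could look like in a reader, with "indeterminate" as the
fallback when no precedence is declared. DUPKWDS is only the proposal
quoted above, not part of the standard; the function name resolve, the
(keyword, value) tuple representation, and the use of None to stand in for
an indeterminate value are all assumptions made for illustration. The
'none' and 'ignore' cases are left out.

```python
INDEF = None  # assumed stand-in for an indeterminate value


def resolve(cards, keyword, policy="indeterminate"):
    """Return the value of `keyword` from `cards`, a list of
    (keyword, value) tuples in header order (hypothetical representation)."""
    values = [value for name, value in cards if name == keyword]
    if not values:
        raise KeyError(keyword)
    if len(values) == 1:
        return values[0]          # no duplication, nothing to decide
    if policy == "take_first":
        return values[0]
    if policy == "take_last":
        return values[-1]
    if policy == "concatenate":
        return "".join(str(value) for value in values)
    # No declared precedence: the value is indeterminate.
    return INDEF
```

Usage would be along the lines of resolve(cards, "OBSERVER", "take_last"),
with the policy read from the header if the file declares one.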

> But even with such conventions, we are still left with the problem of
> what a generic reader should do with (older or not) files not following
> any convention.

What is this generic reader people keep talking about?  Data is only  
ever read for some purpose.  If the purpose is to display the header  
to a human, then display both copies of duplicate keywords.  If the  
purpose is to semantically capture the value of such a keyword, INDEF  
seems appropriate (and we would do our users a favor by clarifying the
standard to say so).  If the purpose is to copy the input to the
output, copy it verbatim.  If the purpose is to validate the data  
structures, throw a warning if you want on detecting a duplicate  
keyword - just don't throw an error.  But if it is one of the key  
structural keywords, there is no need to clarify the standard to know  
to throw a big, fat, juicy error, e.g., duplicating BITPIX calls the  
parsing of the file into question.  Beneath every standard lies a  
bedrock of logic.
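
The validation case can be sketched in a few lines: warn on an ordinary
duplicate, raise on a structural one. This is an illustration under stated
assumptions, not a real validator. The function names and the particular
set of "structural" keywords are my own choices here, and COMMENT, HISTORY
and blank keywords are skipped because they are allowed to repeat.

```python
import warnings
from collections import Counter

# Assumed set of structural keywords for this sketch (NAXISn handled below).
STRUCTURAL = {"SIMPLE", "XTENSION", "BITPIX", "NAXIS", "PCOUNT", "GCOUNT"}
# Commentary keywords may legitimately appear any number of times.
COMMENTARY = {"COMMENT", "HISTORY", ""}


def is_structural(name):
    """True for the assumed structural keywords, including NAXISn."""
    return name in STRUCTURAL or (name.startswith("NAXIS") and name[5:].isdigit())


def check_duplicates(keywords):
    """Warn on duplicated keywords; raise if a structural keyword is duplicated.

    `keywords` is a list of keyword names in header order. Returns the
    sorted list of duplicated (non-commentary) keyword names.
    """
    counts = Counter(name for name in keywords if name not in COMMENTARY)
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    fatal = [name for name in duplicated if is_structural(name)]
    if fatal:
        # e.g. a duplicated BITPIX calls the parsing of the file into question
        raise ValueError("duplicated structural keyword(s): " + ", ".join(fatal))
    for name in duplicated:
        warnings.warn("duplicate keyword " + name)
    return duplicated
```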

A nod is as good as a wink to a blind horse.

Rob