[fitsbits] Fwd: Re: multiple keyword occurance in header

Tue Aug 21 07:37:33 EDT 2007

Two (smallish) problems with 
HISTORY text 1
HISTORY text 2 ...

1. If the original text plus comment fills the full 80-byte card
   image, the last 8 characters, usually part of the comment, will
   be lost.

2. HISTORY records are, after the initial 8 bytes, free format.
   If we don't use the BLOCK HISTORY construction (I think the
   keyword names I suggested for it are terrible, but that can be
   fixed later), then we will have mixed series
   of "real" HISTORY records (thus free format) and "BLOCK"
   history records, which are strictly formatted according to
   FITS standards (e.g. must be KEYWORD= or COMMENT or HISTORY after
   the initial HISTORY keyword).  If an interpreter wants to unpack
   the BLOCK records (maybe LITERAL or NAMESPACE are better words,
   see below) without having the special BEGIN/END as clues, it
   will run into unparseable standard HISTORY records.
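
Both problems can be illustrated with a minimal Python sketch.  The
function names and the "looks like a card" heuristic are hypothetical
and are not part of any FITS library or of the standard; the sketch
only assumes ASCII cards of 80 characters.

```python
CARD_LEN = 80  # FITS header cards are fixed 80-character records

def wrap_in_history(card: str) -> str:
    """Prefix a card with 'HISTORY ' and cut the result back to 80 characters."""
    return ("HISTORY " + card)[:CARD_LEN]

def looks_like_card(text: str) -> bool:
    """Hypothetical heuristic: does a HISTORY payload look like an embedded card?"""
    keyword = text[:8].rstrip()
    if keyword in ("COMMENT", "HISTORY"):
        return True
    # A value card has an upper-case keyword followed by "= " in columns 9-10.
    legal = all(c.isupper() or c.isdigit() or c in "-_" for c in keyword)
    return keyword != "" and legal and text[8:10] == "= "
```

A full 80-character card loses its last 8 characters when wrapped
(problem 1), and a free-format record such as "HISTORY COMMENT on the
reduction" is classified as an embedded card by the heuristic even
though it is plain text (problem 2).  Without BEGIN/END markers an
interpreter has nothing better than such a guess.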

>Well, it's a good thought that we might want to understand why issues  
>like duplicate keywords occur before making changes to the standard.   
>But I'm not sure I see the advantage of:
>
>	BLOCK HISTORY BEGIN
>	text 1
>	text 2
>	text 3
>	BLOCK HISTORY END
>
>to
>
>	HISTORY text 1
>	HISTORY text 2
>	HISTORY text 3
>
>other than gaining 8 characters per line.
>
Clearly I was naive in expecting that repeated keywords come mostly
from history concatenations.  As stated below by Rob, it becomes clear
that the main problem is a namespace problem: different input streams
use the same keyword with different meanings.  ESO "solves" this problem
by using its HIERARCH keyword structure, which is nonstandard and
restricted to ESO applications.  I speculate on a slightly grander
scheme than BLOCK HISTORY, which is something like:
NAMESPACE BEGIN 
NAME=
DATE=
KEYWORD1=
KEYWORD2=
NAMESPACE END

Which defines a piece of the header stream with a single consistent
use of keywords.  The convention/standard/recommendation would be
that within one namespace, keywords must be unique.  If the
reading program wants to distinguish between uses of the same
keyword it can try to find the correct NAMESPACE if it knows which
one it wants, or can search on the basis of date.  If it has
no information as to which NAMESPACE is the best, it should
regard the keyword as undefined.
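
The lookup rule described above can be sketched in Python as follows.
This is only an illustration of the proposal, not an existing
convention: the simplified "KEY= value" card syntax and the function
names parse_namespaces and lookup are assumptions, and selection by
DATE is left out for brevity.

```python
def parse_namespaces(cards):
    """Group 'KEY= value' cards into dicts, one per NAMESPACE BEGIN/END block."""
    blocks, current = [], None
    for card in cards:
        text = card.strip()
        if text == "NAMESPACE BEGIN":
            current = {}
        elif text == "NAMESPACE END":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in text:
            key, _, value = text.partition("=")
            key = key.strip()
            # Within one namespace, keywords must be unique.
            if key in current:
                raise ValueError(f"duplicate keyword {key!r} in one namespace")
            current[key] = value.strip()
    return blocks

def lookup(blocks, keyword, name=None):
    """Return the keyword's value, or None when no single namespace can be chosen."""
    candidates = [b for b in blocks if keyword in b]
    if name is not None:
        candidates = [b for b in candidates if b.get("NAME") == name]
    if len(candidates) == 1:
        return candidates[0][keyword]
    return None  # undefined: keyword absent, or several namespaces compete
```

Returning None when several namespaces carry the keyword mirrors the
rule that a reader with no basis for choosing should treat the keyword
as undefined rather than pick one silently.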

Walter

>In any event, sequential header edits are only one way duplicates can  
>arise.  NOAO iSTB grabs raw data from a couple of dozen different  
>instruments.  These raw data often demonstrate a wide variety of  
>"interesting" metadata, with just one example being duplicate  
>keywords.  I would attribute this to an interaction between various  
>groups (who often work for different organizations):  the instrument  
>engineers, the data acquisition programmers, the telescope control  
>system, the data handling system - to name a few.  There often is no  
>one final arbiter of the content of the header - and this issue is  
>rarely at the top of the list for anybody before the data reach the  
>archivists.  In fact, an archive is an instrumentality for asserting  
>coherent standards - but it requires a bit more subtlety than simply  
>telling everybody upstream that something is verboten.
>
>Rob