[fitsbits] start of Public Comment Period on the CHECKSUM convention

Mark Calabretta mark at calabretta.id.au
Thu Jul 2 06:14:48 EDT 2015


On Fri, 26 Jun 2015 14:32:40 -0700
Rob Seaman <seaman at noao.edu> wrote:

>The problem here is that one might want to reproduce a verbatim
>file at a later date and the timestamp makes this impossible since the
>checksum will differ precisely because of the timestamp.

That's not the way I read it.  If the HDU remains unchanged, then
CHECKSUM and the date when it was computed should remain unchanged,
as should DATE.

One reason for recording the checksum date relates to the problem
described by Richard van Nieuwenhoven:

>A very very simplified example: There is a reader out there that will
>just correct some special value in a fits file, but it does not support
>the CHECKSUM. If that tool is used on a fits file with a checksum, 2
>things will be broken. 1. it is not know is the checksum was correct
>before the change and 2. afterwards the checksum is broken... So the
>user hast to know that the fits-file has a CHECKSUM and that the tool
>does not support it ....

There are two possibilities when validating an HDU:

A) The HDU's checksum equals -0.

   Congratulations, report that the HDU was validated.  This should
   happen in the great majority of cases.


B) The HDU's checksum does not equal -0.

   Look at
     a) the date the checksum was computed,
     b) the date the HDU was written, as recorded in the DATE keyvalue.

   If (a) is earlier than (b) then it is reasonably safe to assume that
   the HDU was modified by naive software without updating CHECKSUM.
   Issue a warning that the CHECKSUM appears to be unreliable.

   If (a) is later than, or the same as, (b), or if DATE is missing,
   then there could be a problem.  Possibly the HDU was modified and
   rewritten by slack software that didn't update DATE, or possibly it
   really was corrupted. Issue a warning that the HDU was not validated.
   It's then up to a human to decide what to do based on the provenance
   of the FITS file.  In most cases this should be straightforward.
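
The two cases above can be sketched in Python roughly as follows.  This
is only an illustration under stated assumptions, not code from the
proposal or from any FITS library: the function names, the returned
strings, and the handling of a missing checksum date (treated like case
B's "not validated" branch) are all hypothetical.  The HDU is assumed to
be available as raw bytes, and both dates as FITS-style ISO strings
without a time zone suffix.

```python
from datetime import datetime
from typing import Optional

NEGATIVE_ZERO = 0xFFFFFFFF  # all bits set: "-0" in 32-bit ones' complement


def ones_complement_sum32(data: bytes) -> int:
    """Sum big-endian 32-bit words, folding each carry back in."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    total = 0
    for i in range(0, len(data), 4):
        total += int.from_bytes(data[i:i + 4], "big")
        total = (total & 0xFFFFFFFF) + (total >> 32)  # end-around carry
    return total


def parse_fits_date(value: Optional[str]) -> Optional[datetime]:
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DDThh:mm:ss'; None if absent or bad."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value.strip())
    except ValueError:
        return None


def classify_hdu(hdu_bytes: bytes,
                 checksum_date: Optional[str],
                 hdu_date: Optional[str]) -> str:
    """Return a hypothetical verdict string following cases A and B."""
    # Case A: the accumulated checksum equals -0.
    if ones_complement_sum32(hdu_bytes) == NEGATIVE_ZERO:
        return "validated"

    # Case B: compare (a) the checksum date with (b) the DATE keyvalue.
    chk = parse_fits_date(checksum_date)
    written = parse_fits_date(hdu_date)
    if chk is not None and written is not None and chk < written:
        # HDU probably rewritten by software that ignored CHECKSUM.
        return "checksum unreliable"

    # (a) >= (b), or DATE missing; a missing checksum date is also
    # routed here, which is an assumption of this sketch.
    return "not validated"
```

For example, classify_hdu(raw, "2015-06-01T00:00:00", "2015-07-02")
would yield "checksum unreliable" for an HDU whose sum is not -0, since
the checksum predates the rewrite.  The final decision in the
"not validated" case is still left to a human.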

However, because metadata should not be stored in the comment field,
I would alter the CHECKSUM proposal to create separate keywords, say
DATE-CHK and DATE-DSM, for the CHECKSUM and DATASUM dates.  The form of
these keywords leverages the following from Sect. 4.4.2.2 of FITS 3.0:

  DATExxxx keywords. The value fields for all keywords beginning
  with the string DATE whose value contains date, and optionally
  time, information shall follow the prescriptions for the
  DATE-OBS keyword.


In the proposed text to be added in Sect. 4.4.2.8, the following is,
prima facie, incorrect:

  [Verifying that the accumulated checksum is still equal to -0 provides
   a fast and fairly reliable way to determine that] the HDU has not
   been modified by subsequent data processing operations...

It is incorrect because CHECKSUM could have been, and should have been,
recomputed after those operations were performed.

Also, all references to "file" in the first paragraph of 4.4.2.8 should
be "HDU".  Likewise for various occurrences in Appendix J.


Regards,
Mark Calabretta


