[wfc] CHECKSUM Proposal

Arne Henden aah at nofs.navy.mil
Sun Nov 17 12:43:02 EST 2002


William Pence wrote:
> The objectors to the CHECKSUM proposal (see below) have argued that random
> transcription errors are so rare that there is no need for an additional
> layer of file validity checking at the FITS file level.  This overlooks what
> I regard as one of the most important uses of the CHECKSUM keywords: The
> CHECKSUM keywords provide a simple mechanism for putting a 'validity stamp'
> or 'seal of approval' on the FITS data files that are retrieved from large
> public archives like the HEASARC, MAST, or NRAO.  By verifying that the
> CHECKSUM in the FITS file is still correct, the end user can be reasonably
> assured that the local file is identical to the file in the archive, and
> that it has not been modified (either deliberately, or inadvertently) by
> subsequent data analysis software.  For this purpose, it is actually better
> if most data analysis software ignores the CHECKSUM keywords and does not
> automatically update or delete them.  Then the fact that the file fails the
> checksum test will clearly indicate that the current file is not the same as
> the file that was retrieved from the archive.  Failing the CHECKSUM test
> does not mean the file is not a valid FITS file.  It only means that the
> file is no longer the same as when the CHECKSUM was originally computed.
> 
> -Bill Pence

Having read the proposal, I find the concept and the proposed
implementation of the checksum reasonable.  My objection is to the
necessity of checksums at all.  File integrity can surely be guaranteed
better with external controls, which can offer error correction as
well as error detection.  Unlike Bill and Archie, I don't see how
the addition of a checksum guards against file changes.  If the change
is malicious, the programmer making it can modify the checksum to
agree with the new file, as mentioned in the proposal.  Files downloaded
from an agreed-upon public archive site had better be correct without the
user having to check for changes!  If one is worried about whether files
obtained from some secondary source match the original archive,
that is always a problem and usually falls under the "user beware"
umbrella.  If absolute integrity is important, the user should go
back to the original source.  In most circumstances, image processing
packages like IRAF and Mira already insert comments whenever they
modify a file.
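
The argument above can be illustrated with a minimal sketch.  It is
not the encoding defined in the proposal: it assumes a plain 32-bit
ones' complement sum over big-endian words, and the names seal() and
verify() and the dictionary standing in for a file are hypothetical.
It shows that a stored checksum catches a change made by software
that ignores the checksum, but not one made by someone who recomputes
it.

```python
import struct


def ones_complement_sum32(data: bytes) -> int:
    """32-bit ones' complement sum over big-endian 4-byte words."""
    # Pad to a whole number of words (an assumption made for this toy example).
    if len(data) % 4:
        data += b"\x00" * (4 - len(data) % 4)
    total = 0
    for (word,) in struct.iter_unpack(">I", data):
        total += word
        # Fold any carry out of bit 31 back into the low bits.
        total = (total & 0xFFFFFFFF) + (total >> 32)
    return total


def seal(payload: bytes) -> dict:
    """Hypothetical stand-in for a file that stores its own checksum."""
    return {"payload": payload, "checksum": ones_complement_sum32(payload)}


def verify(f: dict) -> bool:
    """True if the stored checksum still matches the payload."""
    return ones_complement_sum32(f["payload"]) == f["checksum"]


original = seal(b"archive data v1")

# Software that changes the data but leaves the stored checksum alone.
careless = dict(original, payload=b"archive data v2")

# Someone who changes the data and recomputes the checksum afterwards.
deliberate = seal(b"archive data v2")

print(verify(original), verify(careless), verify(deliberate))
```

Only the careless modification fails the test; the deliberate one
passes, which is the case an embedded checksum cannot guard against.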
   So what I am saying is that the current proposal is fine, if
checksumming is deemed necessary, but I would rather not have the
Standard extended and made more complex with something of minimal value.
If the proposal is accepted, then I would suggest a third keyword that
indicates the type of checksumming used, since there are several
varieties and the authors suggest that theirs may be superseded
in the future.
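
A sketch of how such a third keyword might be used by a reader
follows.  The keyword name CHECKTYP, the algorithm names, and the
dictionary standing in for a header are all assumptions made for
illustration; none of them comes from the proposal or the Standard.

```python
import hashlib
import zlib

# Hypothetical registry: value of the type keyword -> digest function.
ALGORITHMS = {
    "CRC32": lambda data: format(zlib.crc32(data), "08x"),
    "ADLER32": lambda data: format(zlib.adler32(data), "08x"),
    "SHA256": lambda data: hashlib.sha256(data).hexdigest(),
}


def seal(data: bytes, method: str) -> dict:
    """Build toy header keywords recording both the method and its result."""
    return {"CHECKTYP": method, "CHECKSUM": ALGORITHMS[method](data)}


def verify(data: bytes, header: dict) -> bool:
    """Dispatch on the (hypothetical) CHECKTYP keyword, then compare."""
    method = header.get("CHECKTYP")
    if method not in ALGORITHMS:
        raise ValueError(f"unknown checksum type: {method!r}")
    return ALGORITHMS[method](data) == header["CHECKSUM"]


data = b"example data unit"
header = seal(data, "CRC32")
print(verify(data, header))
```

With the method recorded next to the value, a reader can tell an
unsupported checksum type apart from a genuine mismatch.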
Arne



