[wfc] Re: Next: Checksum proposal (revisited)

Rob Seaman seaman at noao.edu
Thu Nov 14 13:59:16 EST 2002


Arne says:

> After handling several hundred thousand CCD frames, the only ones
> I found corrupted were *really* corrupted and were obvious.

And after handling several million CCD and IR frames, checksum
verification has left me quite confident that those files were not
corrupted.  A checksum is precisely a mechanism for making subtle
corruptions more obvious.  Files that pass verification benefit from
the added level of trust - an added value perhaps even greater than
the detection of the rare failures.

Absence of evidence is not evidence of absence.

> Files that are transmitted over sockets or networks can take
> advantage of their retransmittal capability to handle transfer
> errors;

Few communities transfer as many large data files over the internet
as astronomy.  The web and the internet are tuned for transfers in
the kilobyte range - not files of hundreds of megabytes.  The IP
checksum - also a 1's complement algorithm - is 16 bits.  The FITS
checksum is 32 bits.  Surely files hundreds or thousands of times
as large could benefit from a larger hash space?
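
As a rough illustration of the difference in word size, here is a
minimal Python sketch of the same 1's complement sum computed over
16-bit and over 32-bit words.  The function name and the four-byte
test pattern are hypothetical, chosen only for this example; it is
not the code of any particular FITS library or IP stack.

```python
def ones_complement_sum(data: bytes, word_bytes: int) -> int:
    """Sum big-endian words of `word_bytes` bytes with end-around carry."""
    if len(data) % word_bytes:
        # Zero-pad to a whole number of words (an assumption of this sketch).
        data += b"\x00" * (word_bytes - len(data) % word_bytes)
    bits = 8 * word_bytes
    mask = (1 << bits) - 1
    total = 0
    for i in range(0, len(data), word_bytes):
        total += int.from_bytes(data[i:i + word_bytes], "big")
        total = (total & mask) + (total >> bits)  # fold the carry back in
    return total


original = bytes.fromhex("00010002")
swapped = bytes.fromhex("00020001")  # the two 16-bit words transposed

# The 16-bit sum cannot see the transposition; the 32-bit sum can.
sum16_same = ones_complement_sum(original, 2) == ones_complement_sum(swapped, 2)
sum32_same = ones_complement_sum(original, 4) == ones_complement_sum(swapped, 4)
print(sum16_same, sum32_same)
```

The 32-bit sum has the same blind spot one level up: it does not
notice two whole 32-bit words changing places.  The wider word only
enlarges the space of possible sums.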

> all hard drives are pretty robust these days, and if errors are
> important, can be RAIDed.

"Pretty robust" is a rather imprecise measure.  RAID is one tool
to help ensure detection and correction of errors.  RAID failures
are not unknown.  A checksum is another layer of protection.

FITS is also an archival format - not simply a transfer format.
Many FITS data are stored offline on tape or other media.  

> So error handling should be at the system level, not the internal
> file level.

Error handling is a useful facility at all levels of the systems
we build.  Many of us use FITS precisely because of its utility
in constructing systems.  The development of our data handling
systems should benefit from the same kinds of tools that others
take for granted.

On the other extreme, our data are often presented to end users with no
provenance beyond the metadata in the FITS headers.  The FITS checksum
is a modest investment in extending tools to the user community to
directly verify the integrity of our data products - to verify the
integrity now, and the integrity at some random instant in the future,
say, when a grad student finally gets around to analyzing derived
results from our tabular and imaging data sets.  In the intervening
time, the pristine archival data sets from our carefully climate
controlled archives may have been manhandled by the users in outrageous
ways - our FITS HDUs may have been copied and recopied again and again,
perhaps through operating systems and applications that have yet to
be designed.

We can't simply abdicate all responsibility for the continuing
integrity of our data.  If the mechanism for enforcing that integrity
is not carried within the FITS format - well, there is no mechanism.

> Adding checksums is just one more piece of baggage that a fits
> programmer has to handle, and has very little value in the modern
> world.  While it may be important to one subgroup, I don't think
> a checksum proposal is appropriate for a general Standard.

If a FITS project or programmer doesn't find the checksum useful,
they can ignore it.  Ignore it completely, if it is a question of
creating original data products or of reading other people's data.
Ignore it to the extent of simply always deleting the CHECKSUM and
DATASUM keywords, if it is a question of producing new copies of
data products.

I can't imagine a lower effort addition to the standard.

On the other hand, the additional overhead for calculating the
checksum is as small for the 1's complement algorithm as for any
possible alternative - each 32-bit word in the file is added to the
total exactly once.  Indeed, this overhead is so slight that every
single packet traversing the internet contains a 1's complement
checksum.

The checksum is a worthy addition to any project for any "subgroup"
to consider.

> This is not to say anything about the quality of the proposal,
> just that a solution valid 7yrs ago may not be valid today.

The 1's complement arithmetic has not changed over that time.

It is true, however, that FITS should consider implementing a digital
signature addition to the standard at some point in the future.  That
is a subject worthy of deep discussion, but it is likely that we can
adopt the results of work going on in the XML or other communities
when the time comes.

But a checksum is a different, much simpler, issue.  The entire point
of the 1's complement algorithm is to allow embedding a checksum in a
file.  The logic behind this algorithm is proven every microsecond by
network routers around the world.
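
The embedding property can be sketched in a few lines of Python.
This is only an illustration under simplifying assumptions: the
record layout, the raw 4-byte binary slot, and the names sum32,
embed and verify are all hypothetical.  The actual FITS convention
stores an ASCII-encoded form of the complement in the CHECKSUM
keyword, and that encoding step is omitted here.

```python
MASK = 0xFFFFFFFF


def sum32(data: bytes) -> int:
    """32-bit 1's complement sum; len(data) must be a multiple of 4."""
    total = 0
    for i in range(0, len(data), 4):
        total += int.from_bytes(data[i:i + 4], "big")
        total = (total & MASK) + (total >> 32)  # end-around carry
    return total


def embed(record: bytearray, slot: int) -> None:
    """Store the complement of the sum in the aligned 4-byte field at `slot`."""
    record[slot:slot + 4] = bytes(4)  # zero the field before summing
    complement = ~sum32(bytes(record)) & MASK
    record[slot:slot + 4] = complement.to_bytes(4, "big")


def verify(record: bytes) -> bool:
    """A record carrying its own complement sums to all ones (negative zero)."""
    return sum32(record) == MASK


# Hypothetical 20-byte record: 4-byte tag, 4-byte checksum slot, 12 data bytes.
record = bytearray(b"HDR:" + bytes(4) + b"pixel data.!")
embed(record, 4)
ok = verify(bytes(record))

record[10] ^= 0x01  # flip a single bit in the data
corrupted_ok = verify(bytes(record))
print(ok, corrupted_ok)
```

Because the stored value is the complement of everything else, the
verifier does not need to know where the checksum lives: it sums the
whole record, checksum field included, and checks for all ones.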

The checksum is already valid FITS.  It is a de facto standard used
by several large FITS sites.  There may be a technical argument for
not adopting the proposal as it currently stands.  Can there really
be any philosophical argument to reject the proposal?

Rob


