[fitsbits] start of Public Comment Period on the CHECKSUM convention

Arnold Rots arots at cfa.harvard.edu
Tue Jun 30 12:45:58 EDT 2015


I agree that writing date and time should be discouraged.

It occurred to me, though, that the *values* of DATASUM and CHECKSUM
provide a reasonable level of authenticity check if their values are
provided through a separate registry. If you know the canonical values
of these keywords for each HDU in a file, you can calculate the same for
the file at hand and compare them against each other.
The point is that the last sentence of the first paragraph could be
phrased with a bit more nuance.
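
A rough sketch of the comparison described above, for DATASUM only. It
assumes the data unit of each HDU has already been read into bytes and
that the registry supplies the canonical DATASUM strings in HDU order;
the function names and the registry layout are hypothetical, and the
ASCII encoding used for the CHECKSUM keyword itself is left out.

```python
import struct


def ones_complement_sum32(data: bytes) -> int:
    """32-bit ones' complement sum over big-endian 32-bit words."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    total = sum(word for (word,) in struct.iter_unpack(">I", data))
    # Fold any overflow back into the low 32 bits (end-around carry).
    while total >> 32:
        total = (total & 0xFFFFFFFF) + (total >> 32)
    return total


def mismatched_hdus(data_units, registry_datasums):
    """Return the indices of HDUs whose recomputed sum differs from the registry.

    data_units: list of bytes, one data unit per HDU (hypothetical input).
    registry_datasums: list of decimal strings, as DATASUM stores them.
    """
    if len(data_units) != len(registry_datasums):
        raise ValueError("registry and file disagree on the number of HDUs")
    return [
        index
        for index, (data, expected) in enumerate(zip(data_units, registry_datasums))
        if ones_complement_sum32(data) != int(expected)
    ]
```

An empty list from mismatched_hdus() means every data unit agrees with
the registry; header changes would only show up through CHECKSUM.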
Of course, a more foolproof method is to maintain a public registry of
SHA digests for each file and allow users to recalculate them for a file
at hand. I implemented that for the RXTE archive, but doubt anyone ever
used that feature.
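
For illustration, the user side of such a registry check might look like
the sketch below. SHA-256 is an assumption (the text above only says
"SHA"), and the registry is modelled as a plain dict from file name to
hex digest; neither reflects how the RXTE archive actually did it.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_registry(path, registry, name):
    """True if the file's digest equals the registry entry for `name`.

    `registry` is a hypothetical dict {file name: hex digest}; a missing
    entry counts as a mismatch.
    """
    return sha256_of_file(path) == registry.get(name)
```

Unlike the keyword comparison, this covers every byte of the file, so it
only passes for a verbatim copy.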

Cheers,

  - Arnold

--------------------------------------------------------------------------
Arnold H. Rots                            Chandra X-ray Science Center
Smithsonian Astrophysical Observatory     tel:  +1 617 496 7701
60 Garden Street, MS 67                   fax:  +1 617 495 7356
Cambridge, MA 02138                       arots at cfa.harvard.edu
USA                                       http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------


On Fri, Jun 26, 2015 at 10:15 PM, THIERRY FORVEILLE <thierry.forveille at ujf-grenoble.fr> wrote:

> > A second issue arises with:
> >
> >       "It is recommended that the current date and time be written into the
> >       comment field of both keywords to document when the checksum was computed
> >       (or more precisely, the time that the checksum computation process was
> >       started)."
> >
> > The problem here is that one might want to reproduce a verbatim file at a
> > later date and the timestamp makes this impossible since the checksum will
> > differ precisely because of the timestamp.  For instance, one might (one
> > has, in fact) generate a large number of files to ingest into one copy of an
> > archive in one location, and regenerate the same files to ingest into a
> > second copy.  Due to the large amount of data it is less expensive to
> > duplicate the processing compared to copying the data remotely.  The
> > timestamp should be optional.
> >
> I'd go one step farther and say that it should be discouraged, as the con
> which you mention is significant and I cannot see any pros to offset it.