[fitsbits] start of Public Comment Period on the CHECKSUM convention
Arnold Rots
arots at cfa.harvard.edu
Tue Jun 30 12:45:58 EDT 2015
I agree that writing date and time should be discouraged.
It occurred to me, though, that the *values* of DATASUM and CHECKSUM
provide a reasonable level of authenticity check, if their values are
provided through a separate registry. If you know the canonical values of
these keywords for each HDU in a file, you can calculate the same for the
file at hand and compare them against each other.
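To make the comparison concrete, here is a minimal sketch in Python. It
assumes the data-unit bytes of an HDU have already been extracted, and that
DATASUM is the 32-bit 1's complement sum of those bytes read as big-endian
words, stored as a decimal string. The registry dictionary, its keys and the
function names are hypothetical, for illustration only.

```python
import struct


def ones_complement_sum32(data: bytes) -> int:
    """32-bit 1's complement sum of big-endian words (DATASUM-style)."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    total = 0
    for (word,) in struct.iter_unpack(">I", data):
        total += word
    # End-around carry: fold overflow bits back into the low 32 bits.
    while total >> 32:
        total = (total & 0xFFFFFFFF) + (total >> 32)
    return total


def matches_canonical(data: bytes, canonical_datasum: str) -> bool:
    """Compare a recomputed sum with the value published in a registry."""
    return ones_complement_sum32(data) == int(canonical_datasum.strip())


# Hypothetical registry: (file name, HDU index) -> canonical DATASUM string.
registry = {("example.fits", 1): "3"}
hdu_data = b"\x00\x00\x00\x01" * 3  # stand-in for a real data unit
print(matches_canonical(hdu_data, registry[("example.fits", 1)]))
```

A real check would walk every HDU in the file and look each one up in the
registry; that part is omitted here.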
The point is that the last sentence of the first paragraph could be phrased
with a bit more nuance.
Of course, a more foolproof method is to maintain a public registry of SHA
digests for each file and allow users to recalculate them for a file at
hand. I implemented that for the RXTE archive, but doubt anyone ever used
that feature.
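As a rough illustration of what such a check could look like on the user
side (this is not the RXTE implementation, whose details are not given
here), the sketch below recomputes a SHA-256 digest and compares it with a
published value. The choice of SHA-256, the registry dictionary and the
function names are assumptions.

```python
import hashlib
import io


def sha256_hex(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_registry(name: str, stream, registry: dict) -> bool:
    """True only if the file is listed and its digest equals the published one."""
    expected = registry.get(name)
    return expected is not None and sha256_hex(stream) == expected.lower()


# In-memory stand-in for a file and for a published registry entry.
payload = b"SIMPLE  =                    T"
registry = {"example.fits": hashlib.sha256(payload).hexdigest()}
print(matches_registry("example.fits", io.BytesIO(payload), registry))
```

For a file on disk, the stream would simply be open(path, "rb").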
Cheers,
- Arnold
--------------------------------------------------------------------------
Arnold H. Rots                          Chandra X-ray Science Center
Smithsonian Astrophysical Observatory   tel:  +1 617 496 7701
60 Garden Street, MS 67                 fax:  +1 617 495 7356
Cambridge, MA 02138                     arots at cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------
On Fri, Jun 26, 2015 at 10:15 PM, THIERRY FORVEILLE
<thierry.forveille at ujf-grenoble.fr> wrote:
> > A second issue arises with:
> >
> > "It is recommended that the current date and time be written into the
> > comment field of both keywords to document when the checksum was
> > computed (or more precisely, the time that the checksum computation
> > process was started)."
> >
> > The problem here is that one might want to reproduce a verbatim file at a
> > later date and the timestamp makes this impossible since the checksum will
> > differ precisely because of the timestamp. For instance, one might (one
> > has, in fact) generate a large number of files to ingest into one copy of
> > an archive in one location, and regenerate the same files to ingest into a
> > second copy. Due to the large amount of data it is less expensive to
> > duplicate the processing compared to copying the data remotely. The
> > timestamp should be optional.
> >
> I'd go one step farther and say that it should be discouraged, as the con
> which you mention is significant and I cannot see any pros to offset it.