[fitsbits] start of Public Comment Period on the CHECKSUM convention

Lucio Chiappetti lucio at lambrate.inaf.it
Mon Jul 6 06:20:11 EDT 2015


On Sun, 5 Jul 2015, Rob Seaman wrote:

> My use case is different.  Many large-data projects have proposed 
> variations of duplicating processing at remote sites rather than 
> processing at one site and transporting the results to another.  The 
> project I was describing involved a few million files that had 
> previously been replicated between two sites connected via an expensive 
> network link.  A sequence of steps was necessary to update the pixels 
> (happened to be a new compression algorithm) and headers for both data 
> stores.  The goal was to produce identical output files at each end.  I 
> found I needed to disable the timestamp in the CFITSIO CHECKSUM to make 
> the copies verbatim.

I have never been involved in projects of that size, and in general I have 
not cared much about checksums, even when downloading data from FTP sites.

However, concerning identical copies of files, I like the idea of them 
having the same timestamps, for instance when mirroring a development web 
site (where all pages are timestamped, and the timestamp is shown via an 
SSI include directive) to a production one.

The tool I use for this is rsync.

I wonder whether there is a use case for a specialized rsync for FITS 
(maybe in conjunction with compression), something acting at the HDU level 
instead of the file level ...
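
As a rough illustration of what "acting at the HDU level" could mean, here 
is a minimal Python sketch, not an existing tool. It splits a FITS byte 
stream into HDUs using the standard 2880-byte blocks and the 
BITPIX/NAXISn/PCOUNT/GCOUNT size rule, then computes one digest per HDU, so 
that two sites could compare digests and transfer only the HDUs that 
differ. The names hdu_digests and changed_hdus are hypothetical, SHA-256 is 
an arbitrary choice, and random-groups files and trailing special records 
are assumed absent.

```python
import hashlib

BLOCK = 2880  # FITS logical record size in bytes
CARD = 80     # size of one header card in bytes


def _int_card(cards, key, default=0):
    """Return the integer value of header keyword `key`, or `default`."""
    for card in cards:
        if card[:8].strip() == key and card[8:10] == "= ":
            return int(card[10:].split("/")[0].strip())
    return default


def hdu_digests(data):
    """Return one SHA-256 hex digest per HDU (header plus data) in `data`."""
    digests = []
    pos = 0
    while pos < len(data):
        start = pos
        cards = []
        done = False
        # Read header blocks until the END card is found.
        while not done:
            block = data[pos:pos + BLOCK]
            if len(block) < BLOCK:
                raise ValueError("truncated FITS header")
            pos += BLOCK
            for i in range(0, BLOCK, CARD):
                card = block[i:i + CARD].decode("ascii")
                if card[:8].rstrip() == "END":
                    done = True
                    break
                cards.append(card)
        # Data size in bytes: |BITPIX|/8 * GCOUNT * (PCOUNT + NAXIS1*...*NAXISm).
        naxis = _int_card(cards, "NAXIS")
        nbytes = 0
        if naxis > 0:
            npix = 1
            for n in range(1, naxis + 1):
                npix *= _int_card(cards, "NAXIS%d" % n)
            pcount = _int_card(cards, "PCOUNT", 0)
            gcount = _int_card(cards, "GCOUNT", 1)
            nbytes = abs(_int_card(cards, "BITPIX")) // 8 * gcount * (pcount + npix)
        # The data unit is padded to a whole number of blocks.
        pos += -(-nbytes // BLOCK) * BLOCK
        digests.append(hashlib.sha256(data[start:pos]).hexdigest())
    return digests


def changed_hdus(local, remote):
    """Return the indices of HDUs whose digests differ between two files."""
    a, b = hdu_digests(local), hdu_digests(remote)
    n = max(len(a), len(b))
    return [i for i in range(n)
            if i >= len(a) or i >= len(b) or a[i] != b[i]]
```

With something like this, a file whose primary header carries a new 
timestamp but whose extensions are unchanged would report only HDU 0 as 
different.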


